Medical Journal of Chinese People's Liberation Army (解放军医学杂志), 2026, Vol. 51, Issue 3: 354-362, 9. DOI: 10.11855/j.issn.0577-7402.1230.2025.1222
基于可解释机器学习的中青年男性骨量减少预测模型的建立
Development of a prediction model for osteopenia in young and middle-aged males based on explainable machine learning
Abstract
Objective: To explore the risk factors associated with osteopenia in young and middle-aged males using a health examination dataset, and to develop a predictive model and evaluate its performance.
Methods: A total of 989 healthy young and middle-aged male participants who underwent health examinations at the 961st Hospital of the Joint Logistics Support Force between May 2022 and May 2024 were included in this retrospective study. The dataset was randomly divided into a training set (n=692) and a validation set (n=297) at a 7:3 ratio. The occurrence of osteopenia was defined as the primary study endpoint. Independent risk factors were selected via LASSO regression. Six machine learning models (extreme gradient boosting, support vector machine, multivariate logistic regression, K-nearest neighbors, light gradient boosting machine, and random forest [RF]) were used to predict osteopenia. The optimal model was identified based on the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and the Brier score. The high-risk probability threshold was determined by maximizing the Youden index. Calibration and clinical utility of the best-performing model were assessed using calibration curves and decision curve analysis. Finally, the SHAP (SHapley Additive exPlanations) method was applied to interpret the predictions of the optimal model.
Results: Eight independent factors for osteopenia in young and middle-aged males were identified: smoking status, high-density lipoprotein cholesterol, triglyceride level, red blood cell count, regular exercise, serum albumin level, hemoglobin level, and uric acid level. Six machine learning models were constructed using different algorithms. Among them, the RF model demonstrated the best predictive performance, achieving the highest validation-set AUC (0.706, 95% CI 0.644-0.769), specificity (0.884), positive predictive value (0.704), negative predictive value (0.708), and accuracy (0.704). It also yielded the optimal Brier score of 0.0301 (0.0283-0.0322). The maximum Youden index was 0.384, corresponding to a sensitivity of 0.579 and a specificity of 0.805. The calibration curve for the validation set showed minimal deviation within the probability range of 0.20-0.65. The decision curve for the validation set indicated a positive net benefit within the risk threshold range of 0.12-0.65, supporting the model's potential utility in decision-making.
Conclusion: Smoking status, high-density lipoprotein cholesterol, triglyceride level, red blood cell count, regular exercise, serum albumin level, hemoglobin level, and uric acid level are independent influencing factors for osteopenia in young and middle-aged males. The prediction model constructed from these factors demonstrates satisfactory predictive performance and can provide evidence-based decision support for clinical diagnosis and treatment.
Key words
osteopenia / machine learning / SHAP method / decision support
Classification
Medicine and Health
Cite this article
李开源, 宋泽辉, 于淼, 杨振伟, 侯丽雪. 基于可解释机器学习的中青年男性骨量减少预测模型的建立[J]. 解放军医学杂志, 2026, 51(3): 354-362, 9.
Funding
This work was supported by the National Key Research and Development Program of China (2023YFF1203805) and the Qiqihar City Science and Technology Plan Innovation Incentive Project (CSFGG-2025029).
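Illustrative note on the threshold rule: the abstract states that the high-risk probability threshold was chosen by maximizing the Youden index, J = sensitivity + specificity - 1. The sketch below shows one plain-Python way such a threshold could be found. The function name `youden_threshold` and the toy labels and probabilities are hypothetical and are not taken from the study; the authors' actual implementation is not described in the abstract.

```python
def youden_threshold(y_true, y_prob):
    """Return (threshold, J, sensitivity, specificity) that maximizes
    the Youden index J = sensitivity + specificity - 1.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = None
    for t in sorted(set(y_prob)):
        # Classify as high risk when the predicted probability is at least t.
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


# Toy, made-up data for illustration only (not from the study).
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.55, 0.60, 0.65, 0.80]
threshold, j, sens, spec = youden_threshold(y_true, y_prob)

# Consistency check on the figures reported in the abstract:
# sensitivity 0.579 + specificity 0.805 - 1 = 0.384, the reported maximum J.
reported_j = 0.579 + 0.805 - 1
```

The reported sensitivity and specificity are consistent with the reported maximum Youden index of 0.384. Note that the specificity of 0.884 listed among the model comparison metrics differs from the 0.805 at the Youden-optimal threshold, which suggests the two were computed at different cut-offs.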