Explainable machine learning for identifying key biomarkers in predicting chronic kidney disease risk among patients with metabolic dysfunction-associated fatty liver disease: A study based on NHANES database

徐燕¹, 侯文青¹, 唐毓馨¹, 覃中毅¹, 王涛¹, 向俊宇¹, 王斌¹, 文良志¹, 陈东风¹

Journal of Army Medical University (陆军军医大学学报), 2026, Vol. 48, Issue 3: 354-365 (12 pages). DOI: 10.16016/j.2097-0927.202510032

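Author information

  • 1. Department of Gastroenterology, Daping Hospital, Army Medical University (Third Military Medical University); Chongqing Key Laboratory of Precision Prevention and Treatment of Digestive System Tumors, Chongqing, China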

Abstract

Objective: To construct a machine learning-based prediction model for chronic kidney disease (CKD) in patients with metabolic dysfunction-associated steatotic liver disease (MASLD) and to develop a diagnostic nomogram to guide clinical management.

Methods: This study was based on the National Health and Nutrition Examination Survey (NHANES) database (2007 to 2018). Patients with MASLD, defined as hepatic steatosis with at least one cardiometabolic risk factor, were selected as the analysis sample. After excluding patients with chronic renal failure and other conditions, a total of 2,144 MASLD patients were included, and 34 variables (including demographics, clinical characteristics, and laboratory indicators) were collected. Patients were stratified by CKD status into a CKD group (n=347) and a non-CKD group (n=1,797). Participants were randomly divided into a training set (n=1,501) and an internal validation set (n=643) using stratified sampling at a 7:3 ratio. A prospective external validation cohort of 110 eligible fatty liver disease patients was recruited from our department between January 2024 and October 2025, adhering strictly to predefined inclusion and exclusion criteria. Variables were screened using least absolute shrinkage and selection operator (Lasso) regression. Eight machine learning algorithms were then employed to construct the CKD risk prediction models: decision tree (DT), extreme gradient boosting (XGBoost), K-nearest neighbors (KNN), naive Bayes (NB), support vector machine (SVM), single-hidden-layer neural network (NNET), light gradient boosting machine (LightGBM), and random forest (RF). Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, calibration curves, and clinical decision curves. Based on key biomarkers identified through SHAP analysis, a diagnostic nomogram was constructed, and its diagnostic accuracy was assessed using ROC curves.

Results: Using interpretable machine learning algorithms, we identified 10 key biomarkers: systolic blood pressure (SBP), age, diabetes, blood urea nitrogen (BUN), body mass index (BMI), globulin, high-density lipoprotein cholesterol (HDL-C), neutrophil count, uric acid, and gamma-glutamyl transferase. These biomarkers were used to construct predictive models with the 8 machine learning algorithms. The LightGBM model achieved the best performance, with an AUC of 0.871, a sensitivity of 0.838, a specificity of 0.756, an accuracy of 0.825, an F1 score of 0.889, and a Brier score of 0.091. Calibration curve analysis demonstrated good consistency of the LightGBM-based predictive model across both the training and validation sets. Clinical decision curve analysis indicated significant clinical utility of this model in predicting kidney damage risk in patients with MASLD. Based on SHAP values, we developed a diagnostic nomogram with a risk cutoff score of 81.7. This nomogram categorized patients into high-risk and low-risk subsets for MASLD-related CKD and showed favorable accuracy and predictive reliability, with an AUC of 0.816.

Conclusion: The LightGBM model significantly outperforms the other models and demonstrates potential for application in early clinical prediction of CKD risk among MASLD patients.
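
The 7:3 stratified split and the threshold-based metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline: the helper names (`stratified_split`, `binary_metrics`, `auc`), the per-class rounding rule, the random seed, the 0.5 threshold, and the toy labels and probabilities are all assumptions made for illustration. Only the group sizes (347 CKD, 1,797 non-CKD) and the 7:3 ratio come from the abstract.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into train/validation, sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, valid_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        # Assumed rounding rule: nearest integer within each class.
        n_train = round(train_fraction * len(idx))
        train_idx += idx[:n_train]
        valid_idx += idx[n_train:]
    return sorted(train_idx), sorted(valid_idx)


def auc(y_true, y_prob):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics plus Brier score; assumes both classes occur."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
        "f1": 2 * tp / (2 * tp + fp + fn),
        # Brier score: mean squared error of the predicted probabilities.
        "brier": sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true),
        "auc": auc(y_true, y_prob),
    }


# Group sizes from the abstract: 347 CKD (label 1) and 1,797 non-CKD (label 0).
labels = [1] * 347 + [0] * 1797
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 1501 643, the set sizes in the abstract

# Hypothetical toy predictions, only to show how the metrics are computed.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1]
print(binary_metrics(toy_true, toy_prob))
```

Splitting each class separately keeps the CKD proportion nearly identical in both sets, which matters here because CKD cases are a minority (347 of 2,144). In practice a library routine with a stratification option would normally replace the hand-written split.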

Keywords

metabolic dysfunction-associated steatotic liver disease / chronic kidney disease / machine learning / disease risk prediction

Classification

Medicine and health

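Cite this article

徐燕, 侯文青, 唐毓馨, 覃中毅, 王涛, 向俊宇, 王斌, 文良志, 陈东风. Explainable machine learning for identifying key biomarkers in predicting chronic kidney disease risk among patients with metabolic dysfunction-associated fatty liver disease: A study based on NHANES database [J]. Journal of Army Medical University, 2026, 48(3): 354-365.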

Funding

Supported by the General Program of the National Natural Science Foundation of China (82170594).

Journal of Army Medical University (陆军军医大学学报)

ISSN 2097-0927
