Construction of New Blood Lipid Features Based on Machine Learning and Their Application in Coronary Atherosclerosis
Objective: To analyze the lipid profile with machine learning models, construct a new lipid feature, and identify a method for integrating the lipid profile.
Methods: A total of 68 patients with coronary atherosclerosis admitted to our hospital between June 2021 and June 2022 were enrolled. The following lipid profile data were collected: apolipoprotein B (ApoB), non-high-density lipoprotein cholesterol (N-HDL-C), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), total cholesterol (TC), triglycerides (TG), and lipoprotein(a) [Lp(a)]. Coronary angiography results were reviewed, and each patient's Gensini score was calculated using the modified Gensini scoring system. Based on the relationships among the lipid profile components, an interpretable new feature, the cholesterol index, was constructed. Patients were randomly divided into a training set and a test set (3:1). A random forest model was used, and the predictive value of the cholesterol index for severe coronary atherosclerosis was assessed by the area under the curve (AUC), F1 score, precision, recall, and accuracy.
Results: The 68 patients included 48 males and 20 females, with a mean age of (57.96±11.33) years. There were no statistically significant differences between the training set and the test set in age, TC, ApoB, N-HDL-C, LDL-C, HDL-C, TG, Lp(a), or cholesterol index (P>0.05). Using the original lipid profile alone, the random forest model predicted severe coronary atherosclerosis with an AUC of 0.64 (95% CI: 0.41-0.80). Adding the constructed feature, cholesterol index = √ApoB × (LDL-C + 0.1 × (N-HDL-C − LDL-C)) / HDL-C, substantially improved the model's predictive performance: the AUC was 0.84 (95% CI: 0.57-0.93), and the F1 score, precision, recall, and accuracy all improved to varying degrees, reaching 0.83, 1.00, 0.71, and 0.88, respectively.
Conclusion: The cholesterol index can effectively integrate cholesterol data and improve the performance of the random forest model in predicting severe coronary atherosclerosis.
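The cholesterol index formula and the reported F1 score can be illustrated with a short Python sketch. It is not the authors' code. The abstract does not make the scope of the square root explicit, so the sketch assumes it covers ApoB only; the function names and the input values are hypothetical, and the units are whatever the source laboratory reports (the abstract does not state them).

```python
import math


def cholesterol_index(apob, ldl_c, non_hdl_c, hdl_c):
    """Cholesterol index as written in the abstract.

    ASSUMPTION: the square root applies to ApoB only, i.e.
    sqrt(ApoB) * (LDL-C + 0.1 * (N-HDL-C - LDL-C)) / HDL-C.
    """
    if apob < 0:
        raise ValueError("ApoB must be non-negative")
    if hdl_c <= 0:
        raise ValueError("HDL-C must be positive")
    # N-HDL-C minus LDL-C is the non-HDL cholesterol not carried by LDL;
    # the formula gives it a weight of 0.1.
    weighted_cholesterol = ldl_c + 0.1 * (non_hdl_c - ldl_c)
    return math.sqrt(apob) * weighted_cholesterol / hdl_c


def f1_from_precision_recall(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical patient values, chosen only to make the arithmetic easy to follow.
example_index = cholesterol_index(apob=1.0, ldl_c=3.0, non_hdl_c=4.0, hdl_c=1.0)
print(round(example_index, 2))  # 3.1

# Consistency check on the reported metrics: precision 1.00 and recall 0.71.
print(round(f1_from_precision_recall(1.00, 0.71), 2))  # 0.83
```

The second call shows that the reported F1 score of 0.83 follows directly from the reported precision and recall.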
Zuo Yulu; Wu Yu; Yang Jinpeng; Zhao Mengmeng
Department of Cardiovascular Medicine, Zhongda Huiya Hospital, Huizhou 516081, Guangdong, China
Clinical Medicine
Machine learning; Random forest model; Lipids; Coronary atherosclerosis
Medical Information (Yixue Xinxi), 2024, No. 1
pp. 29-34 (6 pages)