基于机器学习构建妊娠地中海贫血智能诊断的算法模型

张琴, 肖爽, 赵庆华

现代医药卫生, 2025, Vol. 41, Issue 10: 2353-2357, 2361, 6. DOI: 10.3969/j.issn.1009-5519.2025.10.018

Machine learning-based algorithmic model for intelligent diagnosis of gestational thalassemia

张琴¹, 肖爽², 赵庆华³

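Author information

1. Department of Nursing, The Second People's Hospital of Yubei District, Chongqing 401147, China
2. Department of Nursing, People's Hospital of Chongqing Liangjiang New Area, Chongqing 401121, China
3. Department of Nursing, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, China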

Abstract

Objective To construct a machine learning-based intelligent diagnostic model for gestational thalassemia and to optimize screening strategies. Methods In a retrospective cohort study, data were collected from 4,715 pregnant women at People's Hospital of Chongqing Liangjiang New Area between January 2018 and December 2020, comprising 338 α-thalassemia cases (7.17%), 286 β-thalassemia cases (6.07%), and 4,091 normal controls (86.76%). The data included complete blood count, blood type, and genetic test results. Key features were selected using LASSO regression, and the dataset was then split by stratified random sampling into a training set (n=3,772) and a test set (n=943) at an 8:2 ratio. To address class imbalance, SMOTE oversampling was combined with cost-sensitive learning, and a thalassemia diagnostic model was built on six key indicators. The diagnostic performance of seven classic machine learning methods was systematically compared: Extreme Gradient Boosting (XGBoost), Decision Tree (DT), K-Nearest Neighbor (KNN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM). Models were evaluated by the area under the receiver operating characteristic (ROC) curve (AUC), F1-score, sensitivity, and specificity. Results Feature selection identified red blood cells (β=-0.21), hemoglobin (β=0.28), hematocrit (β=-0.62), platelets (β=-0.48), mean platelet volume (β=0.36), and plateletcrit (β=0.12) as key predictors. The AUCs of all seven algorithms exceeded 0.88. Judged jointly on sensitivity, specificity, positive predictive value, negative predictive value, and Youden index, the XGBoost model scored highest on every indicator, followed by the RF model; the Youden indexes of the other five models were all below 0.7. The XGBoost model achieved an AUC of 0.980 (95% confidence interval 0.967-0.993), an F1-score of 0.938, a sensitivity of 89.3%, and a specificity of 94.0%, significantly better than traditional screening indicators (McNemar test, P<0.05). Conclusion The XGBoost diagnostic model based on six complete blood count parameters shows good clinical applicability, and combining synthetic sampling with cost-sensitive learning effectively addresses data imbalance. The model offers a high-precision, low-cost option for prenatal screening of thalassemia.
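
The evaluation indicators named in the abstract (sensitivity, specificity, positive and negative predictive value, F1-score, Youden index) can all be derived from a confusion matrix. The following is a minimal Python sketch, assuming a binary framing (thalassemia versus normal); the abstract does not state how the metrics were computed or averaged across the α, β, and normal classes, so this is an illustration and not the authors' code. The `ConfusionMatrix` class, the function names, and the counts in `example` are hypothetical and are not study data. Only the sensitivity (0.893) and specificity (0.940) passed to `youden_index` come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary screening test (hypothetical helper type)."""
    tp: int  # diseased, predicted positive
    fp: int  # healthy, predicted positive
    tn: int  # healthy, predicted negative
    fn: int  # diseased, predicted negative


def sensitivity(cm: ConfusionMatrix) -> float:
    """True positive rate: share of diseased cases that are detected."""
    return cm.tp / (cm.tp + cm.fn)


def specificity(cm: ConfusionMatrix) -> float:
    """True negative rate: share of healthy cases correctly cleared."""
    return cm.tn / (cm.tn + cm.fp)


def ppv(cm: ConfusionMatrix) -> float:
    """Positive predictive value (precision)."""
    return cm.tp / (cm.tp + cm.fp)


def npv(cm: ConfusionMatrix) -> float:
    """Negative predictive value."""
    return cm.tn / (cm.tn + cm.fn)


def f1_score(cm: ConfusionMatrix) -> float:
    """Harmonic mean of precision and sensitivity."""
    return 2 * cm.tp / (2 * cm.tp + cm.fp + cm.fn)


def youden_index(sens: float, spec: float) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    return sens + spec - 1.0


# Youden index implied by the sensitivity and specificity reported
# in the abstract for the XGBoost model.
reported_j = youden_index(0.893, 0.940)
print(f"Youden index from reported values: {reported_j:.3f}")

# Made-up counts for illustration only (NOT taken from the study).
example = ConfusionMatrix(tp=90, fp=20, tn=180, fn=10)
print(f"sensitivity={sensitivity(example):.3f}, "
      f"specificity={specificity(example):.3f}, "
      f"PPV={ppv(example):.3f}, NPV={npv(example):.3f}, "
      f"F1={f1_score(example):.3f}")
```

With the reported values, the Youden index works out to 0.893 + 0.940 - 1 = 0.833, which is consistent with the statement that only the two best models exceed 0.7.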

Key words

Machine learning/Pregnancy/Thalassemia/Diagnosis/Algorithmic model

Classification

Medicine and health

Cite this article

张琴, 肖爽, 赵庆华. 基于机器学习构建妊娠地中海贫血智能诊断的算法模型[J]. 现代医药卫生, 2025, 41(10): 2353-2357, 2361, 6.

Funding

Chongqing Science and Health Joint Medical Research Project (2022MSXM145, 2023MSXM075).

现代医药卫生

ISSN：1009-5519
