Zhao Jing 1, Huang Xiaomei 2, Yang Jun 3, Luo Zhendong 4, Zeng Weixiong 1, Zhang Ni 1, Qin Genggeng 1, Wen Ge 1
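Author information
- 1. Department of Diagnostic Imaging, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong 510515
- 2. Medical Imaging Teaching and Research Section, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong 510515
- 3. Department of Radiology, The Tenth Affiliated Hospital of Southern Medical University, Dongguan, Guangdong 523050
- 4. Department of Radiology, The University of Hong Kong-Shenzhen Hospital, Shenzhen, Guangdong 518000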
Abstract
Objective To construct machine learning models based on clinical and CT imaging features for predicting the pathological subtypes of thymic epithelial tumors (TETs) and to evaluate their predictive performance.
Methods This retrospective study included 221 patients with pathologically confirmed TETs treated at Nanfang Hospital, Southern Medical University, between January 2006 and June 2023. Clinical information, CT images, and pathological results were collected. According to the simplified pathological classification, patients were divided into a low-risk group (types A, AB, and B1) and a high-risk group (types B2 and B3 and thymic carcinoma). Cases were randomly divided into a training set (n=159) and a validation set (n=62) at a ratio of 7:3. In the training set, univariate logistic regression was used to compare clinical and CT characteristics between the low-risk and high-risk groups. Features were selected by stepwise regression and LASSO regression to construct four machine learning models: logistic regression, random forest, decision tree, and support vector machine. Model performance was evaluated in the validation set using the area under the receiver operating characteristic curve (AUC).
Results Of the 221 TETs, 105 were low-risk (74 in the training set, 31 in the validation set) and 116 were high-risk (85 in the training set, 31 in the validation set). Univariate analysis showed significant differences in sex and chest pain between the high-risk and low-risk groups (P<0.05). Three CT features (tumor enhancement, pericardial or great vessel invasion, and pleural invasion) were selected by stepwise regression to construct a multivariate logistic regression model. Eight clinical and CT imaging features were selected by LASSO regression to construct the random forest, decision tree, and support vector machine models. The AUCs of the four models were 0.793, 0.854, 0.761, and 0.816 in the training set and 0.819, 0.742, 0.710, and 0.811 in the validation set, respectively. These results indicate that the logistic regression model generalizes better than the other three models.
Conclusion The logistic regression model based on CT imaging features shows good diagnostic performance in predicting the pathological subtypes of TETs and has the potential to assist clinicians in the early, non-invasive identification of high-risk thymomas and thymic carcinoma.
Key words
thymic epithelial tumors/imaging features/clinical features/prediction model
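
The abstract compares the models by AUC in the training and validation sets. The following minimal Python sketch illustrates that comparison only; it does not reproduce the authors' analysis. It assumes that the reported AUCs are listed in the same order as the models (logistic regression, random forest, decision tree, support vector machine). The names `auc`, `REPORTED_AUC`, and `auc_drop` are hypothetical, and the four-case example is toy data rather than study data.

```python
from itertools import product


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen high-risk
    case (label 1) receives a higher score than a randomly chosen
    low-risk case (label 0). Tied scores count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# (training AUC, validation AUC) as reported in the abstract.
# ASSUMPTION: values follow the order in which the models are listed.
REPORTED_AUC = {
    "logistic regression": (0.793, 0.819),
    "random forest": (0.854, 0.742),
    "decision tree": (0.761, 0.710),
    "support vector machine": (0.816, 0.811),
}


def auc_drop(train_auc, val_auc):
    """Training AUC minus validation AUC; a large positive value
    suggests the model fits the training set better than new cases."""
    return round(train_auc - val_auc, 3)


# Drop from training to validation for each model.
drops = {name: auc_drop(*pair) for name, pair in REPORTED_AUC.items()}

# Model with the highest validation AUC.
best_val = max(REPORTED_AUC, key=lambda name: REPORTED_AUC[name][1])

if __name__ == "__main__":
    # Toy example (not study data): two high-risk and two low-risk cases.
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
    for name, drop in drops.items():
        print(f"{name}: training minus validation AUC = {drop:+.3f}")
    print("highest validation AUC:", best_val)
```

Under the stated ordering assumption, the random forest shows the largest drop from training to validation, while the logistic regression model has the highest validation AUC, which is consistent with the abstract's statement about generalization.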