
Prediction of cancer-specific survival status of patients with early colorectal adenocarcinoma after endoscopic therapy based on machine learning algorithms

Open access (OA); indexed in the Peking University Core Journals list and CSTPCD


Objective: To construct a model, based on machine learning algorithms, for predicting the cancer-specific survival status of patients with early colorectal adenocarcinoma after endoscopic treatment.

Methods: Data on 1,786 patients with early colorectal adenocarcinoma who had undergone endoscopic treatment were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. The extracted variables were age, sex, race, primary cancer site, degree of cancer cell differentiation, histopathological type, radiotherapy, chemotherapy, tumor size, pathological condition, and marital status. Univariate and multivariate logistic regression analyses were used to identify the independent factors influencing survival prognosis after endoscopic treatment. Patients were divided into a training set and a test set at a ratio of 8:2, and the factors that were statistically significant in the regression analysis were entered into six machine learning models: logistic regression, random forest, extreme gradient boosting, support vector machine, decision tree, and gradient boosting decision tree. The results were then interpreted using the best-performing model.

Results: Multivariate logistic regression showed that age, primary cancer site, degree of cancer cell differentiation, tumor size, pathological condition, and marital status were independent factors influencing survival prognosis after endoscopic treatment (P<0.05). For the random forest model, the area under the receiver operating characteristic curve (AUC) was 0.876 in the training set and 0.858 in the test set, and the F1 score was 0.791 and 0.739, respectively. Interpretability analysis of the random forest model showed that age, marital status, and tumor size had the highest importance. Older age, larger tumor diameter, poorer differentiation, and the presence of submucosal infiltration were risk factors for death, whereas being married was a protective factor. Patients whose primary cancer site was the right colon had poorer survival than those whose primary site was the left colon.

Conclusions: The machine learning prognostic model for patients with colorectal cancer performed well and can provide accurate individualized predictions.
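The abstract reports an 8:2 training/test split and evaluates models by AUC and F1 score. As a rough illustration of what those two metrics measure, the sketch below computes them from scratch in plain Python. It is not the authors' code: the helper names (`split_80_20`, `roc_auc`, `f1_score`), the 0.5 decision threshold, and the eight labelled scores are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it 80% / 20% (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, preds):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example: 1 = died of cancer, 0 = did not (illustrative only).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

auc = roc_auc(labels, scores)
f1 = f1_score(labels, preds)
train, test = split_80_20(range(100))
```

AUC depends only on how the model ranks patients, while F1 depends on the chosen threshold, which is one reason the two numbers reported for the same model can differ noticeably.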

Li Zhihong (李志宏); Cai Yingbin (蔡迎彬); Wang Yan (王岩); Fan Hua (樊华); Yiliminuer Ahemai (伊丽米奴尔∙阿合买); Li Zimei (李紫梅)

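School of Nursing, Xinjiang Medical University, Xinjiang 830011; Affiliated Tumor Hospital of Xinjiang Medical University; Third Clinical Medical College of Xinjiang Medical University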

Keywords: machine learning; early colorectal cancer; adenocarcinoma; endoscopic therapy; survival status; predictive model; influencing factors; nursing

Chinese Nursing Research (《护理研究》), 2024, issue 14

Pages 2459-2467 (9 pages)

Supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region, grant No. 2022D01C299

DOI: 10.12102/j.issn.1009-6493.2024.14.003
