Journal of Tongji University (Natural Science), 2025, Vol. 53, Issue 6: 976-984 (9 pages). DOI: 10.11908/j.issn.0253-374x.23396
基于Stacking集成学习的乳腺癌智能诊断预测模型
Intelligent Diagnosis and Prediction Model of Breast Cancer Based on Stacking Ensembled Learning
Abstract
An intelligent prediction model for breast cancer diagnosis is constructed on the Breast Cancer Wisconsin (Diagnostic) dataset by combining data preprocessing methods with machine learning algorithms. First, recursive feature elimination based on a light gradient boosting machine (LightGBM) model is used for feature selection. Second, a combined sampling strategy that pairs adaptive synthetic sampling (ADASYN) oversampling with one-sided selection (OSS) undersampling is used to address class imbalance and obtain a balanced training set. Finally, an intelligent diagnosis model based on Stacking ensemble learning is built, with a multilayer perceptron (MLP), LightGBM, and categorical boosting (CatBoost) as the base learners and a logistic regression model as the meta-learner. The model is evaluated by 5-fold cross-validation using classification metrics such as accuracy, sensitivity, and area under the receiver operating characteristic curve (AUC). The experimental results show that the proposed model achieves a prediction accuracy of 98.2% with stable classification performance, and can provide decision support for the clinical diagnosis of breast cancer.
Keywords
intelligent diagnosis / classification prediction / Stacking ensemble learning / machine learning / comprehensive sampling
Classification
Economics
Cite this article
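Duan Chunyan, Liu Qiantuo, Wang Jiajie, Guan Di, You Xiaoyue. Intelligent Diagnosis and Prediction Model of Breast Cancer Based on Stacking Ensembled Learning [J]. Journal of Tongji University (Natural Science), 2025, 53(6): 976-984.
Funding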
National Natural Science Foundation of China (72171170)
Fundamental Research Funds for the Central Universities (22120210535)