Journal of Chongqing University, 2025, Vol. 48, Issue (2): 10-21, 12. DOI: 10.11835/j.issn.1000-582X.2025.02.002
Research on software defect prediction based on machine learning
Abstract
As machine learning technology gradually penetrates various fields, software testing remains a critical part of the software development process. Software defect prediction faces two core problems: class imbalance and limited prediction accuracy. This paper proposes a supervised learning-based software defect prediction method to address both. The method applies a sample balancing technique that combines the synthetic minority over-sampling technique (SMOTE) with the edited nearest neighbor (ENN) algorithm, and uses it to test locally weighted learning (LWL), J48, C4.8, random forest, Bayes net (BN), multilayer feedforward neural network (MFNN), support vector machine (SVM), and naive Bayes key (NB-K). These algorithms are applied to three datasets (KK1, KK3, and PK2) from the NASA database, and their performance is compared and analyzed in detail. The results show that the random forest model combined with SMOTE and ENN is highly effective and avoids overfitting when dealing with class imbalance, which provides an effective approach to this problem in software defect prediction.
Key words
software defect prediction/machine learning/class imbalance/XGBoost/random forest
Classification
Information Technology and Security Science
Cite this article
YU Hao, ZHANG Ying, LI Qian, JIANG Libiao, SHANG Yunpeng. Research on software defect prediction based on machine learning[J]. Journal of Chongqing University, 2025, 48(2): 10-21, 12.
Funding
Supported by National Natural Science Foundation of China (61602345).
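The sample balancing step described in the abstract (SMOTE oversampling followed by ENN cleaning) can be illustrated with a minimal sketch in plain Python. The abstract does not state how the resampling was implemented, so this is not the authors' code: the function names `smote`, `enn`, and `smote_enn`, the parameters `k` and `seed`, and the choice to let ENN edit every class are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def nearest(idx, points, k):
    """Indices of the k points closest to points[idx] (Euclidean), excluding idx."""
    order = sorted(
        (j for j in range(len(points)) if j != idx),
        key=lambda j: math.dist(points[idx], points[j]),
    )
    return order[:k]


def smote(X, y, minority, k=3, seed=0):
    """Add synthetic minority samples until the minority matches the largest class.

    Assumes at least two minority samples; each synthetic point lies on the
    segment between an original minority sample and one of its k nearest
    minority neighbours.
    """
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    target = max(Counter(y).values())
    new_X, new_y = list(X), list(y)
    while new_y.count(minority) < target:
        i = rng.randrange(len(minority_pts))
        j = rng.choice(nearest(i, minority_pts, k))
        gap = rng.random()  # position along the segment between the two samples
        synthetic = tuple(
            a + gap * (b - a) for a, b in zip(minority_pts[i], minority_pts[j])
        )
        new_X.append(synthetic)
        new_y.append(minority)
    return new_X, new_y


def enn(X, y, k=3):
    """Drop each sample whose label disagrees with the majority of its k neighbours.

    Assumption: all classes are edited, and k is odd so binary votes cannot tie.
    """
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE first, then clean the result with ENN."""
    X_bal, y_bal = smote(X, y, minority, k=k, seed=seed)
    return enn(X_bal, y_bal, k=k)
```

Calling `smote_enn(X, y, minority=1)` on a list of feature tuples `X` and labels `y` returns a resampled training set that could then be passed to any of the classifiers listed in the abstract. Resampling should be applied to the training split only, so that evaluation data keeps its original class distribution.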