Robust AdaBoost Regression Model Based on Double LOF and Inverse-Cross-Validation (Open Access)

[Objective] Traditional AdaBoost regression models are not sufficiently robust, and the improved AdaBoost.RT+ and AdaBoost.RS algorithms still suppress abnormal data only weakly and identify it with low accuracy. Enhancing the robustness of AdaBoost methods therefore has significant practical value. [Methods] The proposed AdaBoost.R_LOF model first introduces a double LOF algorithm and an inverse cross-validation algorithm, and combines the two to characterize the degree of abnormality of each data point as a probability. Then, building on the AdaBoost.R2 algorithm, it assigns each data point an appropriate weight coefficient according to its degree of abnormality, suppressing the influence of abnormal data without affecting the iterations on normal data. [Results] The new model is more robust and achieves a smaller mean squared prediction error. [Limitations] The method has more hyperparameters to tune, and these must be adjusted to the distributional characteristics of the data set. [Conclusions] Simulations and real-world cases show that, compared with the AdaBoost.R2, AdaBoost.RT+, and AdaBoost.RS algorithms, the method is more robust and gives better estimates on data sets with different proportions of outliers.
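The abstract does not spell out the double LOF or inverse cross-validation procedures, so the following is only a minimal sketch of the standard single-pass Local Outlier Factor score that such a scheme presumably builds on. The function names `lof_scores` and `damping_weights`, the `threshold` parameter, and the damping rule itself are assumptions made for illustration; they are not the authors' probability-based weighting.

```python
import math


def lof_scores(points, k):
    """Standard LOF score per point; values near 1 are inliers, larger is more outlying."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist, neighbors = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # distance to the k-th nearest neighbor
        k_dist.append(kd)
        # k-neighborhood, including ties at the k-distance.
        neighbors.append([j for j in order if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        total = sum(max(k_dist[j], dist[i][j]) for j in neighbors[i])
        lrd.append(len(neighbors[i]) / total if total > 0 else math.inf)

    scores = []
    for i in range(n):
        if math.isinf(lrd[i]):
            scores.append(1.0)  # exact duplicates: treated as inliers (assumption)
        else:
            ratios = [lrd[j] / lrd[i] for j in neighbors[i]]
            scores.append(sum(ratios) / len(ratios))
    return scores


def damping_weights(scores, threshold=1.5):
    """Hypothetical rule: weight 1 up to `threshold`, then shrink as threshold / score."""
    return [1.0 if s <= threshold else threshold / s for s in scores]
```

For instance, with the four corners of a unit square plus one distant point at (10, 10) and k = 2, the corners score about 1 and keep full weight, while the distant point scores far higher and is down-weighted. In a boosting loop, such weights could multiply the per-sample weights before each round, which is one plausible reading of "suppressing abnormal data without affecting normal data".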

Zeng Fanbei (曾凡倍); Yang Lianqiang (杨联强)

School of Big Data and Statistics, Anhui University, Hefei, Anhui 230601; School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601

Keywords: AdaBoost algorithm; double LOF algorithm; inverse cross-validation; AdaBoost.R_LOF algorithm

Frontiers of Data & Computing (《数据与计算发展前沿》), 2024, Issue 5

Pages 126-138 (13 pages)

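Funding: Natural Science Foundation of Anhui Higher Education Institutions (KJ2021A0049); Anhui Provincial Natural Science Foundation (2208085MA06)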

DOI: 10.11871/jfdc.issn.2096-742X.2024.05.012