Journal of Shanxi University (Natural Science Edition), 2024, Vol. 47, Issue 6: 1190-1200, 11. DOI: 10.13451/j.sxu.ns.2024119
A Lightweight Interpretable Gradient Boosting Tree for Energy Poverty Identification
Abstract
To address the insufficient training, overfitting, and poor interpretability of traditional gradient boosting tree methods in identifying energy poverty, this paper designs a lightweight and interpretable gradient boosting tree for energy poverty identification. First, noisy samples such as those with missing values and outliers are removed from the original data, and, after feature correlation analysis, the sample gradients are sorted to split the internal nodes of the gradient boosting tree, which makes the model lightweight. Then, feature bundling is used to accelerate training. Finally, a model interpretation method is introduced to analyze the influencing factors and quantify the impact of different features on energy poverty identification, which enhances the interpretability of the model. Experimental results on a typical energy poverty identification dataset show that, compared with other methods [LR (Logistic Regression), KNN (K-Nearest Neighbor), SVM (Support Vector Machine), RF (Random Forest), CART (Classification and Regression Tree), XGBoost (eXtreme Gradient Boosting), and GradientBoosting], the proposed lightweight interpretable model achieves an AUC (Area Under Curve) value of 99.61%, an improvement of 0.2% to 17.8%, which is a clear advantage.
Key words
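LightGBM (Light Gradient Boosting Machine) model / energy poverty prediction / feature correlation analysis / model interpretation method
Classification
Information Technology and Security Science
Cite this article
王政, 裔扬, 史颖, 赵兴旺, 吴晨旭. A Lightweight Interpretable Gradient Boosting Tree for Energy Poverty Identification [J]. Journal of Shanxi University (Natural Science Edition), 2024, 47(6): 1190-1200, 11.
Funding
National Natural Science Foundation of China (92371116)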
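The abstract says that sample gradients are sorted before internal nodes are split, but it does not give the procedure. One plausible reading, given the LightGBM keyword, is gradient-based one-side sampling: keep the samples with the largest absolute gradients, draw a random subset of the rest, and up-weight that subset. The following is a minimal sketch under that assumption, not the authors' implementation. The function name `goss_sample`, the parameters `top_rate` and `other_rate`, and the toy gradient values are all hypothetical.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.2, seed=0):
    """Pick training rows by gradient magnitude (illustrative sketch only).

    Returns (indices, weights): every large-gradient row with weight 1.0,
    plus a random subset of small-gradient rows with an amplified weight.
    """
    n = len(gradients)
    # Sort row indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    top = order[:n_top]      # always kept: these rows are still poorly fit
    rest = order[n_top:]     # well-fit rows: only a random subset is kept
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Amplify the sampled small-gradient rows so that their total
    # contribution approximates that of all small-gradient rows.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


# Toy gradients for ten samples (made-up values, not from the paper).
grads = [0.9, -0.05, 0.02, -0.8, 0.1, 0.03, -0.01, 0.6, 0.04, -0.02]
idx, w = goss_sample(grads, top_rate=0.2, other_rate=0.2)
print(idx, w)
```

With these settings only four of the ten rows are used to evaluate candidate splits, which is where the reduction in training cost would come from. The two rows with the largest absolute gradients are always retained.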