Computer & Digital Engineering (计算机与数字工程), 2025, Vol. 53, Issue 1: 26-30, 5. DOI: 10.3969/j.issn.1672-9722.2025.01.006
Improve Ensemble Learning Algorithms in Unbalanced Data (改进不平衡数据中的集成学习算法)
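Wang Lu (王璐) 1, Cheng Xiaorong (程晓荣) 1
Author information
- 1. School of Control and Computer Engineering, North China Electric Power University (Baoding), Baoding 071000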
Abstract
In recent years, machine learning research has attracted growing attention, and ensemble learning is one of the key topics in the field. The basic principle of ensemble learning is to combine many independent classifiers into a single strong learner, overcoming the shortcomings of classification with a single learner. Based on a comparison of four algorithms, namely the Bagging algorithm, the Random Forest algorithm, the weighted KNN (K-Nearest Neighbor) algorithm and the AdaBoost algorithm, the weighted KNN algorithm and the AdaBoost algorithm are fused together. The dataset used records the shopping behavior of online users. In the experiment, the unbalanced data is first processed with SMOTE sampling, and then the four algorithms above and the improved AdaBoost algorithm are evaluated and compared. The comparison shows that the improved AdaBoost algorithm has better prediction performance. The improved AdaBoost algorithm is also computed in parallel on the Spark platform to improve computing efficiency.
Key words
ensemble learning/AdaBoost algorithm/SMOTE sampling/weighted KNN (K-Nearest Neighbor) algorithm/unbalanced data/Spark platform
Classification
Computer and Automation
Cite this article
Wang Lu, Cheng Xiaorong. Improve Ensemble Learning Algorithms in Unbalanced Data [J]. Computer & Digital Engineering, 2025, 53(1): 26-30, 5.
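
Illustrative note on the SMOTE sampling step named in the abstract: SMOTE balances a dataset by creating synthetic minority-class samples, each placed on the line segment between an existing minority sample and one of its nearest minority-class neighbors. The sketch below is a minimal, assumption-laden illustration of that general idea only. It is not the authors' implementation, and the function name `smote`, its parameters (`n_new`, `k`, `seed`) and the toy data are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (base itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (o - b) for b, o in zip(base, other))
        )
    return synthetic


# Toy minority-class samples (hypothetical, not taken from the paper's dataset)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies between two real minority samples, the new points stay inside the region the minority class already occupies. In practice a maintained library implementation would normally be preferred over a hand-written one.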