Computer Technology and Development (计算机技术与发展), 2025, Vol. 35, Issue 3: 1-8, 8. DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0346
基于多层次集成学习的流特征在线稳定选择算法
Online Stable Streaming Feature Selection Algorithm Using Multi-level Ensemble Learning
Abstract
Feature selection is an essential part of the preprocessing phase of data mining, aiming to select the most relevant subset of features from the original data set. Traditional feature selection methods assume that the data set is static and unchanging. In real applications, however, data may be generated and processed dynamically. For this reason, online streaming feature selection methods have emerged, which handle features that arrive one by one in a streaming manner. Most existing online streaming feature selection methods focus on scalability, high accuracy, and low time overhead while ignoring the stability of the algorithm. Stable feature selection results can strengthen users' trust in an algorithm and give it practical value. To address the stability problem of online feature selection, a new online stable streaming feature selection framework, Multi-level Ensemble Learning Stream Feature Selection (MESFS), is proposed based on a multi-level ensemble learning strategy. Specifically, at the data set level, an Extreme Learning Machine (ELM) is used to group and map samples to improve the accuracy of the algorithm. At the feature selection level, multiple iterations and an adaptive threshold adjustment strategy are used to compute feature weights and perform selection, reducing the volatility and randomness of the feature selection results. The framework was compared against four traditional static feature selection algorithms and five advanced online streaming feature selection algorithms in extensive experiments on public data sets such as UCI, ARFF, and NIPS. The results show that, under perturbation of the training data, the proposed method achieves a good balance between prediction accuracy and stability.
Keywords
feature selection / streaming feature / stability / ensemble learning / extreme learning machine
Classification
Information Technology and Security Science
Cite this article
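Wang Qi (王琦), Zhou Peng (周鹏), Zhang Yanping (张燕平). Online Stable Streaming Feature Selection Algorithm Using Multi-level Ensemble Learning [J]. Computer Technology and Development, 2025, 35(3): 1-8, 8.
Funding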
National Natural Science Foundation of China, General Program (62376001)
Anhui Provincial Natural Science Foundation, General Program (2308085MF215)