Online Active Learning Method for Imbalanced Data Stream

Li Yanhong (李艳红), Ren Lin (任霖), Wang Suge (王素格), Li Deyu (李德玉)

Acta Automatica Sinica (自动化学报), 2024, Vol. 50, Issue 7: 1389-1401, 13. DOI: 10.16383/j.aas.c211246

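Author information

  • 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006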

Abstract

Data stream classification is an important research task in data stream mining; it aims to capture changing class structures in massive, continuously evolving data. At present, almost no framework simultaneously addresses the problems common in data streams, such as multi-class imbalance, concept drift, outliers, and the high cost of labeling unlabeled samples. In this paper, we propose an online active learning method for imbalanced data streams (OALM-IDS). AdaBoost is an ensemble classification method that iteratively builds a strong classifier from multiple weak classifiers. AdaBoost.M2 further introduces a confidence degree for the weak classifiers, but is suited to static data. In the proposed method, we first define an importance measure for training samples based on the imbalance ratio and an adaptive forgetting factor, which makes AdaBoost.M2 applicable to imbalanced data streams and improves the performance of the ensemble classifier. Then, we propose an adaptive adjustment method for the margin threshold matrix, which optimizes the label request strategy. Finally, we define the adaptive forgetting factor based on a concept drift index, bringing the degree of concept drift into model construction so that the model can be rebuilt after drift. Comparative experiments on six artificial data streams and four real data streams show that the proposed online active learning method achieves better classification performance than five existing learning methods for imbalanced data streams.
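The abstract does not give the OALM-IDS formulas, so the following is only a minimal Python sketch of the two general ideas it names: weighting samples by class rarity under a forgetting factor, and requesting a label when the prediction margin falls below a threshold. The class `DecayedClassCounter`, the functions `margin` and `should_request_label`, the fixed `forgetting_factor`, and the scalar `threshold` are all hypothetical simplifications, not the paper's adaptive forgetting factor or threshold matrix.

```python
from collections import defaultdict


class DecayedClassCounter:
    """Hypothetical helper: class frequencies with exponential forgetting."""

    def __init__(self, forgetting_factor=0.9):
        # Assumption: a fixed factor; the paper adapts it to concept drift.
        self.forgetting_factor = forgetting_factor
        self.counts = defaultdict(float)

    def update(self, label):
        # Decay every existing count, then credit the observed class.
        for key in self.counts:
            self.counts[key] *= self.forgetting_factor
        self.counts[label] += 1.0

    def importance(self, label):
        # Rarer classes get larger weights:
        # (largest decayed count) / (decayed count of this class).
        count = self.counts.get(label, 0.0)
        if count == 0.0:
            return 1.0  # unseen class: neutral weight (an arbitrary choice)
        return max(self.counts.values()) / count


def margin(probabilities):
    """Gap between the two largest predicted class probabilities."""
    top_two = sorted(probabilities, reverse=True)[:2]
    return top_two[0] - top_two[1]


def should_request_label(probabilities, threshold):
    """Ask for a label only when the classifier is uncertain."""
    return margin(probabilities) < threshold
```

With `forgetting_factor=1.0` the counter reduces to plain class counts; smaller values let recent samples dominate, which is the role a forgetting factor plays when the class distribution drifts.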

Key words

Active learning / data stream classification / multi-class imbalance / concept drift

Cite this article

Li Yanhong, Ren Lin, Wang Suge, Li Deyu. Online active learning method for imbalanced data stream [J]. Acta Automatica Sinica, 2024, 50(7): 1389-1401, 13.

Funding

Supported by National Natural Science Foundation of China (62076158, 62072294, 41871286) and Shanxi Key Research and Development Program (201903D421041)

Acta Automatica Sinica (自动化学报)

OA, Peking University Core Journal, CSTPCD

ISSN 0254-4156
