Computer Engineering and Applications, Issue (6): 124-128, 5. DOI: 10.3778/j.issn.1002-8331.1304-0457
Data stream classifier with limited labelled data
Xiong Zhongyang (熊忠阳) 1, Zhou Xingqin (周兴勤) 1, Zhang Yufang (张玉芳) 1
Author information
- 1. College of Computer Science, Chongqing University, Chongqing 400030
Abstract
Most data stream algorithms address the problems of infinite length and concept drift. However, these algorithms require every instance to be labelled by human experts before it can be used as training data for a classifier. This is impractical in a high-speed data stream environment, because labelling instances is both time-consuming and costly. Moreover, a classifier trained by purely supervised learning on only a small number of labelled instances performs poorly. This paper proposes a classification algorithm for data streams based on active learning. The method selects for labelling only a small subset of instances, namely those classified with low confidence, which greatly reduces the number of instances that need to be labelled. The experimental results show that the proposed method can classify concept-drifting data streams correctly using only a small amount of labelled data.
Keywords
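data streams/classification/concept drifting/active learning
Classification
Information technology and security science
Cite this article
Xiong Zhongyang, Zhou Xingqin, Zhang Yufang. Data stream classifier with limited labelled data [J]. Computer Engineering and Applications, 2015, (6): 124-128, 5.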
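The selection strategy summarized in the abstract (request a label only when the classifier's confidence is low) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the base classifier, the confidence measure, or the drift handling, so the toy one-dimensional `CentroidModel`, the `classify_stream` function, the `oracle` callback, and the `threshold` value below are all hypothetical names and choices made for illustration only.

```python
import math
from typing import Callable, Iterable, List, Tuple


class CentroidModel:
    """Toy 1-D nearest-centroid classifier (an assumption, not the paper's model)."""

    def __init__(self, centroids: List[float]) -> None:
        self.centroids = list(centroids)
        self.counts = [1] * len(self.centroids)

    def predict_proba(self, x: float) -> List[float]:
        # Closer centroids receive exponentially more weight; weights are normalized to sum to 1.
        weights = [math.exp(-abs(x - c)) for c in self.centroids]
        total = sum(weights)
        return [w / total for w in weights]

    def update(self, x: float, label: int) -> None:
        # Running-mean update of the centroid of the labelled class.
        self.counts[label] += 1
        self.centroids[label] += (x - self.centroids[label]) / self.counts[label]


def classify_stream(
    stream: Iterable[float],
    model: CentroidModel,
    oracle: Callable[[float], int],
    threshold: float = 0.8,
) -> Tuple[List[int], int]:
    """Classify each instance; query the oracle only when confidence is below threshold."""
    predictions: List[int] = []
    queried = 0
    for x in stream:
        probs = model.predict_proba(x)
        label = max(range(len(probs)), key=probs.__getitem__)
        if max(probs) < threshold:
            # Low confidence: ask the (hypothetical) human expert and learn from the answer.
            label = oracle(x)
            model.update(x, label)
            queried += 1
        predictions.append(label)
    return predictions, queried


# Usage: two classes around 0 and 5; the oracle stands in for a human labeller.
stream = [0.1, 4.9, 2.4, 2.6, 0.0, 5.2]
model = CentroidModel([0.0, 5.0])
predictions, queried = classify_stream(stream, model, oracle=lambda x: int(x > 2.5))
print(predictions, queried)
```

In this toy run only the two instances near the class boundary trigger a label request, while the remaining four are classified without human input. A higher `threshold` would request more labels and a lower one fewer.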