摘要
Abstract
Nowadays, the data stream classification research about concept drift has gained a lot of achievements. However, because of neglecting of the situation that concepts recur in the data steam, most of research methods will not only lead to high computation complexity and large memory overhead, but affect the classification accuracy. To solve this problem, based on the repeatability of concept, this paper proposes an ensemble classification algorithm for data stream, which applies ensemble classification theory to process the concept drift in data stream. On the one hand, the algorithm stores the essential information of temporary failure concepts and their corresponding base classifiers for later calls instead of deleting them during the learning process. On the other hand, it predicts the oncoming concept according to transitions between concepts. Therefore, the proposed algorithm can improve the classification accuracy and efficiency. Finally, the experimental results demonstrate the effectiveness of the new algorithm.关键词
数据挖掘/数据流/集成分类/概念漂移/重复性Key words
data mining/data stream/ensemble classification/concept drift/repeatability分类
信息技术与安全科学