Acta Automatica Sinica (自动化学报), 2011, Vol. 37, Issue 9: 1077-1084 (8 pages). DOI: 10.3724/SP.J.1004.2011.01077
A Double-window-based Classification Algorithm for Concept Drifting Data Streams (一种基于双层窗口的概念漂移数据流分类算法)
Abstract
Tracking concept drifts in data streams has recently become a hot topic in data mining. Most existing work relies on a single-window mechanism to detect concept drifts. Because of the inherent limitations of a single window, handling different types of drift remains a challenge. Motivated by this, this paper proposes DWCDS, a new classification algorithm for concept drifting data streams based on a double-window mechanism. Built on an ensemble classifier of random decision trees, the algorithm uses the double-window mechanism to detect concept drifts periodically and updates the model dynamically to adapt to them. Extensive studies on both synthetic and real-world data demonstrate that DWCDS can detect concept drifts in streaming data quickly and efficiently, and that its robustness to noise and its classification accuracy are also improved significantly.
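The abstract does not say how the two windows are compared, so the following is only a minimal sketch of the general idea of window-based drift detection, not the DWCDS procedure itself. It assumes that drift is flagged when the classifier's error rate in the most recent window exceeds the error rate in an older reference window by a fixed margin. The class name `DoubleWindowDriftDetector`, the parameters `window_size` and `threshold`, and the comparison rule are all hypothetical choices made for illustration.

```python
from collections import deque


class DoubleWindowDriftDetector:
    """Illustrative two-window error-rate comparison (NOT the DWCDS algorithm).

    Assumption: drift is flagged when the error rate in the recent window
    exceeds the error rate in the older reference window by `threshold`.
    """

    def __init__(self, window_size=50, threshold=0.2):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = deque(maxlen=window_size)  # older prediction outcomes
        self.recent = deque(maxlen=window_size)     # newest prediction outcomes

    def add(self, is_error):
        """Record one prediction outcome; return True if drift is flagged."""
        if len(self.recent) == self.window_size:
            # The item about to leave the recent window joins the reference window.
            self.reference.append(self.recent[0])
        self.recent.append(bool(is_error))

        # Wait until both windows are full before comparing them.
        if len(self.reference) < self.window_size:
            return False

        ref_rate = sum(self.reference) / len(self.reference)
        rec_rate = sum(self.recent) / len(self.recent)
        if rec_rate - ref_rate > self.threshold:
            # Start over so that the new concept becomes the reference.
            self.reference.clear()
            self.recent.clear()
            return True
        return False
```

In a streaming setting, `add` would be called once per labelled example with the outcome of the current model's prediction, and a `True` result would be the cue to rebuild or update the model. How DWCDS actually updates its random decision tree ensemble is not described in the abstract and is not modelled here.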
Key words
Data stream / concept drift / classification / random decision tree / sliding window
Cite this article
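Zhu Qun, Zhang Yuhong, Hu Xuegang, Li Peipei. A Double-window-based Classification Algorithm for Concept Drifting Data Streams [J]. Acta Automatica Sinica, 2011, 37(9): 1077-1084.
Funding
Supported by the National Basic Research Program of China (973 Program) (2009CB326203), the National Natural Science Foundation of China (60975034), the Natural Science Foundation of Anhui Province (090412044), and the Special Fund for Talent Cultivation of the "Thousand Talents Program" Team of the Research Center for Data Mining and Intelligent Computing, Hefei University of Technology (2010HGXJ0715).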