Computer Engineering and Applications (计算机工程与应用), Issue (8): 6-8, 47, 4. DOI: 10.3778/j.issn.1002-8331.1211-0090
Data Stream Clustering Algorithm Based on Semi-supervised Affinity Propagation
Abstract
To improve clustering quality on evolving data streams, this paper introduces a new data stream clustering algorithm, SAPStream (clustering over data Stream based on Semi-supervised Affinity Propagation). The algorithm computes the similarity matrix of the initial data using a semi-supervised approach, runs affinity propagation (AP) clustering, and then builds an online clustering model. As the data stream evolves, the clustering model is adjusted using a decay window technique, and the clustering results are obtained by re-running the clustering over the exemplars and the newly arrived data points. SAPStream can analyze and process large-scale evolving data streams. Its performance is tested on both real and synthetic datasets. Experimental results show that the algorithm achieves higher clustering quality.
Keywords
data stream / semi-supervised / affinity propagation clustering / decay windows
Classification
Information Technology and Security Science
Cite this article
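WANG Wenshuai (王文帅), CHEN Gang (陈刚). Data stream clustering algorithm based on semi-supervised affinity propagation [J]. Computer Engineering and Applications, 2013, (8): 6-8, 47, 4.
Funding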
Key Program of the National Natural Science Foundation of China (No. 90912004)
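Illustrative sketch

The abstract names two ingredients without giving formulas: a similarity matrix built with semi-supervised information, and a decay window that adjusts the online model. The following is a minimal sketch of how such ingredients are commonly realized, not the authors' implementation. Everything specific here is an assumption: the negative squared Euclidean similarity, the must-link/cannot-link constraint handling, the exponential decay form, and the hypothetical names `apply_constraints`, `decay_weight`, `prune_exemplars`, `lam`, and `threshold`. The AP clustering step itself is omitted; any standard AP routine could be run on the adjusted matrix.

```python
import math


def similarity_matrix(points):
    """Negative squared Euclidean distance (assumed; a common AP similarity)."""
    n = len(points)
    return [[-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]


def apply_constraints(sim, must_link=(), cannot_link=()):
    """Assumed semi-supervised adjustment: must-link pairs receive the
    maximum similarity (0), cannot-link pairs receive -inf.
    Returns a new matrix and leaves the input untouched."""
    out = [row[:] for row in sim]
    for i, j in must_link:
        out[i][j] = out[j][i] = 0.0
    for i, j in cannot_link:
        out[i][j] = out[j][i] = -math.inf
    return out


def decay_weight(age, lam=0.5):
    """Assumed exponential decay: the weight halves every 1/lam time units."""
    return 2.0 ** (-lam * age)


def prune_exemplars(exemplars, now, lam=0.5, threshold=0.1):
    """Decay each exemplar's weight to time `now` and drop those that fall
    below `threshold`. `exemplars` maps an id to (weight, last_update_time)."""
    kept = {}
    for key, (weight, last_seen) in exemplars.items():
        decayed = weight * decay_weight(now - last_seen, lam)
        if decayed >= threshold:
            kept[key] = (decayed, now)
    return kept


# Usage: adjust similarities with side information, then age the model.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
sim = apply_constraints(similarity_matrix(points),
                        must_link=[(0, 2)], cannot_link=[(0, 1)])
model = prune_exemplars({"a": (1.0, 0), "b": (1.0, 8)}, now=10)
```

In this usage, exemplar "a" was last updated 10 time units ago and is dropped, while "b" is retained with a reduced weight. In a stream setting, the surviving exemplars would then be re-clustered together with newly arrived points, as the abstract describes.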