Computer Technology and Development, 2011, Vol. 21, Issue 1: 23-26, 4.
一种基于滑动窗口的流数据聚类算法
A New Streaming Data Cluster Algorithm Based on Sliding Window
Abstract
In a data stream, the most recent data distribution is usually of greatest interest. CluStream is a traditional clustering algorithm based on the landmark model; it does not eliminate expired tuples, so it cannot accurately reflect the current data distribution of the stream. The sliding window is an approximation method that focuses on the most recent data in the stream. To improve the quality and efficiency of data stream clustering analysis, this paper proposes an improved algorithm based on CluStream. A sliding window is used to support data processing, and an improved k-means clustering is used to reduce the amount of computation in the clustering operation. The optimized algorithm eliminates expired tuples promptly while processing newly arrived tuples in real time, and in this way obtains a more accurate analysis result. Compared with CluStream, the optimized algorithm has lower memory overhead and faster data processing, so the outcome of the clustering analysis is more reasonable and clearer.
Keywords
streaming data / clustering / sliding window / improved k-means algorithm
Classification
Information Technology and Security Science
Cite this article
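CAI Niming, WANG Hanhu, CHEN Mei. A New Streaming Data Cluster Algorithm Based on Sliding Window [J]. Computer Technology and Development, 2011, 21(1): 23-26, 4.
Funding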
Supported by the Guizhou Province Science and Technology Program Industrial Research Fund (黔科合GY字[2008]3035)
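
The sliding-window idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the window type or the improved k-means variant, so the code below assumes a count-based window and plain nearest-centroid assignment. The class name `SlidingWindowClusterer` and its parameters are hypothetical. Each cluster keeps only a tuple count and a linear sum, so an expired tuple can be subtracted out without rescanning the window.

```python
from collections import deque
import math


class SlidingWindowClusterer:
    """Illustrative only: count-based sliding window with k running centroids."""

    def __init__(self, initial_centroids, window_size):
        self.window_size = window_size
        self.window = deque()  # (point, cluster_index) in arrival order
        # Per-cluster summary: tuple count and per-dimension linear sum.
        self.counts = [0] * len(initial_centroids)
        self.sums = [[0.0] * len(c) for c in initial_centroids]
        # Seed centroids (an assumption) are used only while a cluster is empty.
        self.seeds = [list(c) for c in initial_centroids]

    def centroid(self, j):
        if self.counts[j] == 0:
            return self.seeds[j]
        return [s / self.counts[j] for s in self.sums[j]]

    def _nearest(self, point):
        return min(range(len(self.seeds)),
                   key=lambda j: math.dist(point, self.centroid(j)))

    def add(self, point):
        # Expire the oldest tuple first so the window never exceeds its size.
        if len(self.window) == self.window_size:
            old, j = self.window.popleft()
            self.counts[j] -= 1
            self.sums[j] = [s - x for s, x in zip(self.sums[j], old)]
        # Assign the new tuple to the nearest current centroid and update it.
        j = self._nearest(point)
        self.window.append((point, j))
        self.counts[j] += 1
        self.sums[j] = [s + x for s, x in zip(self.sums[j], point)]
        return j
```

Calling `add` once per arriving tuple keeps memory bounded by the window size, and the centroids always describe only the tuples still inside the window.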