Computer Engineering, 2011, Vol. 37, Issue 17: 23-25, 3.
Research on Bisecting K-Means Clustering Algorithm Optimization and Parallelism
Abstract
To address the slow clustering caused by the way the Bisecting K-Means (BKM) clustering algorithm selects its initial centroids, the two patterns separated by the maximum distance are chosen as the initial cluster centroids. How to accelerate clustering on a cluster system is studied and analyzed in depth. Based on the characteristics of BKM, a parallel algorithm built on data parallelism and symmetric data partitioning is proposed. Experimental results show that the improved algorithm achieves good speedup and efficiency.
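The centroid-selection idea in the abstract can be illustrated with a minimal serial sketch. It is not the authors' implementation: the function names (`farthest_pair`, `bisect`, `bisecting_kmeans`), the brute-force O(n^2) pair search, and the rule of always splitting the cluster with the largest sum of squared errors are all assumptions made for illustration, and the parallel data-partitioned version is not shown.

```python
import math
from itertools import combinations


def farthest_pair(points):
    """Return the two points with the largest Euclidean distance (brute force)."""
    return max(combinations(points, 2), key=lambda pq: math.dist(pq[0], pq[1]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = mean(points)
    return sum(math.dist(p, center) ** 2 for p in points)


def bisect(points, max_iter=100):
    """Split one cluster in two with 2-means seeded by the farthest pair."""
    c0, c1 = farthest_pair(points)
    left, right = [], []
    for _ in range(max_iter):
        new_left = [p for p in points if math.dist(p, c0) <= math.dist(p, c1)]
        new_right = [p for p in points if math.dist(p, c0) > math.dist(p, c1)]
        if not new_left or not new_right:
            break  # degenerate split; keep the last valid partition
        left, right = new_left, new_right
        n0, n1 = mean(left), mean(right)
        if (n0, n1) == (c0, c1):
            break  # centroids stopped moving
        c0, c1 = n0, n1
    return left, right


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE (an assumed rule)."""
    clusters = [list(points)]
    while len(clusters) < k:
        i = max(range(len(clusters)), key=lambda j: sse(clusters[j]))
        if len(clusters[i]) < 2:
            break  # nothing left to split
        left, right = bisect(clusters[i])
        if not left or not right:
            break  # all points coincide; no further split is possible
        clusters[i:i + 1] = [left, right]
    return clusters
```

Calling `bisecting_kmeans(points, k)` on a list of coordinate tuples returns up to `k` lists of points. Seeding each split with the farthest pair makes the result deterministic and starts the two centroids in different regions of the cluster. The exhaustive pair search is quadratic in the cluster size, so this sketch only suits small inputs.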
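Keywords
data mining / clustering algorithm / Bisecting K-Means (BKM) / parallelism / cluster system
Classification
Information Technology and Security Science
Cite this article
Zhang Junwei (张军伟), Wang Nianbin (王念滨), Huang Shaobin (黄少滨), Lin Shiming (蔺世明). Research on Bisecting K-Means Clustering Algorithm Optimization and Parallelism[J]. Computer Engineering, 2011, 37(17): 23-25, 3.
Funding
National Natural Science Foundation of China (60973028)
National Key Technology R&D Program of China (2009BAH42B02)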