Computer Engineering and Applications, 2012, Vol. 48, Issue 27: 22-26, 68, 6. DOI: 10.3778/j.issn.1002-8331.2012.27.005
Improved Canopy-Kmeans algorithm based on MapReduce
Abstract
To avoid random Canopy selection in the Canopy-Kmeans algorithm, this paper proposes an improved algorithm based on the minimum and maximum principle and implements it on the MapReduce framework to process massive data. The algorithm is applied to the aggregation of massive Internet news. Experiments show that Canopy selection based on the minimum and maximum principle achieves higher classification accuracy and better noise immunity than the random strategy.
Key words
Canopy-Kmeans algorithm / MapReduce / distributed clustering
Classification
Information technology and security science
Cite this article
毛典辉 (Mao Dianhui). Improved Canopy-Kmeans algorithm based on MapReduce [J]. Computer Engineering and Applications, 2012, 48(27): 22-26, 68, 6.
Funding
National Natural Science Foundation of China (No. 2009ZX05038-001)
Beijing Municipal Universities Science and Technology and Graduate Education Innovation Project (No. PXM2012_014213_000037)
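Illustrative sketch
The abstract names a Canopy selection strategy based on the minimum and maximum principle but does not spell out its steps. The Python sketch below is one possible reading, assuming the principle means the max-min distance criterion (farthest-first traversal): each new center is the point whose distance to its nearest already-chosen center is largest. The function name `max_min_canopy_centers`, the parameter `k`, and the choice of the first point as seed are hypothetical and are not taken from the paper. The sketch runs on a single machine and does not reproduce the MapReduce implementation.

```python
import math


def max_min_canopy_centers(points, k):
    """Pick up to k centers by the max-min distance criterion.

    Assumption: this is an illustrative reading of the "minimum and
    maximum principle", not the paper's exact procedure.
    """
    if not points or k <= 0:
        return []
    centers = [points[0]]  # assumption: seed with the first point
    # min_dist[i] is the distance from points[i] to its nearest chosen center
    min_dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # Choose the point that is farthest from all current centers
        idx = max(range(len(points)), key=min_dist.__getitem__)
        if min_dist[idx] == 0.0:
            break  # every remaining point coincides with a chosen center
        centers.append(points[idx])
        # Refresh nearest-center distances against the new center
        for i, p in enumerate(points):
            d = math.dist(p, points[idx])
            if d < min_dist[i]:
                min_dist[i] = d
    return centers


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.0, 0.2), (5.0, 8.0)]
    print(max_min_canopy_centers(sample, 3))
```

Because every chosen center is as far as possible from the ones already picked, two centers do not land in the same dense region the way randomly drawn ones can, which is the kind of behavior the abstract contrasts with the random strategy.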