Computer Engineering and Applications, Issue (16): 47-54, 8. DOI: 10.3778/j.issn.1002-8331.1406-0183
An Efficient Parallel K-Medoids Algorithm Based on Hadoop
WANG Yonggui (王永贵) 1, DAI Wei (戴伟) 1, WU Chao (武超) 1
Author information
- 1. School of Software Engineering, Liaoning Technical University, Huludao, Liaoning 125105
Abstract
The traditional K-Medoids algorithm is sensitive to the choice of initial cluster centers and converges slowly, and on large data sets it runs into memory and CPU bottlenecks. To address these problems, this paper improves the initial center selection and the center replacement strategy, and combines the Hadoop distributed computing platform with a Top K based parallel random sampling strategy to implement an efficient and stable parallel K-Medoids algorithm. Tuning the Hadoop platform optimizes the algorithm further. Experiments show that the improved K-Medoids algorithm achieves good speedup and also improves convergence and clustering accuracy.
Keywords
K-Medoids / distributed computing / Hadoop / parallel sampling
Classification
Information Technology and Security Science
Cite this article
WANG Yonggui, DAI Wei, WU Chao. An efficient parallel K-Medoids algorithm based on Hadoop [J]. Computer Engineering and Applications, 2015, (16): 47-54, 8.
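The abstract names a Top K based parallel random sampling strategy but gives no details of it. One common way to realize such a step, offered here only as a sketch under assumptions and not as the authors' method, is to tag every record with a random key, let each mapper keep its K records with the largest keys, and have a single reducer keep the K largest keys overall. The function names `map_sample`, `reduce_sample`, and `parallel_top_k_sample` are hypothetical, and plain Python lists stand in for Hadoop input splits.

```python
import heapq
import random


def map_sample(partition, k, rng):
    """Mapper side (assumed): tag each record with a random key
    and keep only the k records with the largest keys."""
    tagged = ((rng.random(), record) for record in partition)
    return heapq.nlargest(k, tagged, key=lambda pair: pair[0])


def reduce_sample(partial_samples, k):
    """Reducer side (assumed): merge the per-partition candidates
    and keep the k records with the largest keys overall."""
    merged = (pair for part in partial_samples for pair in part)
    top = heapq.nlargest(k, merged, key=lambda pair: pair[0])
    return [record for _, record in top]


def parallel_top_k_sample(partitions, k, seed=0):
    """Simulate the map and reduce phases sequentially.
    On a real cluster each partition would be handled by its own mapper."""
    rng = random.Random(seed)
    partials = [map_sample(p, k, rng) for p in partitions]
    return reduce_sample(partials, k)


if __name__ == "__main__":
    # Three hypothetical input splits holding record ids 0..29.
    splits = [list(range(0, 10)), list(range(10, 25)), list(range(25, 30))]
    print(parallel_top_k_sample(splits, k=5, seed=1))
```

Because the keys are independent and uniformly distributed, the K records with the largest keys form a uniform random sample without replacement, and each mapper sends at most K candidates to the reducer regardless of the size of its split. Such a sample could then be used to pick initial medoids, although how the paper uses it is not stated in the abstract.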