Modern Electronics Technique, 2016, Vol. 39, Issue (15): 156-159, 163, 5. DOI: 10.16652/j.issn.1004-373x.2016.15.040
A data mining parallel algorithm based on dynamic scheduling (一种基于动态调度的数据挖掘并行算法)
Abstract
Compared with other classification algorithms, the naive Bayes algorithm is well suited to parallelization. Single-machine classification algorithms are unsuitable for large-scale data processing; however, the emergence of dynamic scheduling in cloud computing offers a way forward for big data processing. Parallel computing is used to increase classification speed, and the algorithm is adapted accordingly to exploit the easily parallelized nature of naive Bayes. Performance comparison experiments were carried out in single-machine serial mode and on the Hadoop, Spark, and CUDA platforms. The results verify that the parallelized classification algorithm achieves a good speedup ratio.
Key words
data mining / naive Bayes / dynamic scheduling / parallel computing
Classification
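Information technology and security science
Cite this article
ZHU Yaqiong, LUO Wei. A data mining parallel algorithm based on dynamic scheduling [J]. Modern Electronics Technique, 2016, 39(15): 156-159, 163, 5.
Funding
Key Scientific Research Project of Higher Education Institutions, Education Department of Henan Province: Research and Application of an SOA-Based University Data Integration Platform ()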