Analysis and Implementation of MapReduce Parallelization of the Naïve Bayes Algorithm

Computer Technology and Development (计算机技术与发展), 2013, Issue 3: 23-26, 4. DOI: 10.3969/j.issn.1673-629X.2013.03.006

张依杨 1, 向阳 1, 蒋锐权 2, 张波 3, 张君瑛 4

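Author information

  • 1. Department of Computer Science and Technology, Tongji University, Shanghai 201804
  • 2. Shanghai Stock Exchange, Shanghai 200120
  • 3. College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234
  • 4. Shanghai Chenjiazhen Construction and Development Co., Ltd., Shanghai 202162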

Abstract

Naïve Bayes is an efficient classification algorithm. When processing massive data sets, however, its efficiency is severely limited by memory and I/O resources. This paper proposes a Naïve Bayes algorithm based on the MapReduce programming model. The training set is split before processing, and the core processing is carried out with MapReduce: the Map function extracts and parses the training set, and the Reduce function builds the knowledge base of class and feature attributes. The experiments compare the performance of the traditional algorithm with that of the improved parallel algorithm. The results show that the parallel Naïve Bayes algorithm achieves good efficiency and high scalability in mass data processing.
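The division of labor described in the abstract can be illustrated with a minimal in-process sketch. It is not the authors' implementation and does not use Hadoop: the record format (`label<TAB>f1,f2,...`), the function names `map_record`, `reduce_counts`, and `train`, and the toy data are all assumptions made for illustration. The map step parses each training record into count pairs, and the reduce step sums them into class and feature count tables, which stand in for the knowledge base.

```python
from collections import defaultdict
from itertools import chain


def map_record(line):
    """Map step (assumed format 'label<TAB>f1,f2,...'): emit (key, 1) pairs."""
    label, features = line.rstrip("\n").split("\t")
    # One pair for the class itself, used later for the class prior.
    yield ("class", label), 1
    # One pair per (class, attribute position, attribute value).
    for index, value in enumerate(features.split(",")):
        yield ("feature", label, index, value), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def train(splits):
    """Map every record of every split, then reduce all emitted pairs."""
    mapped = chain.from_iterable(
        map_record(line) for split in splits for line in split
    )
    return reduce_counts(mapped)


# Hypothetical training set, already cut into two splits.
splits = [
    ["yes\tsunny,hot", "no\trainy,cool"],
    ["yes\tsunny,cool"],
]
knowledge_base = train(splits)
```

Because each split is mapped independently and the reduce step only adds counts, the splits could be processed on separate workers; conditional probabilities would then be derived from the summed counts at classification time.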

Key words

Naïve Bayes algorithm/parallel computing/MapReduce

Classification

Information Technology and Security Science

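Cite this article

张依杨, 向阳, 蒋锐权, 张波, 张君瑛. Analysis and Implementation of MapReduce Parallelization of Naïve Bayes Algorithm [J]. 计算机技术与发展 (Computer Technology and Development), 2013, (3): 23-26, 4.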

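Funding

National Natural Science Foundation of China (61103069, 71170148)

National Science and Technology Program project (2012BAD35B01)

Shanghai Science and Technology Innovation Program (11DZ 1501703)

Shanghai Special Fund for Informatization Development (20091015)

Shanghai Science and Technology Innovation Program (Chenjiazhen) (11DZ1210600)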

Journal: 计算机技术与发展 (Computer Technology and Development), OA, CSTPCD, ISSN 1673-629X
