Application Research of Computers, Issue (1): 115-117, 3. DOI: 10.3969/j.issn.1001-3695.2016.01.027
Average multinomial naive Bayesian text classification based on MapReduce
He Min (何敏) 1, Wu De'an (武德安) 1, Wu Lei (吴磊) 1
Author information
- 1. School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731
Abstract
This paper studies how to classify massive text data. Building on the MapReduce distributed computing framework, it implements a simple and effective algorithm: an average multinomial naive Bayes text classification method. Experiments show that the method achieves higher classification accuracy than the general Bayesian method and also performs well in terms of speedup. The results indicate that it is better suited to classifying massive text data, because it reduces the impact of redundant feature information and benefits from the good scalability of parallel computing.
Keywords
text classification / naive Bayes / parallel computing / redundant features / big data
Classification
Information Technology and Security Science
Cite this article
He Min, Wu De'an, Wu Lei. Average multinomial naive Bayesian text classification based on MapReduce [J]. Application Research of Computers, 2016, (1): 115-117, 3.
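The abstract describes training a multinomial naive Bayes classifier with MapReduce. The sketch below illustrates only the general idea and is not the authors' implementation. It uses standard multinomial naive Bayes with Laplace smoothing, and it does not reproduce the paper's averaging step, which the abstract does not specify. The map and reduce phases are simulated in a single process rather than run on a cluster, and all function names (`map_doc`, `reduce_counts`, `train`, `classify`) are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import chain


def map_doc(doc):
    """Map step: emit ((label, word), 1) pairs for one labeled document."""
    label, text = doc
    yield (label, None), 1  # None marks a document count, used for the class prior
    for word in text.lower().split():
        yield (label, word), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def train(docs):
    """Run map and reduce over all documents and organize the counts."""
    totals = reduce_counts(chain.from_iterable(map_doc(d) for d in docs))
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    for (label, word), n in totals.items():
        if word is None:
            doc_counts[label] = n
        else:
            word_counts[label][word] = n
    vocab = {w for counts in word_counts.values() for w in counts}
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Pick the class with the highest smoothed log-posterior."""
    doc_counts, word_counts, vocab = model
    n_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / n_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    docs = [
        ("sports", "ball game team win"),
        ("sports", "team score goal"),
        ("tech", "code software bug"),
        ("tech", "software release code"),
    ]
    model = train(docs)
    print(classify("team win game", model))
```

Because the map step touches each document independently and the reduce step only sums counts per key, the counting work can be spread across machines, which is the property the abstract's speedup claim relies on.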