Computer Technology and Development, 2016, Vol. 26, Issue 7: 1-5 (5 pages). DOI: 10.3969/j.issn.1673-629X.2016.07.001
Research on Association Rules Mining Algorithm Based on Hadoop: Taking Apriori as an Example
Abstract
Traditional association rule mining algorithms can no longer meet the efficiency and scalability demands of mining large volumes of data. To address this, the paper takes Apriori as an example and parallelizes it on the Hadoop framework using the MapReduce model. On this basis, the algorithm is further improved with the transaction reduction method to raise its mining efficiency. Experiments on a Hadoop cluster evaluate both the mining results and the efficiency: they verify the parallel mining results, compare the efficiency of the serial and parallel versions, and examine how mining time varies with the number of nodes and with the data volume. The results show that the parallel Apriori implementation mines frequent itemsets accurately and offers better performance and scalability, so it is better suited to big data mining and can efficiently extract frequent itemsets and association rules from large datasets.
Keywords
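data mining / association rules / Hadoop / Apriori
Classification
Information Technology and Security Science
Cite this article
Liu Mulin, Zhu Qinghua. Research on Association Rules Mining Algorithm Based on Hadoop: Taking Apriori as an Example [J]. Computer Technology and Development, 2016, 26(7): 1-5.
Funding
National Natural Science Foundation of China, General Program (71473114)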