Research on Association Rules Mining Algorithm Based on Hadoop: Taking Apriori as an Example

Liu Mulin, Zhu Qinghua

Computer Technology and Development, 2016, Vol. 26, Issue 7: 1-5, 5.

DOI: 10.3969/j.issn.1673-629X.2016.07.001

Research on Association Rules Mining Algorithm Based on Hadoop: Taking Apriori as an Example

Liu Mulin¹, Zhu Qinghua¹

Author information

1. School of Information Management, Nanjing University, Nanjing, Jiangsu 210023

Abstract

Traditional association rule mining algorithms can no longer meet the efficiency and scalability demands of mining large volumes of data. To address this problem, taking Apriori as an example, the algorithm is parallelized on the Hadoop framework using the MapReduce model. On this basis, it is further improved with the transaction reduction method to raise its mining efficiency. Experiments on a Hadoop cluster evaluate both the mining results and the efficiency: verification of the parallel mining results, an efficiency comparison between the serial and parallel versions, and the relationships between mining time and number of nodes and between mining time and data volume. The experiments show that the parallelized Apriori algorithm mines frequent itemsets accurately, with better performance and scalability. It is therefore better suited to big data mining and can efficiently mine frequent itemsets and association rules from large datasets.
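The approach summarized in the abstract can be illustrated with a small Python sketch. It rests on several assumptions: the paper's actual Hadoop job layout is not given on this page, so the map and reduce phases are simulated in a single process, and the names `map_count`, `reduce_count`, `generate_candidates`, `apriori_mapreduce`, `min_support` and `n_splits` are hypothetical. Transaction reduction is shown in its common textbook form: a transaction that contains no frequent k-itemset cannot contain a frequent (k+1)-itemset, so it is dropped before the next pass.

```python
from collections import Counter
from itertools import combinations


def map_count(split, candidates):
    """Map phase: emit (candidate, 1) for each candidate contained in a transaction."""
    for transaction in split:
        for cand in candidates:
            if cand <= transaction:
                yield cand, 1


def reduce_count(pairs, min_support):
    """Reduce phase: sum the counts per candidate and keep those meeting min_support."""
    counts = Counter()
    for cand, n in pairs:
        counts[cand] += n
    return {cand: n for cand, n in counts.items() if n >= min_support}


def generate_candidates(frequent, k):
    """Build k-itemsets whose every (k-1)-subset is frequent (the Apriori property)."""
    items = sorted({item for itemset in frequent for item in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    ]


def apriori_mapreduce(transactions, min_support, n_splits=2):
    """Level-wise Apriori with simulated map/reduce passes and transaction reduction.

    min_support is an absolute count; n_splits stands in for the number of
    input splits a cluster would process in parallel (an assumption).
    """
    data = [frozenset(t) for t in transactions]
    candidates = [frozenset([i]) for i in sorted({i for t in data for i in t})]
    result = {}
    k = 1
    while candidates and data:
        # Partition the remaining transactions as a cluster would split its input.
        splits = [data[i::n_splits] for i in range(n_splits)]
        pairs = (p for split in splits for p in map_count(split, candidates))
        frequent = reduce_count(pairs, min_support)
        if not frequent:
            break
        result.update(frequent)
        # Transaction reduction: drop transactions with no frequent k-itemset.
        data = [t for t in data if any(f <= t for f in frequent)]
        k += 1
        candidates = generate_candidates(frequent, k)
    return result
```

Calling `apriori_mapreduce([{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}], min_support=2)` returns a dictionary that maps each frequent itemset to its support count. On a real cluster each split would be handled by a separate mapper, and the candidate set would have to be distributed to every node on each pass.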

Key words

data mining/association rules/Hadoop/Apriori

Classification

Information Technology and Security Science

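Cite this article

Liu Mulin, Zhu Qinghua. Research on Association Rules Mining Algorithm Based on Hadoop: Taking Apriori as an Example[J]. Computer Technology and Development, 2016, 26(7): 1-5, 5.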

Funding

General Program of the National Natural Science Foundation of China (71473114)

Computer Technology and Development

OA, CSTPCD

ISSN: 1673-629X
