Research on a Load-Balanced Frequent Itemset Mining Algorithm Based on Hadoop

Zhu Wenfei (朱文飞), Qi Jiandong (齐建东), Hong Jianke (洪剑珂)

Computer Applications and Software, 2016, Vol. 33, Issue 5: 35-39, 5. DOI: 10.3969/j.issn.1000-386x.2016.05.010

RESEARCH ON LOAD BALANCED FREQUENT ITEMSETS MINING ALGORITHM BASED ON HADOOP

Author information

  • 1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083

Abstract

Frequent itemset mining (FIM) is an important component of association rule mining algorithms. However, the classical Apriori and FP-Growth algorithms run into bottlenecks in memory usage and computational performance when processing massive data. Building on the Hadoop cloud computing platform, we propose HBFP, a frequent itemset mining algorithm suited to big data processing, and design a data partitioning scheme based on suffix pattern conversion together with a balanced task grouping scheme. As a result, each node holds locally the data its computation relies on, the nodes mine in parallel independently of one another, and the algorithm's load stays balanced globally. Experimental results indicate that HBFP distributes the computational load uniformly across compute nodes and runs the FP-Growth mining process on them in parallel and independently. The algorithm's efficiency improved by about 12%, and its global stability improved as well.

Key words

Frequent itemsets mining/FP-Growth/Hadoop/Parallel computing

Classification

Information Technology and Security Science

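Cite this article

Zhu Wenfei, Qi Jiandong, Hong Jianke. Research on load balanced frequent itemsets mining algorithm based on Hadoop [J]. Computer Applications and Software, 2016, 33(5): 35-39, 5.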

Funding

Key Project of the State Forestry Administration (2013-05); Twelfth Five-Year Plan Science and Technology Support Project (2011BAH10B04).

Computer Applications and Software

OA, CSTPCD

ISSN 1000-386X
