IABS: An Improved Apriori Algorithm Based on Spark (IABS:一个基于Spark的Apriori改进算法)

闫梦洁 罗军 刘建英 侯传旺

计算机应用研究 (Application Research of Computers), 2017, Vol. 34, Issue 8: 2274-2277, 4. DOI: 10.3969/j.issn.1001-3695.2017.08.007

IABS: parallel improved Apriori algorithm based on Spark

闫梦洁 1, 罗军 1, 刘建英 1, 侯传旺 1

Author information

  • 1. College of Computer, National University of Defense Technology, Changsha 410073, China

Abstract

Apriori is one of the most classical algorithms in association rule mining, and its core problem is the generation of frequent itemsets. First, to address known problems of the classical Apriori algorithm, namely that it must scan the global transaction database several times and must generate candidate itemsets, this paper optimizes it by transforming the storage structure and eliminating the candidate itemset generation step. Second, with the advent of the big data era, data volumes grow daily and the classical Apriori algorithm faces a severe challenge. Building on the improved Apriori algorithm and the Spark platform, this paper proposes the IABS algorithm, which makes full use of Spark features such as in-memory computation and resilient distributed datasets. Compared with existing similar algorithms, the sizeup and node scalability of IABS are validated, and IABS achieves an average performance improvement of 23.88% across various benchmarks. The improvement becomes more pronounced as the data grows.
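The abstract does not spell out the storage structure transformation, so the following is only a minimal illustrative sketch under one common assumption: the horizontal transaction list is converted once into a vertical layout (item to set of transaction IDs), after which support is counted by set intersection instead of rescanning the database. The function names `to_vertical` and `frequent_itemsets` are hypothetical, the code is plain single-machine Python rather than Spark, and it still extends itemsets level by level, so it should not be read as the IABS algorithm itself.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Map each item to the set of transaction IDs (tidset) that contain it."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def frequent_itemsets(transactions, min_support):
    """Return {itemset as sorted tuple: support count} for all frequent itemsets.

    Assumption: min_support is an absolute count, and items are mutually
    comparable (for example, all strings).
    """
    vertical = to_vertical(transactions)  # the only pass over the raw data
    # Frequent 1-itemsets, each kept together with its tidset.
    level = {
        (item,): tids
        for item, tids in vertical.items()
        if len(tids) >= min_support
    }
    result = {}
    while level:
        result.update({itemset: len(tids) for itemset, tids in level.items()})
        keys = sorted(level)
        next_level = {}
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                if a[:-1] != b[:-1]:
                    break  # keys are sorted, so no later key shares this prefix
                # Support of the joined itemset is the size of the intersection.
                tids = level[a] & level[b]
                if len(tids) >= min_support:
                    next_level[a + (b[-1],)] = tids
        level = next_level
    return result
```

In a distributed setting the tidsets would presumably live in a cached distributed collection so that each level reuses in-memory data, but that mapping is an assumption and is not shown here.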

Key words

Apriori algorithm / frequent itemset / storage structure transformation / Spark / in-memory computation

Classification

Information technology and security science

Cite this article

闫梦洁, 罗军, 刘建英, 侯传旺. IABS: parallel improved Apriori algorithm based on Spark [J]. 计算机应用研究 (Application Research of Computers), 2017, 34(8): 2274-2277, 4.

Funding

National "863" Program of China (2014AA01A302)

计算机应用研究 (Application Research of Computers), ISSN 1001-3695. Indexing: OA, 北大核心 (PKU Core), CSCD, CSTPCD
