Computer Engineering and Applications, 2019, Vol. 55, Issue 9: 1-9, 9. DOI: 10.3778/j.issn.1002-8331.1811-0425
Survey of Spark-Based Parallel Association Rules Mining Algorithm
Abstract
Association rule mining is an important branch of data mining. However, with the rapid growth of data volumes, traditional association rule mining algorithms cannot adapt well to the requirements of big data, and a breakthrough must be sought on distributed and parallel computing platforms. Spark is a parallel computing framework well suited to big data processing and iterative computation. Compared with MapReduce, it is more efficient, makes fuller use of memory, and is better suited to iterative computation and interactive processing. This paper classifies and summarizes the existing Spark-based parallel association rule mining algorithms and analyzes their advantages, disadvantages, and scope of application, providing a reference for future research.
Keywords
Spark / parallel / association rule mining / Apriori / FP-Growth
Classification
Information Technology and Security Science
Cite this article
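LIU Liping, ZHANG Xinyou, NIU Xiaolu, GUO Yongkun, DING Liang. Survey of Spark-Based Parallel Association Rules Mining Algorithm[J]. Computer Engineering and Applications, 2019, 55(9): 1-9, 9.
Funding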
National Natural Science Foundation of China (No. 81660727).