A New Parallel Algorithm for Mining Frequent Item Sets Based on FP_Growth

Sun Hongyan, Ji Genlin

Journal of Nanjing Normal University (Natural Science Edition), 2016, Vol. 39, Issue (4): 19-24, 6. DOI: 10.3969/j.issn.1001-4616.2016.04.005

New Parallel Algorithm for Mining Frequent Item Sets Based on FP_Growth

Sun Hongyan¹, Ji Genlin¹

Author information

1. School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, Jiangsu, China

Abstract

Frequent item set mining is used to find association rules between items. To obtain the frequent item sets of big data efficiently, this paper proposes a new parallel algorithm for mining frequent item sets based on FP_Growth, named NPFP_Growth (New Parallel algorithm based on FP_Growth). Building on the Map/Reduce parallel computing model and the HDFS distributed storage system, the algorithm improves the storage structure of the local frequent pattern tree and creates such a tree on each node; the longest global frequent item sets are then mined from each branch of the tree. Finally, for item sets whose support does not meet the global minimum support, the support is computed and sent to the corresponding computing node for counting. The parallel mining algorithm NPFP_Growth is implemented. The experimental results show that the algorithm has high computing efficiency and good scalability.
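The abstract gives only an outline of NPFP_Growth, so the following is a minimal Python sketch of the general idea rather than the authors' algorithm. It assumes plain in-memory lists stand in for HDFS data splits, that the map and reduce phases can be simulated by ordinary function calls, and that a textbook FP-tree is an acceptable substitute for the improved storage structure, which the abstract does not describe. All names (`FPNode`, `build_local_fp_tree`, `global_support`, the sample `partitions`) are hypothetical.

```python
from collections import Counter

class FPNode:
    """One node of a textbook FP-tree (not the paper's improved structure)."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def local_item_counts(partition):
    """Map-like step: count each item once per transaction in one partition."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts

def global_frequent_items(partitions, min_support):
    """Reduce-like step: sum local counts and keep items meeting min_support."""
    total = Counter()
    for partition in partitions:
        total.update(local_item_counts(partition))
    return {item: n for item, n in total.items() if n >= min_support}

def build_local_fp_tree(partition, frequent):
    """Build an FP-tree for one partition, ordering items by global frequency."""
    root = FPNode(None)
    for transaction in partition:
        items = sorted((i for i in set(transaction) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root

def branches(node, prefix=()):
    """Yield (path, leaf_count) for every root-to-leaf branch of a tree."""
    for child in node.children.values():
        path = prefix + (child.item,)
        if child.children:
            yield from branches(child, path)
        else:
            yield path, child.count

def global_support(itemset, partitions):
    """Count the transactions, across all partitions, containing the item set."""
    wanted = set(itemset)
    return sum(1 for p in partitions for t in p if wanted <= set(t))

# Hypothetical sample data: two partitions, each standing in for one node.
partitions = [
    [["a", "b", "c"], ["a", "b"], ["a", "d"]],
    [["a", "b", "c"], ["b", "c"], ["e"]],
]
frequent = global_frequent_items(partitions, min_support=3)
trees = [build_local_fp_tree(p, frequent) for p in partitions]
```

In this sketch each branch of a local tree yields a candidate item set (for example, `list(branches(trees[0]))`), and `global_support` can then check a candidate against every partition. How NPFP_Growth actually selects the longest item sets and routes the support counts between nodes is not specified in the abstract.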

Key words

frequent item sets; association rule; FP_Growth; Hadoop; Map/Reduce

Classification

Information Technology and Security Science

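Cite this article

Sun Hongyan, Ji Genlin. A New Parallel Algorithm for Mining Frequent Item Sets Based on FP_Growth [J]. Journal of Nanjing Normal University (Natural Science Edition), 2016, 39(4): 19-24, 6.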

Funding

National Natural Science Foundation of China (41471371).

Journal of Nanjing Normal University (Natural Science Edition)

Indexed in: OA, PKU Core, CSCD, CSTPCD

ISSN: 1001-4616
