Computer Engineering & Science, 2018, Vol. 40, Issue 1: 15-23, 9. DOI: 10.3969/j.issn.1007-130X.2018.01.003
PFPonCanTree: A parallel incremental frequent pattern mining algorithm based on MapReduce
Abstract
Frequent pattern mining is one of the most important data mining tasks. Traditional frequent pattern mining algorithms run in "batch" mode, that is, all the data are mined at once, so they cannot meet the needs of mining ever-growing big data. MapReduce is a popular parallel computing model that has been widely used in parallel data mining. In this paper, we migrate the traditional incremental frequent pattern mining algorithm CanTree to the MapReduce computing model, obtaining a parallel incremental frequent pattern mining algorithm. The experimental results show that the proposed algorithm achieves better load balancing and significantly improves execution efficiency.
Key words
data mining/frequent pattern mining/incremental mining/MapReduce/Hadoop/PFP
Classification
Information technology and security science
Cite this article
XIAO Wen, HU Juan, ZHOU Xiaofeng. PFPonCanTree: A parallel incremental frequent pattern mining algorithm based on MapReduce [J]. Computer Engineering & Science, 2018, 40(1): 15-23, 9.
Funding
Natural Science Research Project of Anhui Provincial Universities (KJ2016A623)