Computer Engineering, 2016, Vol. 42, Issue (12): 204-207, 216, 5. DOI: 10.3969/j.issn.1000-3428.2016.12.035
基于Nodeset的最大频繁项集挖掘算法
Maximal Frequent Itemset Mining Algorithm Based on Nodeset
Abstract
The major performance bottlenecks of most FP-Tree-based maximal frequent itemset mining algorithms are the recursive traversal and construction of conditional FP-Trees and the superset check. This paper therefore proposes MFIN, a maximal frequent itemset mining algorithm based on Nodeset. Instead of recursively traversing and constructing conditional FP-Trees, the algorithm uses a novel data structure called Nodeset to encode the nodes of a POC-Tree, and uses a set-enumeration tree to represent the search space. In addition, the paper proposes an early-stopping method to minimize the cost of Nodeset intersection, and adopts parent equivalence pruning and look-ahead pruning to reduce the search space. The superset check is made more efficient by improving the projection strategy based on the MFI-Tree. Experimental results on the mushroom, pumsb and webdocs datasets show that MFIN outperforms the classic FP-Tree-based FP-Max algorithm in running time and efficiency.
Keywords
maximal frequent itemset/association rule/pruning technique/prefix tree/superset check
Classification
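Information Technology and Security Science
Cite this article
LIN Chen, GU Junzhong. Maximal Frequent Itemset Mining Algorithm Based on Nodeset[J]. Computer Engineering, 2016, 42(12): 204-207, 216, 5.
Funding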
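Shanghai International Science and Technology Cooperation Project (13430710100)
Science and Technology Innovation Action Plan Project of the Science and Technology Commission of Shanghai Municipality (13511506201)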