Maximal Frequent Itemset Mining Algorithm Based on Nodeset

Lin Chen, Gu Junzhong

Computer Engineering, 2016, Vol. 42, Issue 12: 204-207, 216, 5. DOI: 10.3969/j.issn.1000-3428.2016.12.035

Author information

  • 1. Department of Computer Science and Technology, East China Normal University, Shanghai 200241

Abstract

The main performance bottlenecks of most FP-Tree-based maximal frequent itemset mining algorithms are the recursive traversal and construction of conditional FP-Trees and the superset check. This paper therefore proposes MFIN, a maximal frequent itemset mining algorithm based on Nodeset. Instead of recursively traversing and constructing conditional FP-Trees, the algorithm uses a novel data structure called Nodeset to encode the nodes of a POC-Tree, and a set-enumeration tree to represent the search space. The paper also proposes an early-stopping method to reduce the cost of intersecting Nodesets, and applies parent equivalence pruning and look-ahead pruning to shrink the search space. The superset check is made more efficient by improving the projection strategy based on the MFI-Tree. Experimental results on the mushroom, pumsb and webdocs datasets show that MFIN outperforms the classic FP-Tree-based FP-Max algorithm in running time and efficiency.
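To make the abstract's terms concrete, the sketch below shows what a maximal frequent itemset is and what a superset check does. It is a brute-force illustration on a toy dataset, not MFIN or FP-Max: it uses no Nodeset, POC-Tree or MFI-Tree, and the function names `frequent_itemsets` and `maximal_itemsets`, the transactions, and the support threshold are all assumptions made for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of every itemset with support >= min_support."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions containing the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent[cset] = support
                found = True
        if not found:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return frequent


def maximal_itemsets(frequent):
    """Naive superset check: keep itemsets with no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy transactions, chosen only for illustration.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
)]

frequent = frequent_itemsets(transactions, min_support=3)
maximal = sorted(sorted(s) for s in maximal_itemsets(frequent))
print(maximal)  # [['a', 'b'], ['a', 'c'], ['b', 'c']]
```

The naive superset check compares every frequent itemset with every other one, which is quadratic in the number of frequent itemsets. That cost is the kind of bottleneck the abstract says an improved projection strategy is meant to reduce.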

Key words

maximal frequent itemset / association rule / pruning technique / prefix tree / superset check

Classification

Information Technology and Security Science

Cite this article

Lin Chen, Gu Junzhong. Maximal Frequent Itemset Mining Algorithm Based on Nodeset[J]. Computer Engineering, 2016, 42(12): 204-207, 216, 5.

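Funding

Shanghai International Science and Technology Cooperation Project (13430710100)

Science and Technology Innovation Action Plan Project of the Shanghai Science and Technology Commission (13511506201)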

Computer Engineering

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN 1000-3428
