Journal of Huazhong University of Science and Technology (Natural Science Edition), 2013, Vol. 41, Issue 8: 55-60, 6.
Research on a method for mining strong jumping emerging patterns from high-dimensional data
An efficient method to mine strong jumping emerging patterns from high-dimensional datasets
Abstract
The contrast pattern tree (CP-tree) algorithm for mining strong jumping emerging patterns (SJEPs) works efficiently only on low-dimensional datasets. An efficient method for mining SJEPs in high-dimensional datasets was proposed. First, a dynamic contrast pattern tree (DCP-tree) structure was designed to store grown patterns and their crucial information. Then, an initial DCP-tree was constructed to store frequent items and their bit strings in the positive and negative classes. Finally, an algorithm based on the initial DCP-tree was developed for discovering SJEPs. Experiments were performed on real cancer datasets with high-dimensional gene data, and the proposed method was compared with the CP-tree and the improved CP-tree methods. The results show that the proposed method is substantially faster and can effectively handle higher-dimensional datasets. Within an acceptable amount of time, the method can mine more important SJEPs that are not discovered by the CP-tree and the improved CP-tree methods.
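To make the notions of an SJEP and of per-class bit strings concrete, the following is a minimal brute-force sketch in Python. It is not the DCP-tree algorithm of the paper, whose details the abstract does not give. It assumes the commonly used definition of an SJEP: an itemset with zero support in the negative class, support at or above a minimum threshold in the positive class, and no proper subset with the same two properties. The names `item_bitmaps`, `support_count`, `mine_sjeps`, `min_count`, and `max_len` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def item_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains the item."""
    bitmaps = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps


def support_count(itemset, bitmaps, n_transactions):
    """Count the transactions containing every item by AND-ing the item bit strings."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= bitmaps.get(item, 0)  # an unseen item matches no transaction
    return bin(mask).count("1")


def mine_sjeps(positive, negative, min_count=1, max_len=3):
    """Brute-force, level-wise enumeration (illustrative only, exponential in max_len)."""
    pos_bits = item_bitmaps(positive)
    neg_bits = item_bitmaps(negative)
    items = sorted(pos_bits)  # items absent from the positive class cannot qualify
    found = []
    for size in range(1, max_len + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Minimality: skip any candidate that properly contains a found SJEP.
            if any(s < cand for s in found):
                continue
            # Must be frequent enough in the positive class.
            if support_count(cand, pos_bits, len(positive)) < min_count:
                continue
            # Must never occur in the negative class (the "jumping" condition).
            if support_count(cand, neg_bits, len(negative)) == 0:
                found.append(cand)
    return found


positive = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}]
negative = [{"a", "c"}, {"b"}, {"c"}]
sjeps = mine_sjeps(positive, negative, min_count=2)
```

On this toy data, no single item is absent from the negative class, while `{a, b}` and `{b, c}` each occur twice in the positive class and never in the negative class, so both qualify. `{a, b, c}` is skipped because it contains `{a, b}`. Exhaustive enumeration of this kind grows exponentially with the number of items, which is the cost that tree-based methods such as those named in the abstract aim to avoid.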
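Key words
data mining / strong jumping emerging pattern / contrast pattern tree / frequent patterns / pattern pruning
Classification
Information technology and security science
Cite this article
Liu Quanzhong, Nie Yanming, Ning Jifeng. Research on a method for mining strong jumping emerging patterns from high-dimensional data [J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2013, 41(8): 55-60, 6.
Funding
National Natural Science Foundation of China (61003151)
Fundamental Research Funds for the Central Universities (QN2012033, QN2013053)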