Semi-Structured Data Model Extraction Based on Frequent Patterns
To overcome the complexity of storing semi-structured data, we propose a storage model for semi-structured data based on a dynamic tree. Schema extraction is performed on this model and incorporated into the Apriori algorithm: a minimum support threshold filters out unnecessary information, and the set of longest frequent paths is output as the extracted schema of the semi-structured data. Experimental results show that the algorithm handles branches and loops effectively at the same time and avoids infinite loops.
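The idea in the abstract (level-wise Apriori search over paths, pruning by minimum support, and keeping only the longest frequent paths) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the adjacency-list encoding of each document, the function names `label_paths` and `longest_frequent_paths`, the `max_len` cap, and the definition of support as the fraction of documents containing a path are all assumptions made for the example. The paper's dynamic-tree model is not reproduced here.

```python
def label_paths(graph, root, max_len):
    """Yield every label path (as a tuple) that starts at root.

    graph maps a node id to a list of (edge_label, child_id) pairs.
    An edge leading back to a node already on the current path is
    recorded but not expanded, so cyclic data cannot loop forever.
    """
    stack = [(root, (), frozenset([root]))]
    while stack:
        node, path, seen = stack.pop()
        for label, child in graph.get(node, []):
            new_path = path + (label,)
            yield new_path
            if child not in seen and len(new_path) < max_len:
                stack.append((child, new_path, seen | {child}))


def longest_frequent_paths(documents, min_support, max_len=10):
    """Return the maximal frequent label paths, in sorted order.

    documents is a list of (graph, root) pairs. A path is frequent when
    the fraction of documents containing it is at least min_support.
    """
    doc_paths = [set(label_paths(g, r, max_len)) for g, r in documents]
    n = len(documents)
    frequent = set()
    k = 1
    # Level 1 candidates: all single-label paths.
    candidates = {p for paths in doc_paths for p in paths if len(p) == 1}
    while candidates:
        level = {
            c for c in candidates
            if sum(c in paths for paths in doc_paths) / n >= min_support
        }
        frequent |= level
        k += 1
        # Apriori pruning: a path of length k is a candidate only if
        # its prefix of length k - 1 was frequent at the previous level.
        candidates = {
            p for paths in doc_paths for p in paths
            if len(p) == k and p[:-1] in level
        }
    # Keep only paths that are not a prefix of a longer frequent path.
    prefixes = {p[:-1] for p in frequent if len(p) > 1}
    return sorted(p for p in frequent if p not in prefixes)


# Three toy documents; the first contains a loop (p -> r via "friend").
docs = [
    ({"r": [("person", "p")], "p": [("name", "n"), ("friend", "r")]}, "r"),
    ({"r": [("person", "p")], "p": [("name", "n"), ("age", "a")]}, "r"),
    ({"r": [("person", "p"), ("note", "x")], "p": [("name", "n")]}, "r"),
]
schema = longest_frequent_paths(docs, min_support=0.6)
print(schema)  # [('person', 'name')]
```

With `min_support=0.6`, the labels `note`, `friend`, and `age` each occur in only one of the three documents and are filtered out, leaving `person/name` as the single longest frequent path. Tracking the nodes already on the current path is one simple way to guarantee termination on cyclic data; the paper's own mechanism may differ.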
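Li Ying; Zhang Xiaoxian; Sun Jiahui
College of Computer Science, Jilin Normal University, Siping 136000, Jilin, China; School of Software, Changchun Institute of Technology, Changchun 130012, China; Foundation Department, Aviation University of Air Force, Changchun 130022, China
Information Technology and Safety Science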
semi-structured data; data mining; frequent pattern mining; schema extraction
Journal of Jilin University (Information Science Edition), 2012 (5)
pp. 540-543 (4 pages)