计算机应用与软件 (Computer Applications and Software), 2018, Vol. 35, Issue 2: 91-95 (5 pages). DOI: 10.3969/j.issn.1000-386x.2018.02.016
DEEP WEB DATA EXTRACTION AND MINING IN GLOBAL MODE
Abstract
With the rapid development of modern information technology, the deep web, as a carrier of network data, contains a large amount of data, so studying deep web data extraction is very important. This paper proposes a method for data extraction and mining under a global schema. The method analyzes the attributes of actual instances, uses an improved Bayesian belief network algorithm to determine the corresponding labels, and constructs a dynamic global schema. It extracts and identifies the data in the result pages, then detects and removes useless information using density-based outlier detection. An algorithm for mining the frequent itemsets of Boolean association rules is then used to mine the data. Experimental results show that, compared with other methods, the proposed method extracts data accurately, quickly, and effectively, and that after mining the data items are strongly correlated and contain less invalid information.
Key words
Deep web/Global schema/Data extraction/Data mining
Classification
Information Technology and Security Science
Cite this article
姚晓鹏, 高圣兴, 薛君志, 陆敏超. Deep web data extraction and mining in global mode [J]. 计算机应用与软件 (Computer Applications and Software), 2018, 35(2): 91-95.
Funding
Shanghai Lingang Area Intelligent Manufacturing Industry Special Project (ZN2016020103).