Web Information Extraction Based on an Improved DSE Algorithm

Zhang Dongmei (张冬梅), Chen Zhao (陈钊), Chen Jian (陈剑)

Digital Technology and Application (数字技术与应用), Issue (3): 171-173, 3.

Information Extraction from Web Pages Based on Improved DSE Algorithm

Author information

1. School of Information, Beijing Forestry University, Beijing 100083

Abstract

With the rapid development of Internet technology, more and more people are beginning to recognize the importance of the Internet as a huge information source. The central problem in web information extraction is how to extract and organize information from the Internet automatically and effectively. Building on a study of the DSE algorithm and the RoadRunner system, we propose a new automated information extraction method that uses template pages together with their URLs to generate the template. For threshold selection, the method introduces the FDR approach from bioinformatics, which provides a theoretical basis for determining the threshold. Experimental results show that the improved method significantly increases the accuracy of the extraction results.

Key words

information extraction/template/DSE/RoadRunner/document object model

Classification

Information technology and safety science

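Cite this article

Zhang Dongmei, Chen Zhao, Chen Jian. Web information extraction based on an improved DSE algorithm [J]. Digital Technology and Application, 2012, (3): 171-173, 3.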

Funding

Supported by the Fundamental Research Funds for the Central Universities

Digital Technology and Application (数字技术与应用)

ISSN: 1007-9416
