Microcomputer & Its Applications (微型机与应用), 2011, Vol. 30, Issue 19: 67-69, 3.
Topic information extraction from Web pages based on tree comparison
(Original title: 基于树比较的Web页面主题信息抽取)
Abstract
To automatically extract information from Web pages on the Internet, which holds a massive amount of information, this paper presents an approach based on tree comparison. The approach compares the tree built from the target page with the trees built from similar pages in order to simplify the target page. Extraction rules are generated on this basis and then used to extract topic information from the target Web page. Experimental results show that the extraction method is precise and efficient.
Keywords
information extraction / similar pages / tree comparison / extraction rules
Classification
Information Technology and Security Science
Cite this article
Zhu Menglin, Li Guangyao, Zhou Yimin. Topic information extraction from Web pages based on tree comparison [J]. Microcomputer & Its Applications, 2011, 30(19): 67-69, 3.
Funding
International Cooperation Project of the Science and Technology Commission of Shanghai Municipality ()
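Illustrative sketch
The abstract gives only the outline of the method: build a tree for the target page, compare it with trees built from similar pages, and keep what remains. The Python sketch below shows one plausible reading of that idea and is not the authors' algorithm. The comparison criterion (a text node counts as template content if the same tag path and text also occur in a similar page) and all names (`Node`, `TreeBuilder`, `signatures`, `extract_topic`) are assumptions made for illustration. It does not generate extraction rules; the tag paths of the surviving nodes could serve as a crude starting point for them.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element of a simplified page tree (hypothetical structure)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a Node tree from HTML using only the standard library."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.text = (self.current.text + " " + text).strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def signatures(node, path=""):
    """Yield (tag path, text) for every node that carries text, in document order."""
    here = f"{path}/{node.tag}"
    if node.text:
        yield (here, node.text)
    for child in node.children:
        yield from signatures(child, here)


def extract_topic(target_html, similar_htmls):
    """Return text from the target page that no similar page repeats at the same path."""
    shared = set()
    for html in similar_htmls:
        shared.update(signatures(parse(html)))
    return [text for (path, text) in signatures(parse(target_html))
            if (path, text) not in shared]
```

Under this assumption, navigation bars and footers that recur across pages from the same site are dropped, while text that differs from page to page is kept as topic information. A real system would likely need a more tolerant comparison, for example one that allows small differences in template text.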