
Web Information Extraction from List Pages with Nested Data Records

Li Gui, Zhang Qi, Zheng Xinlu, Han Ziyang, Li Zhengyu

Journal of Zhengzhou University (Natural Science Edition), 2011, Vol. 43, Issue 2: 20-23, 4.

Web Information Extraction Based on List Pages of Nested Data

Li Gui (1), Zhang Qi (1), Zheng Xinlu (1), Han Ziyang (1), Li Zhengyu (1)

Author information

  • 1. Department of Computer Application Technology, Shenyang Jianzhu University, Shenyang 110168, Liaoning, China

Abstract

Building on existing algorithms for extracting nested data, this work incorporates a data mining algorithm. From the tag trees constructed for nested list pages, all data regions are identified and handled uniformly. A global pattern is then generated by matching all subtrees with a partial tree alignment algorithm, and all data records are extracted from it. Compared with the original algorithm, the new method improves efficiency while maintaining accuracy.
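The first two steps mentioned in the abstract (building a tag tree and locating data regions) can be illustrated with a minimal sketch. This is not the authors' algorithm: the names `Node`, `TagTreeBuilder`, `build_tree`, and `find_data_regions` are hypothetical, the input is assumed to be properly nested HTML, and a data region is approximated here as a run of adjacent sibling subtrees with exactly the same tag structure. Partial tree alignment, which tolerates records whose structures differ, is not implemented.

```python
from html.parser import HTMLParser


class Node:
    """One element of the tag tree (text content is ignored in this sketch)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

    def signature(self):
        """Structural fingerprint: the tag plus the signatures of its children."""
        return self.tag + "(" + ",".join(c.signature() for c in self.children) + ")"


class TagTreeBuilder(HTMLParser):
    """Builds a tag tree with the standard-library HTML parser."""

    # Elements that never have a closing tag, so we must not descend into them.
    VOID = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node

    def handle_endtag(self, tag):
        # Assumption: input is properly nested; mismatched end tags are ignored.
        if self.current.tag == tag and self.current.parent is not None:
            self.current = self.current.parent


def build_tree(html):
    """Parse an HTML string and return the root of its tag tree."""
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def find_data_regions(node, min_records=2):
    """Return (parent_tag, record_signature, count) for every run of adjacent
    children that share the same structure, anywhere in the tree."""
    regions = []
    kids = node.children
    i = 0
    while i < len(kids):
        j = i + 1
        while j < len(kids) and kids[j].signature() == kids[i].signature():
            j += 1
        if j - i >= min_records:
            regions.append((node.tag, kids[i].signature(), j - i))
        i = j
    # Recurse so that regions nested inside records are reported as well.
    for child in kids:
        regions.extend(find_data_regions(child, min_records))
    return regions


if __name__ == "__main__":
    page = "<ul><li><b>A</b><i>1</i></li><li><b>B</b><i>2</i></li></ul>"
    for region in find_data_regions(build_tree(page)):
        print(region)
```

Because the search recurses into every child, repeated structures inside a record are reported as separate, nested regions. Exact signature matching is the simplest possible similarity test; real list pages usually need an approximate comparison between subtrees.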

Keywords

nested data / list page / tag tree / data region / global pattern

Classification

Information Technology and Security Science

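Cite this article

Li Gui, Zhang Qi, Zheng Xinlu, Han Ziyang, Li Zhengyu. Web Information Extraction Based on List Pages of Nested Data [J]. Journal of Zhengzhou University (Natural Science Edition), 2011, 43(2): 20-23, 4.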

Funding

Supported by the Natural Science Foundation of Liaoning Province, Grant No. 20071004.

Journal of Zhengzhou University (Natural Science Edition)

OA / Peking University Core Journals / CSTPCD

ISSN 1671-6841
