Video Metadata Extraction System Based on DOM Tree

Computer Engineering, 2012, Vol. 38, Issue 8: 268-270, 3. DOI: 10.3969/j.issn.1000-3428.2012.08.085

Tang Chaowei ¹, Li Jun ¹, Miao Guangsheng ², Du Xinhui ¹

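Author information

1. College of Communication Engineering, Chongqing University, Chongqing 400044
2. High Performance Network Laboratory, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190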

Abstract

Most existing extraction methods focus on extracting the subject information block as a whole and pay no attention to the individual pieces of information inside it. To address this problem, a video metadata extraction system based on the DOM tree is proposed. By improving the link filter functions of Heritrix and the URL queue management strategy, and by making use of the node types in the Web page DOM tree, the system extracts Web page metadata at the level of individual items within the subject information block. Experimental results show that the average Web page precision ratio and the average extraction ratio of the system are 95.7% and 98.4%, respectively, both considerably higher than those of similar systems.
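The abstract does not give the extraction algorithm itself, so the following is only a minimal illustrative sketch of item-level extraction from a DOM tree, not the authors' implementation. It assumes a hypothetical page layout in which the subject information block is a container with class `video-info` and each item is an `li` holding a `label: value` pair; the names `Node`, `DomBuilder`, and `extract_metadata` are likewise invented for the example.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}  # elements with no children


class Node:
    """A DOM node: an element, or a text node when tag == '#text'."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class DomBuilder(HTMLParser):
    """Builds a simple DOM tree from HTML using the standard-library parser."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag and close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text nodes
            node = Node("#text", parent=self.current)
            node.text = data.strip()
            self.current.children.append(node)


def iter_nodes(node):
    """Depth-first traversal of the subtree rooted at node."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def text_of(node):
    """Concatenate all text nodes below node, in document order."""
    return " ".join(n.text for n in iter_nodes(node) if n.tag == "#text")


def extract_metadata(html, block_class="video-info"):
    """Return {label: value} for each item in the (assumed) information block."""
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    metadata = {}
    for node in iter_nodes(builder.root):
        classes = (node.attrs.get("class") or "").split()
        if block_class in classes:
            for item in iter_nodes(node):
                if item.tag == "li" and ":" in text_of(item):
                    label, value = text_of(item).split(":", 1)
                    metadata[label.strip()] = value.strip()
            break  # only the first matching block is used
    return metadata


SAMPLE_HTML = """
<div class="nav"><a href="/">Home</a></div>
<div class="video-info">
  <ul>
    <li><span>Title:</span> Sample Video</li>
    <li><span>Director:</span> <a href="/p/1">Jane Doe</a></li>
    <li><span>Year:</span> 2011</li>
  </ul>
</div>
"""

if __name__ == "__main__":
    print(extract_metadata(SAMPLE_HTML))
```

Distinguishing element nodes from text nodes is what lets the sketch skip the navigation block and still read each item separately; a real system would need site-specific rules to locate the block and to split labels from values.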

Key words

Web crawler/information collection/URL scheduling/incremental update/DOM tree

Classification

Information technology and security science

Cite this article

Tang Chaowei, Li Jun, Miao Guangsheng, Du Xinhui. Video Metadata Extraction System Based on DOM Tree [J]. Computer Engineering, 2012, 38(8): 268-270, 3.

Funding

National Science and Technology Major Project fund (2011ZX002-4, 2011ZX03002-005-02)

Chongqing University Graduate Education Reform Fund (2010JGXM015)

Computer Engineering

OA, CSCD, CSTPCD

ISSN: 1000-3428
