Computer Engineering (计算机工程), 2012, Vol. 38, Issue 8: 268-270, 3. DOI: 10.3969/j.issn.1000-3428.2012.08.085
Video Metadata Extraction System Based on DOM Tree
Abstract
Most existing extraction methods focus on extracting the subject information block and pay no attention to the individual pieces of information within it. To address this problem, a video metadata extraction system based on the DOM tree is proposed. The system improves the link filter functions and the URL queue management strategy of Heritrix and, using the node types of the Web page DOM tree, extracts the metadata of Web pages from individual subject information blocks. Experimental results show that the average precision ratio of Web pages and the average extraction ratio of the system are 95.7% and 98.4%, respectively, considerably higher than those of similar systems.
Keywords
Web crawler / information collection / URL scheduling / incremental update / DOM tree
Classification
Information Technology and Security Science
Cite this article
Tang Chaowei, Li Jun, Miao Guangsheng, Du Xinhui (唐朝伟, 李俊, 苗光胜, 杜欣慧). Video Metadata Extraction System Based on DOM Tree [J]. Computer Engineering, 2012, 38(8): 268-270, 3.
Funding
National Science and Technology Major Project of China (2011ZX002-4, 2011ZX03002-005-02)
Chongqing University Graduate Education Reform Fund (2010JGXM015)