Application Research of Computers (计算机应用研究), 2009, Vol. 26, Issue 8: 2952-2955, 4. DOI: 10.3969/j.issn.1001-3695.2009.08.044
数字图书馆主题搜索引擎的设计与实现
Design and implementation of search engine system for digital library
Abstract
This paper presents an overall system design for a topic-specific search engine for digital libraries. A preprocessing system selects high-quality seed sites, which yields the data that define each Web topic. Under the coordination of a system controller, the topic crawlers concurrently collect the Web resources recommended by the crawlers, then classify the text and identify the topic of each downloaded resource, which is stored in the Web topic resource database according to its discipline classification. Users can then search the topic resources through the index of the full information database. In view of the specific characteristics of digital libraries, the paper proposes a multi-threaded topic-specific crawler design and a novel URL pruning algorithm, EPR, and uses them to build a prototype topic-specific search engine for digital libraries. The final system was formed by extending the open-source Lucene platform. Experimental results show that the approach is effective, and that the EPR algorithm in particular is novel and valuable in real application environments.
Keywords
digital library / topic-specific / crawler / search engine / EPR algorithm
Classification
Information Technology and Security Science
Cite this article
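林其东, 陈传波, 郑乐丹, 张一曼. Design and implementation of search engine system for digital library [J]. Application Research of Computers (计算机应用研究), 2009, 26(8): 2952-2955, 4.
Funding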
Supported by the Wenzhou University Research Fund (2007L029)