Computer Engineering and Applications, 2011, Vol. 47, Issue 26: 146-150, 5. DOI: 10.3778/j.issn.1002-8331.2011.26.041
Topology-based document similarity search algorithm
Abstract
Searching a large document collection for similar documents quickly and efficiently is an important and time-consuming problem. Existing algorithms first find a candidate document set and then sort the candidates by a document relevance evaluation to identify the most relevant ones. A topology-based document similarity search algorithm, Hub-N, is put forward: the document similarity search problem is transformed into a graph search problem, and pruning techniques are applied to reduce the scope of scanned documents and significantly improve retrieval efficiency. Experiments show that the algorithm is effective and feasible.
Keywords
document topology / similarity search / similarity
Classification
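Information Technology and Security Science
Cite this article
杨艳, 朱戈, 范文彬. Topology-based document similarity search algorithm [J]. Computer Engineering and Applications, 2011, 47(26): 146-150, 5.
Funding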
National Natural Science Foundation of China (Grant No. 60973081)
General Program of Science and Technology Research of the Heilongjiang Provincial Department of Education (Nos. 11541263, 11551352)