Computer Engineering and Applications (计算机工程与应用), Issue (2): 129-132, 161, 5. DOI: 10.3778/j.issn.1002-8331.1203-0223
Scoring algorithm of similarity based on terms' position feature combination for Lucene
Abstract
The similarity scoring algorithm is one of the core components of Lucene. After analyzing Lucene's default similarity scoring algorithm, this paper proposes an improved algorithm that addresses a deficiency of the default: it considers only how often query terms occur, not where they occur. The improved algorithm combines a feature describing the positional relationship among query terms with Lucene's default similarity scoring algorithm. Experiments on a TREC dataset show that the improved algorithm increases the evaluation metrics MAP and P@n to a certain extent.
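The idea of combining a position feature with a frequency-based score can be illustrated with a minimal sketch. It is not the paper's algorithm, whose formula is not given in the abstract. The frequency part is loosely modeled on Lucene's classic TF-IDF terms (square root of term frequency, `1 + ln(numDocs / (docFreq + 1))` for IDF) and omits coord, query normalization, boosts, and length norms. The proximity bonus, the function names, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def positions(doc_tokens):
    """Map each token to the list of positions at which it occurs."""
    pos = {}
    for i, tok in enumerate(doc_tokens):
        pos.setdefault(tok, []).append(i)
    return pos


def frequency_score(query_terms, doc_tokens, corpus):
    """Frequency-only score: sum of sqrt(tf) * idf^2 over matching query terms.

    Simplified stand-in for a TF-IDF style default score (assumption).
    """
    n_docs = len(corpus)
    score = 0.0
    for term in set(query_terms):
        freq = doc_tokens.count(term)
        if freq == 0:
            continue
        doc_freq = sum(1 for d in corpus if term in d)
        idf = 1.0 + math.log(n_docs / (doc_freq + 1))
        score += math.sqrt(freq) * idf ** 2
    return score


def proximity_bonus(query_terms, doc_tokens):
    """Hypothetical position feature: for every pair of distinct query terms
    present in the document, add 1 / (smallest gap between their positions)."""
    pos = positions(doc_tokens)
    present = sorted(t for t in set(query_terms) if t in pos)
    bonus = 0.0
    for a, b in combinations(present, 2):
        # Distinct terms never share a position, so the gap is at least 1.
        min_gap = min(abs(i - j) for i in pos[a] for j in pos[b])
        bonus += 1.0 / min_gap
    return bonus


def combined_score(query_terms, doc_tokens, corpus, alpha=0.5):
    """Scale the frequency score by the proximity bonus (alpha is an assumed weight)."""
    base = frequency_score(query_terms, doc_tokens, corpus)
    return base * (1.0 + alpha * proximity_bonus(query_terms, doc_tokens))


if __name__ == "__main__":
    near = ["lucene", "scoring", "is", "fast", "and", "simple"]
    far = ["lucene", "is", "fast", "and", "simple", "scoring"]
    corpus = [near, far, ["other", "text"]]
    query = ["lucene", "scoring"]
    for name, doc in (("near", near), ("far", far)):
        print(name, frequency_score(query, doc, corpus), combined_score(query, doc, corpus))
```

In this toy corpus both documents contain each query term exactly once, so the frequency-only score cannot tell them apart, while the combined score ranks the document with adjacent query terms higher. This is the kind of distinction a position feature is meant to capture.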
Key words
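Lucene/similarity/full text search
Classification
Information Technology and Security Science
Cite this article
Bai Peifa, Wang Chengliang, Xu Ling. Scoring algorithm of similarity based on terms' position feature combination for Lucene [J]. Computer Engineering and Applications, 2014, (2): 129-132, 161, 5.
Funding
Fundamental Research Funds for the Central Universities, Special Research Project (No. CDJZR11090001).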