Journal of Nanjing University (Natural Sciences), 2014, Issue 4: 526-533 (8 pages). DOI: 10.13232/j.cnki.jnju.2014.04.017
A local query expansion method based on semantic dictionary
Abstract
Most traditional search engine models are based on keyword matching. Because of the large number of synonyms and polysemous words, the results returned by traditional search engines are likely to differ from what the user expected, especially when the query is short. To overcome this problem, this paper proposes a new query method based on local query expansion and a semantic dictionary. First, an initial document set is obtained by querying with the original keywords, and the documents most related to the original keywords are selected as extended-keyword-selection documents. Through co-occurrence analysis, words with large weights in these documents are selected as extended keyword candidates. Tongyici Cilin (Extended Edition) is used as the semantic dictionary. Based on the characteristics of its encoding scheme, a new measure of word similarity is defined and applied to select the extended keywords from the candidates. The original keywords and the extended keywords together form the final query words. To obtain better retrieval results, each word in the final query word set is assigned a weight based on its importance in the query and its similarity to the original keyword. The similarity between the final query word set and each initial document is calculated from these weights, and the final retrieval results are sorted by this similarity. Compared with other local query expansion methods, the proposed method avoids adding unrelated words to the query. To test its effectiveness, the method is applied to food information retrieval and compared with two baselines: a method that uses only the original keywords, and a method that uses extended words obtained only by co-occurrence analysis. The results show that, compared with the other two methods, the proposed method effectively improves the precision of the retrieval results.
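The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the similarity formula or the weighting scheme, so the sketch assumes a simple stand-in (the fraction of shared hierarchy levels in two Cilin-style codes), uses raw co-occurrence counts as candidate weights, and scores documents by summed word weights. The function names, the threshold, and the dictionary codes in the toy data are all hypothetical.

```python
from collections import Counter

# Cumulative lengths of the five levels in a Cilin-style code such as "Aa01A01=".
LEVEL_ENDS = (1, 2, 4, 5, 7)


def code_similarity(code_a, code_b):
    """Fraction of the five hierarchy levels shared by two codes (assumed measure)."""
    shared = 0
    for end in LEVEL_ENDS:
        if code_a[:end] != code_b[:end]:
            break
        shared += 1
    return shared / len(LEVEL_ENDS)


def word_similarity(word_a, word_b, cilin):
    """Best similarity over all codes of both words; 0.0 if either word is unknown."""
    return max(
        (code_similarity(a, b)
         for a in cilin.get(word_a, ())
         for b in cilin.get(word_b, ())),
        default=0.0,
    )


def cooccurrence_candidates(query, top_docs, n_candidates):
    """Count words that share a top-ranked document with at least one query word."""
    counts = Counter()
    for doc in top_docs:
        if any(q in doc for q in query):
            counts.update(w for w in set(doc) if w not in query)
    return [w for w, _ in counts.most_common(n_candidates)]


def expand_query(query, top_docs, cilin, n_candidates=5, threshold=0.6):
    """Return {word: weight}: original words get 1.0, kept candidates their similarity."""
    weights = {q: 1.0 for q in query}
    for cand in cooccurrence_candidates(query, top_docs, n_candidates):
        sim = max(word_similarity(cand, q, cilin) for q in query)
        if sim >= threshold:  # drop candidates unrelated to the original keywords
            weights[cand] = sim
    return weights


def rank(docs, weights):
    """Sort documents by the summed weight of the query words they contain."""
    def score(doc):
        return sum(w for word, w in weights.items() if word in doc)
    return sorted(docs, key=score, reverse=True)


# Toy data: the codes below are made up for illustration, not real dictionary entries.
cilin = {"apple": ["Bh07A01="], "pear": ["Bh07A02="], "phone": ["Bo21C03="]}
docs = [
    ["apple", "pear", "juice"],
    ["apple", "phone", "store"],
    ["pear", "orchard"],
]
weights = expand_query(["apple"], docs, cilin)
ranked = rank(docs, weights)
```

In this toy run, "phone" co-occurs with "apple" but shares only the top-level category in its code, so it falls below the threshold and is not added to the query. This mirrors the abstract's claim that filtering co-occurrence candidates through the semantic dictionary avoids adding unrelated words.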
Keywords
query expansion / semantic dictionary / co-occurrence analysis / word similarity
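Cite this article
Wu Qin (吴秦), Bai Yuzhao (白玉昭), Liang Jiuzhen (梁久祯). A local query expansion method based on semantic dictionary [J]. Journal of Nanjing University (Natural Sciences), 2014, (4): 526-533.
Funding
National Natural Science Foundation of China (61202312, 61170121); Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education (61202312, 61170121)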