Document Retrieval Algorithm Based on Query Intent Identification and Topic Modeling

Yan Rui, Li Shijun

Computer Engineering, 2018, Vol. 44, Issue (3): 189-194, 6. DOI: 10.3969/j.issn.1000-3428.2018.03.032

Yan Rui 1, Li Shijun 1

Author information

  • 1. School of Computer Science, Wuhan University, Wuhan 430072

Abstract

Conventional search engines return documents that merely contain the keywords in the query, without considering the user's true underlying intent. To address this problem, this paper treats document retrieval as a personalized recommendation problem and proposes a personalized retrieval algorithm based on query intent identification and topic modeling. First, a Latent Dirichlet Allocation (LDA) topic model is built from the user's historical search data. When a new query arrives, its latent topic is identified using the topic model trained on that user's search history, and documents are then recommended according to their topic correlation with the query. Finally, the KL distance between the query and the document set is calculated, and the documents returned to the user are sorted by this distance. Experimental results show that the proposed algorithm outperforms both the method based on collaborative similarity calculation and the method based on user interest clustering in terms of efficiency.
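The final ranking step described in the abstract can be illustrated with a minimal sketch. It assumes that the LDA stage has already produced a topic distribution for the query and for each candidate document; the abstract does not specify the direction of the KL distance or any smoothing, so the choice of KL(query || document) and the `eps` constant are assumptions, and the names `kl_divergence`, `rank_documents`, and the sample distributions are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics.

    eps is an assumed smoothing constant that avoids log(0) and
    division by zero; it is not taken from the paper.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    # Terms with p_i == 0 contribute nothing to the sum.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def rank_documents(query_topics, doc_topics):
    """Return document ids sorted by increasing KL(query || document)."""
    scored = [
        (kl_divergence(query_topics, dist), doc_id)
        for doc_id, dist in doc_topics.items()
    ]
    scored.sort()  # smallest distance first
    return [doc_id for _, doc_id in scored]


# Hypothetical topic distributions over three topics, standing in for
# the output of an LDA model trained on the user's search history.
query = [0.7, 0.2, 0.1]
docs = {
    "doc_a": [0.1, 0.1, 0.8],
    "doc_b": [0.6, 0.3, 0.1],
    "doc_c": [0.3, 0.4, 0.3],
}

ranking = rank_documents(query, docs)
print(ranking)
```

Because KL divergence is asymmetric, swapping the arguments or using a symmetrized variant could change the ordering; the paper should be consulted for the exact definition it uses.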

Key words

search engine/query intent/document retrieval/personalized recommendation/topic model/Latent Dirichlet Allocation (LDA)/KL distance

Classification

Information Technology and Security Science

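Cite this article

Yan Rui, Li Shijun. Document Retrieval Algorithm Based on Query Intent Identification and Topic Modeling[J]. Computer Engineering, 2018, 44(3): 189-194, 6.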

Funding

National Natural Science Foundation of China (61272109)

Young Scientists Fund of the National Natural Science Foundation of China (61502350)

Computer Engineering

OA / Peking University Core Journals / CSCD / CSTPCD

ISSN 1000-3428
