
Learning to Rank Bilingual Documents Based on Document Similarity

Huang Jian (黄健)

Computer & Digital Engineering (计算机与数字工程), 2017, Vol. 45, Issue 10: 1986-1989,2017,5. DOI: 10.3969/j.issn.1672-9722.2017.10.020

Learning to Rank Bilingual Document Based on Document Similarity

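Huang Jian¹

Author information

  • 1. Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming 650500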

Abstract

This paper addresses the problem of learning to rank bilingual documents. Ranking is an essential part of information retrieval. Ranking documents in a monolingual context using machine learning has been studied extensively, but learning to rank bilingual documents has received little attention. Because bilingual documents are written in different languages, they cannot be processed directly with existing monolingual methods. This paper proposes a bilingual learning-to-rank model that uses a monolingual model as a base component to assign ranking scores to documents in a monolingual context. A word embedding approach is introduced to measure document similarity in the bilingual context, which establishes a relationship between documents in the two languages. The query is simply translated into the foreign language at the phrase level to filter foreign-language documents. Experiments show that the model is effective at ranking bilingual documents in both English-Chinese and English-Vietnamese contexts.

Key words

learning to rank/information retrieval/document similarity/query translation/bilingual context

Classification

Information Technology and Security Science

Cite this article

Huang Jian. Learning to Rank Bilingual Document Based on Document Similarity [J]. Computer & Digital Engineering, 2017, 45(10): 1986-1989,2017,5.

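Funding

National Natural Science Foundation of China (Nos. 61175068, 61472168)

Yunnan Province Key Project Science Foundation (No. 2013FA130)

Supported by the Science and Technology Innovation Talent Project of the Ministry of Science and Technology (No. 2014HE001).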

Computer & Digital Engineering (计算机与数字工程)

OA, CSTPCD

ISSN 1672-9722
