Research on Multi-Feature Sentence Similarity Computing Method with Word Embedding

Li Feng (李峰), Hou Jiaying (侯加英), Zeng Rongren (曾荣仁), Ling Chen (凌晨)

Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2017, Vol. 11, Issue 4: 608-618 (11 pages). DOI: 10.3778/j.issn.1673-9418.1604029


Li Feng (李峰)¹, Hou Jiaying (侯加英)², Zeng Rongren (曾荣仁)³, Ling Chen (凌晨)¹

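Author information

  • 1. Logistics Science Research Institute of the Chinese People's Liberation Army, Beijing 100166
  • 2. School of Computer Science, Beihang University, Beijing 100191
  • 3. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504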

Abstract

After summarizing existing sentence similarity computing methods, this paper trains a word vector space model on 34,000 texts from People's Daily for semantic similarity computing. Based on the trained word vector model, it designs a multi-feature sentence similarity computing method that considers both word features and sentence structure features. For word features, the method accounts for the number of overlapping words and for word continuity, and applies the word vector model to calculate the semantic similarity of non-overlapping words. For sentence structure, it considers both the order of overlapping words and the conformity of sentence lengths. Finally, the paper designs and implements four different sentence similarity computing methods and develops an experimental system. The experimental results show that the proposed method achieves satisfactory results, and that combining and optimizing word features and sentence structure features improves the accuracy of sentence similarity computing.
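The abstract names the features but gives no formulas, so the following is only a minimal sketch of how such a multi-feature score might be assembled, not the authors' method. Everything specific here is an assumption made for illustration: the function names, the per-feature formulas, the weights, and the `TOY_VECTORS` table, which stands in for a trained Word2vec model.

```python
import math
from itertools import combinations

# Toy embeddings standing in for a trained Word2vec model (hypothetical values).
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def word_score(s1, s2, vectors):
    """Overlapping words count 1; other words count their best embedding match."""
    def one_way(src, dst):
        shared = set(src) & set(dst)
        rest = [w for w in dst if w not in shared and w in vectors]
        total = 0.0
        for w in src:
            if w in shared:
                total += 1.0
            elif w in vectors and rest:
                total += max(0.0, max(cosine(vectors[w], vectors[r]) for r in rest))
        return total / len(src)
    return (one_way(s1, s2) + one_way(s2, s1)) / 2

def continuity_score(s1, s2):
    """Longest run of consecutive shared words, relative to the shorter sentence."""
    best, prev = 0, [0] * (len(s2) + 1)
    for w1 in s1:
        cur = [0] * (len(s2) + 1)
        for j, w2 in enumerate(s2, start=1):
            if w1 == w2:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best / min(len(s1), len(s2))

def order_score(s1, s2):
    """Fraction of shared-word pairs that appear in the same order in both sentences."""
    shared = [w for w in dict.fromkeys(s1) if w in s2]
    if len(shared) < 2:
        return float(len(shared))  # 0.0 with no shared word, 1.0 with exactly one
    pairs = list(combinations(shared, 2))
    same = sum(1 for a, b in pairs if s2.index(a) < s2.index(b))
    return same / len(pairs)

def length_score(s1, s2):
    """1.0 for equal lengths, shrinking as the lengths diverge."""
    return 1 - abs(len(s1) - len(s2)) / (len(s1) + len(s2))

def sentence_similarity(s1, s2, vectors, weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted sum of the four features; the weights are illustrative assumptions."""
    if not s1 or not s2:
        return 0.0
    features = (
        word_score(s1, s2, vectors),
        continuity_score(s1, s2),
        order_score(s1, s2),
        length_score(s1, s2),
    )
    return sum(w * f for w, f in zip(weights, features))

a = ["the", "cat", "sat", "down"]
b = ["the", "dog", "sat", "down"]
c = ["a", "car", "drove", "off"]
print(sentence_similarity(a, b, TOY_VECTORS))  # near-paraphrase pair
print(sentence_similarity(a, c, TOY_VECTORS))  # unrelated pair
```

In this sketch the overlap count and the embedding-based similarity of non-overlapping words are folded into one word-level score, while continuity, word order, and length conformity are kept as separate structural terms. In practice the toy table would be replaced by lookups into a trained model and the sentences would first be segmented into words.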

Key words

word embedding/sentence similarity/Word2vec/algorithm design

Classification

Information Technology and Security Science

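Cite this article

Li Feng, Hou Jiaying, Zeng Rongren, Ling Chen. Research on Multi-Feature Sentence Similarity Computing Method with Word Embedding[J]. Journal of Frontiers of Computer Science and Technology, 2017, 11(4): 608-618.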

Funding

The National Natural Science Foundation of China under Grant No. 61370126

The National High Technology Research and Development Program of China (863 Program) under Grant No. 2015AA016004

The National Social Science Foundation of China under Grant No. 15GJ003-154

The Fund of the State Key Laboratory of Software Development Environment under Grant No. SKLSDE-2015ZX-16

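Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

Open access; indexed in the Peking University Core Journals list (北大核心), CSCD, and CSTPCD

ISSN 1673-9418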
