Journal of Zhengzhou University (Natural Science Edition), 2018, Vol. 50, Issue 2: 86-91, 6. DOI: 10.13705/j.issn.1671-6841.2017210
基于自动编码器的句子语义特征提取及相似度计算
Semantic Feature Extraction and Similarity Computation of Sentences Based on Auto-encoder
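MA Jianhong (马建红) 1, YANG Hao (杨浩) 1, YAO Shuang (姚爽) 1
Author information
- 1. School of Computer Science and Software, Hebei University of Technology, Tianjin 300401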
Abstract
Sentence feature extraction and similarity calculation are two important problems in natural language processing. Existing similarity calculation methods for Chinese sentences do not take sentence meaning fully into account, so their results are not accurate enough. This paper studies the extraction of sentence semantic features with a deep auto-encoder and a similarity calculation method built on those features. First, each sentence is represented as a high-dimensional sparse vector. The auto-encoder then learns the non-linear features of the sentence without supervision, transforming the high-dimensional sparse vector into a low-dimensional vector. After this repeated dimensionality reduction, the final sentence features are used to calculate sentence similarity. The whole process is learned end to end, which avoids the need to build a stop-word list or perform word segmentation. The experimental results indicate that the proposed method not only increases the accuracy of similarity calculation but also has a time complexity of O(n).
Keywords
auto-encoder/unsupervised feature learning/semantic feature extraction/similarity calculation
Classification
Information Technology and Security Science
Cite this article
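MA Jianhong, YANG Hao, YAO Shuang. Semantic Feature Extraction and Similarity Computation of Sentences Based on Auto-encoder[J]. Journal of Zhengzhou University (Natural Science Edition), 2018, 50(2): 86-91, 6.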
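
The pipeline outlined in the abstract (sparse sentence vector, unsupervised auto-encoder reduction, similarity on the reduced features) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-characters input, the tanh encoder with a linear decoder, the greedy layer-wise training, the layer sizes, the cosine measure, and every function name below are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def char_vectors(sentences):
    """Assumed input encoding: bag-of-characters counts, one sparse row per
    sentence, so no word segmentation or stop-word list is needed."""
    vocab = sorted({ch for s in sentences for ch in s})
    index = {ch: i for i, ch in enumerate(vocab)}
    X = np.zeros((len(sentences), len(vocab)))
    for row, s in enumerate(sentences):
        for ch in s:
            X[row, index[ch]] += 1.0
    # Scale each row to unit length so sentence length does not dominate.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)

def train_autoencoder(X, hidden, epochs=500, lr=0.1, seed=0):
    """Train one auto-encoder layer (tanh encoder, linear decoder) by plain
    gradient descent on the mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # encode
        R = H @ W2 + b2                   # decode (reconstruction)
        err = (R - X) / n                 # gradient of the loss w.r.t. R
        gW2, gb2 = H.T @ err, err.sum(axis=0)
        dH = (err @ W2.T) * (1.0 - H ** 2)
        gW1, gb1 = X.T @ dH, dH.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1                         # only the encoder half is kept

def train_stack(X, sizes):
    """Greedy layer-wise training: each layer compresses the previous codes."""
    layers, H = [], X
    for k, hidden in enumerate(sizes):
        W, b = train_autoencoder(H, hidden, seed=k)
        layers.append((W, b))
        H = np.tanh(H @ W + b)
    return layers

def encode(X, layers):
    """Pass sentence vectors through every trained encoder layer."""
    for W, b in layers:
        X = np.tanh(X @ W + b)
    return X

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the paper's data.
    sentences = ["今天天气很好", "今天天气不错", "我喜欢吃苹果"]
    X = char_vectors(sentences)
    layers = train_stack(X, [8, 4])       # assumed layer sizes
    Z = encode(X, layers)
    print(cosine(Z[0], Z[1]), cosine(Z[0], Z[2]))
```

Once the encoder is trained, comparing two sentences needs only one forward pass per sentence and one dot product over fixed-size vectors. Whether that corresponds to the O(n) bound reported in the abstract is not stated on this page.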