Computer Engineering and Applications, 2018, Vol. 54, Issue (12): 116-120, 205, 6. DOI: 10.3778/j.issn.1002-8331.1612-0277
Chinese short text similarity computation based on hybrid strategy (基于混合策略的中文短文本相似度计算)
Abstract
To improve the accuracy of Chinese short text similarity computation, this paper proposes a new similarity computation method for Chinese short text based on a hybrid strategy. First, using the semantic distance between words, it builds a clustering binary tree for the short text by hierarchical clustering, improves the traditional Vector Space Model (VSM), and computes a keyword-weight text similarity. Next, it improves the traditional syntactic-semantic model and computes the semantic similarity of the short texts by extracting the main components of each sentence. Finally, the two similarities are weighted to obtain the final text similarity. Experimental results show that the proposed method is more effective and agrees more closely with human judgment.
Keywords
short text similarity/keyword weight/hierarchical clustering/binary tree/main components
Classification
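Information technology and security science
Cite this article
Song Dongyun (宋冬云), Zheng Jin (郑瑾), Zhang Zuping (张祖平). Chinese short text similarity computation based on hybrid strategy[J]. Computer Engineering and Applications, 2018, 54(12): 116-120, 205, 6.
Funding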
National Natural Science Foundation of China (No. 61379109).
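
Illustrative sketch
The abstract's final step, weighting a keyword-based similarity against a sentence-level semantic similarity, can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not give the formulas, so the cosine measure over keyword weights, the Jaccard overlap used as a stand-in for the semantic score, the function names, and the mixing weight `alpha` are all assumptions.

```python
import math


def cosine_similarity(weights_a, weights_b):
    """Cosine similarity between two sparse keyword-weight vectors (dicts).

    Assumption: keyword weights are already computed; the paper derives them
    from a clustering binary tree, which is not reproduced here.
    """
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[k] * weights_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in weights_a.values()))
    norm_b = math.sqrt(sum(v * v for v in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(items_a, items_b):
    """Set overlap, used here as a hypothetical stand-in for the semantic
    similarity of the extracted main sentence components."""
    a, b = set(items_a), set(items_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def hybrid_similarity(keyword_sim, semantic_sim, alpha=0.5):
    """Weighted combination of the two scores; alpha is an assumed parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * keyword_sim + (1.0 - alpha) * semantic_sim


if __name__ == "__main__":
    # Hypothetical keyword weights and main components for two short texts.
    kw_a = {"price": 0.7, "phone": 0.3}
    kw_b = {"price": 0.5, "laptop": 0.5}
    main_a = ["user", "ask", "price"]
    main_b = ["user", "check", "price"]

    keyword_sim = cosine_similarity(kw_a, kw_b)
    semantic_sim = jaccard_similarity(main_a, main_b)
    print(hybrid_similarity(keyword_sim, semantic_sim, alpha=0.6))
```

With `alpha` close to 1 the score follows the keyword-weight similarity; close to 0 it follows the sentence-level score. How the paper chooses this balance is not stated in the abstract.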