Computer Engineering, 2019, Vol. 45, Issue (3): 273-277, 5. DOI: 10.19678/j.issn.1000-3428.0051615
Extraction of Chinese Text Summarization Based on Improved TextRank Algorithm
Abstract
This paper proposes DK-TextRank, a summary extraction algorithm for Chinese text that combines the Doc2Vec model, the K-means algorithm, and the TextRank algorithm to improve summarization accuracy. After vectorizing the text with the Doc2Vec model, DK-TextRank clusters similar text with an improved K-means algorithm, then applies the TextRank algorithm with weight influence factors within each cluster to rank and extract topic sentences, from which the summary is generated. Experimental results show that, when the summary contains 7 sentences, DK-TextRank achieves an F value of 79.36% and extracts higher-quality summaries than the traditional TF-IDF and TextRank algorithms.
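The cluster-then-rank idea described in the abstract can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: sentence vectors and cluster labels are taken as inputs (standing in for the Doc2Vec and improved K-means steps), edge weights are plain cosine similarity, and the paper's weight influence factors are not modeled because the abstract does not define them. The names `cosine`, `textrank`, and `summarize`, and the parameters `damping`, `iterations`, and `per_cluster`, are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def textrank(vectors, damping=0.85, iterations=50):
    """Weighted TextRank scores for one group of sentence vectors.

    Assumption: edges are weighted by non-negative cosine similarity;
    the paper's weight influence factors are NOT included here.
    """
    n = len(vectors)
    if n == 1:
        return [1.0]
    # Similarity graph without self-loops.
    w = [[max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight per sentence
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(sentences, vectors, labels, per_cluster=1):
    """Pick the top-ranked sentence(s) of each cluster, returned in document order.

    `labels` are precomputed cluster ids (a stand-in for the K-means step).
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    chosen = []
    for members in clusters.values():
        scores = textrank([vectors[i] for i in members])
        # Highest score first; ties broken by earlier position in the document.
        ranked = sorted(zip(scores, members), key=lambda p: (-p[0], p[1]))
        chosen.extend(idx for _, idx in ranked[:per_cluster])
    return [sentences[i] for i in sorted(chosen)]
```

For instance, if one cluster holds three sentences with vectors `[1, 0]`, `[1, 1]`, and `[0, 1]`, the middle sentence is similar to both of the others and therefore receives the highest rank in that cluster.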
Keywords
Doc2Vec model/K-means algorithm/TextRank algorithm/summarization extraction/weight influence factor
Classification
Information Technology and Security Science
Cite this article
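XU Xintao, CHAI Xiaoli, XIE Bin, SHEN Chen, WANG Jingping. Extraction of Chinese Text Summarization Based on Improved TextRank Algorithm[J]. Computer Engineering, 2019, 45(3): 273-277, 5.
Funding
National ministry and commission fund.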