Computer Applications and Software, 2018, Vol. 35, Issue 3: 54-60, 7. DOI: 10.3969/j.issn.1000-386x.2018.03.010
DESIGN AND IMPLEMENTATION OF A KEY PHRASE EXTRACTION SCHEME FOR TEXT BASED ON LDA AND TEXTRANK
Abstract
To extract key phrases that better reflect the topic of a text, and to address the lack of topic information in key phrase extraction, a key phrase extraction method based on LDA and TextRank was proposed. The method first used the LDA model to mine the topics of the texts in the corpus and constructed an undirected weighted word graph for the target text that integrated topic coverage and word co-occurrence. Then, by introducing an influence factor for topic words, the probability of random jumps between nodes was modified according to the topic relevance of each word, and the TextRank algorithm was run on the word graph to rank the candidate keywords. Finally, a bootstrapping-style procedure was used to iteratively generate more expressive key phrases. Experimental results showed that the method could effectively extract key phrases that were well formed and covered the topic information of the text.
Keywords
Key phrase extraction / LDA model / Topic mining / TextRank / Topic influence
Classification
Information Technology and Security Science
Cite this article
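LANG Dongdong, LIU Chenchen, FENG Xupeng, LIU Lijun, HUANG Qingsong. Design and implementation of a key phrase extraction scheme for text based on LDA and TextRank [J]. Computer Applications and Software, 2018, 35(3): 54-60, 7.
Funding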
National Natural Science Foundation of China (81360230).
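The ranking step outlined in the abstract (TextRank on a weighted word graph, with random jumps biased toward topic-relevant words) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `window` and `damping` parameters, the plain co-occurrence edge weights, and the use of per-word topic weights (for example, LDA topic-word probabilities) as the jump distribution are all assumptions made for illustration. The paper's actual edge weighting and influence factor are not given in the abstract.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=3):
    """Build an undirected weighted word graph.

    Edge weight = number of times two distinct words appear within
    `window` consecutive tokens (assumed weighting, for illustration).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        # Pair each word with the next (window - 1) tokens.
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    # Return plain dicts so later lookups cannot create new entries.
    return {w: dict(nbrs) for w, nbrs in graph.items()}


def topic_biased_textrank(graph, topic_weight, damping=0.85, iterations=50):
    """Rank words by TextRank with a topic-biased jump distribution.

    `topic_weight` maps word -> non-negative topic relevance (hypothetical
    input, e.g. derived from an LDA model). Words without an entry get a
    tiny floor so the jump distribution is always well defined.
    """
    nodes = list(graph)
    raw = {w: topic_weight.get(w, 0.0) + 1e-6 for w in nodes}
    total = sum(raw.values())
    jump = {w: raw[w] / total for w in nodes}
    out_weight = {w: sum(graph[w].values()) for w in nodes}

    score = dict(jump)
    for _ in range(iterations):
        new_score = {}
        for w in nodes:
            # Score flowing in from each neighbour, split by edge weight.
            incoming = sum(
                score[v] * graph[v][w] / out_weight[v] for v in graph[w]
            )
            new_score[w] = (1 - damping) * jump[w] + damping * incoming
        score = new_score
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)
```

Passing an empty `topic_weight` reduces this to ordinary TextRank with a uniform jump distribution, which makes it easy to compare rankings with and without topic bias.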