Design and Implementation of a Key Phrase Extraction Scheme for Text Based on LDA and TextRank

Computer Applications and Software, 2018, Vol. 35, Issue 3: 54-60, 7. DOI: 10.3969/j.issn.1000-386x.2018.03.010

DESIGN AND IMPLEMENTATION OF A KEY PHRASE EXTRACTION SCHEME FOR TEXT BASED ON LDA AND TEXTRANK

郎冬冬¹, 刘晨晨¹, 冯旭鹏¹, 刘利军¹, 黄青松¹

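Author information

  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500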

Abstract

To extract keywords that better reflect the topic of a text, and to address the lack of topic information in key phrase extraction, a key phrase extraction method based on LDA and TextRank is proposed. The method first uses the LDA model to mine topics from the texts in the corpus and constructs, for the target text, an undirected weighted word graph that integrates topic coverage and word co-occurrence. Then, by introducing influence factors for topic nouns, the probability of a random jump between nodes is modified according to the topic relevance of words, and the TextRank algorithm is applied to the word graph to rank the candidate keywords. Finally, the idea of the bootstrapping algorithm is used to iteratively generate more expressive key phrases. Experimental results show that the method can effectively extract key phrases that are expressive and cover the topic information of the text.
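The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the edge-weight or jump-probability formulas, so the sketch assumes that edge weights are plain co-occurrence counts and that the random-jump distribution is proportional to a per-word topic relevance score, such as one derived from an LDA model. The function names, the `window` and `damping` parameters, and the sample relevance values are all hypothetical.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected weighted word graph.

    Assumption: edge weight = number of co-occurrences within `window`
    following tokens. The paper also folds topic coverage into the
    weights; that part is omitted here.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    return graph


def topic_biased_textrank(graph, topic_relevance, damping=0.85, iterations=50):
    """TextRank with a non-uniform random-jump distribution.

    Assumption: the jump probability of a word is proportional to its
    topic relevance (hypothetical weighting, not the paper's formula).
    """
    nodes = list(graph)
    # A small floor keeps words without a relevance score reachable.
    raw = {w: topic_relevance.get(w, 0.0) + 1e-6 for w in nodes}
    total = sum(raw.values())
    jump = {w: raw[w] / total for w in nodes}
    out_weight = {w: sum(graph[w].values()) for w in nodes}

    scores = {w: 1.0 / len(nodes) for w in nodes}
    for _ in range(iterations):
        new_scores = {}
        for w in nodes:
            # Score flowing in from each neighbour, split by edge weight.
            incoming = sum(
                scores[v] * graph[v][w] / out_weight[v] for v in graph[w]
            )
            new_scores[w] = (1.0 - damping) * jump[w] + damping * incoming
        scores = new_scores
    return scores


if __name__ == "__main__":
    tokens = ["topic", "model", "graph", "topic", "rank", "graph", "model"]
    relevance = {"topic": 0.6, "graph": 0.3}  # hypothetical LDA-derived scores
    graph = build_cooccurrence_graph(tokens, window=2)
    ranked = sorted(
        topic_biased_textrank(graph, relevance).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    for word, score in ranked:
        print(f"{word}\t{score:.4f}")
```

Passing an empty relevance dictionary reduces the jump distribution to uniform, which gives standard TextRank on the same graph and is a convenient baseline for seeing how much the topic bias changes the ranking.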

Key words

Key phrase extraction / LDA model / Topic mining / TextRank / Topic influence

Classification

Information Technology and Security Science

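Cite this article

郎冬冬, 刘晨晨, 冯旭鹏, 刘利军, 黄青松. Design and implementation of a key phrase extraction scheme for text based on LDA and TextRank [J]. Computer Applications and Software, 2018, 35(3): 54-60, 7.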

Funding

National Natural Science Foundation of China (81360230).

Computer Applications and Software

OA / Peking University Core Journal / CSTPCD

ISSN 1000-386X