Topic Mining of Scientific Literature Based on Latent Semantic Indexing

Liu Kan, Zhu Fangfang

Computer Engineering and Applications, 2014, Issue (24): 113-117, 150, 6. DOI: 10.3778/j.issn.1002-8331.1305-0146

Research of topic mining for scientific papers based on LSI

Liu Kan 1, Zhu Fangfang 1

Author information

  • 1. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073

Abstract

This paper proposes a topic mining method for scientific papers based on an improved Latent Semantic Indexing (LSI) algorithm. Each paper is first preprocessed: the text is converted, and non-alphabetic characters and stop words are removed. A term-document matrix is then built from the weights of all remaining words. A modified LSI algorithm reduces the dimension of this matrix and yields a new topic-document matrix, and the five highest-weighted themes are taken as each paper's topics. The method uses the Frobenius norm to regulate the matrix as its dimension is reduced, so that the themes of scientific papers can be mined quickly and accurately.
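
The pipeline described in the abstract can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not specify the weighting scheme or the LSI modification, so the sketch assumes TF-IDF weights, reads "conversion" as lowercasing, and uses a plain truncated SVD whose rank k is the smallest rank that retains a chosen share (the hypothetical `keep` parameter) of the squared Frobenius norm. The stop-word list, sample documents, and all function names are made up for the example.

```python
import re
import numpy as np

# Hypothetical stop-word list and toy corpus, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "for", "is", "in", "to", "on", "are"}

def preprocess(text):
    """Lowercase, drop non-alphabetic characters, remove stop words."""
    tokens = re.sub(r"[^a-z]+", " ", text.lower()).split()
    return [t for t in tokens if t not in STOP_WORDS]

def term_document_matrix(docs):
    """Build a TF-IDF weighted matrix (terms x documents). Assumed weighting."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            counts[index[t], j] += 1
    df = np.count_nonzero(counts, axis=1)          # document frequency per term
    idf = np.log((1 + len(docs)) / (1 + df)) + 1   # smoothed IDF
    return counts * idf[:, None], vocab

def choose_rank(singular_values, keep=0.9):
    """Smallest k whose rank-k approximation keeps `keep` of ||A||_F^2.

    Uses the identity ||A||_F^2 = sum of squared singular values.
    """
    sq = singular_values ** 2
    energy = np.cumsum(sq) / np.sum(sq)
    k = int(np.searchsorted(energy, keep)) + 1
    return min(k, len(singular_values))

def lsi_topics(docs, keep=0.9, top_n=5):
    """Return the topic-document matrix, top topic indices per document, and k."""
    A, _ = term_document_matrix(docs)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = choose_rank(s, keep)
    topic_doc = s[:k, None] * Vt[:k]               # topics x documents
    top = [np.argsort(-np.abs(topic_doc[:, j]))[:top_n].tolist()
           for j in range(len(docs))]
    return topic_doc, top, k

docs = [
    "Latent semantic indexing reduces the dimension of a term-document matrix.",
    "Topic mining extracts themes from scientific papers.",
    "Singular value decomposition factors a matrix into singular vectors.",
    "Stop words are removed before the term weights are computed.",
]
topic_doc, top, k = lsi_topics(docs)
```

Here `topic_doc[:, j]` holds the topic weights of document `j`, and `top[j]` lists up to five topic indices ordered by absolute weight. Choosing k from the cumulative squared singular values bounds the squared Frobenius-norm error of the rank-k approximation by `1 - keep` of the original.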

Key words

latent semantic indexing/topic modeling/scientific documents

Classification

Information technology and safety science

Cite this article

Liu Kan, Zhu Fangfang. Research of topic mining for scientific papers based on LSI [J]. Computer Engineering and Applications, 2014, (24): 113-117, 150, 6.

Funding

National Natural Science Foundation of China (No. 71203164).

Computer Engineering and Applications

OA, PKU Core Journal, CSCD, CSTPCD

ISSN 1002-8331
