Multi-document summary using LDA and spectral clustering

Fu Ling (付玲)¹, Zhang Hui (张晖)¹

Computer Engineering and Applications (计算机工程与应用), 2013, Issue (16): 142-145, 154, 5. DOI: 10.3778/j.issn.1002-8331.1202-0458

Author information

  • 1. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, Sichuan 621000

Abstract

Automatic summarization aims to compress lengthy documents into a few short paragraphs, giving users comprehensive yet concise information and improving the efficiency and accuracy of information access. This paper proposes a summarization method based on Latent Dirichlet Allocation (LDA). Gibbs sampling is used to estimate the probability of words given topics and of topics given sentences, and these LDA parameters are combined with a spectral clustering algorithm to extract multi-document summaries. The approach integrates the sentence weights with a linear formula and extracts 400-word multi-document summaries. Experimental results on DUC2002, evaluated with the automatic summarization evaluation toolkit ROUGE, show that the proposed method effectively improves summary quality.
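The abstract does not spell out the sampler, so the following is only a minimal sketch of how word-given-topic and topic-given-sentence probabilities might be estimated. It assumes a standard collapsed Gibbs sampler with symmetric Dirichlet priors and treats each sentence as a short document. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy corpus are all illustrative assumptions, not details taken from the paper.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, not the paper's code).

    docs: list of sentences, each a list of integer word ids in [0, vocab_size).
    Returns (theta, phi): topic-given-sentence and word-given-topic probabilities.
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]               # topic counts per sentence
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                                # total tokens per topic
    z = []                                                # topic assignment per token

    # Assign every token a random initial topic and fill the count tables.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalised full conditional p(z = t | all other assignments).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and put the token back.
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Point estimates from the final counts, smoothed by the priors.
    theta = [
        [(n_dk[d][t] + alpha) / (len(doc) + num_topics * alpha) for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta) for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi


# Toy corpus (hypothetical): four sentences over a six-word vocabulary.
sentences = [[0, 1, 2, 0, 1], [0, 2, 1, 1], [3, 4, 5, 3], [4, 5, 5, 3, 4]]
theta, phi = lda_gibbs(sentences, num_topics=2, vocab_size=6)
```

Each row of `theta` is a topic distribution for one sentence and each row of `phi` is a word distribution for one topic. In a pipeline like the one the abstract describes, rows of `theta` could serve as sentence features for building the similarity matrix that spectral clustering operates on, though the paper's exact construction is not given here.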

Key words

Latent Dirichlet Allocation (LDA) / Gibbs sampling / spectral clustering / multi-document summary

Classification

Information Technology and Security Science

Cite this article

Fu Ling, Zhang Hui. Multi-document summary using LDA and spectral clustering [J]. Computer Engineering and Applications, 2013, (16): 142-145, 154, 5.

Funding

National High Technology Research and Development Program of China (863 Program) (No. 2007AA01Z151).

Computer Engineering and Applications

OA / CSCD / CSTPCD

ISSN 1002-8331
