A Continuous-time Topic Model for Word Burstiness

Liu Liangxuan (刘良选), Huang Mengxing (黄梦醒)

Computer Engineering, 2016, Vol. 42, Issue 11: 195-201 (7 pages). DOI: 10.3969/j.issn.1000-3428.2016.11.032

Liu Liangxuan 1, Huang Mengxing 1

Author information

  • 1. College of Information Science and Technology, Hainan University, Haikou 570228

Abstract

Traditional topic models based on the multinomial distribution cannot properly capture word burstiness. To address this, a continuous-time topic model based on the Dirichlet Compound Multinomial (DCM) distribution is proposed, which also incorporates the temporal information inherent in the corpus. In this model, word burstiness is modeled by the DCM distribution, while temporal features are characterized by the Beta distribution. Gibbs sampling and a fixed-point iteration method are used to estimate the model parameters. Experimental results show that, when the given number of topics is small, the model generalizes better than ToT and DCMLDA, and that it can also effectively reveal the latent evolution of topics in the corpus.
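The two modeling ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the four-word vocabulary, and all parameter values below are assumptions chosen only for illustration. The sketch compares how a fixed multinomial and a DCM with small parameters score a "bursty" document (one word repeated) against a "spread" document (each word once), and evaluates a Beta log-density on a timestamp normalized to (0, 1).

```python
import math

def dcm_seq_log_prob(counts, alpha):
    """Log-probability of one particular word sequence with these counts
    under a Dirichlet Compound Multinomial with parameter vector alpha."""
    n, a = sum(counts), sum(alpha)
    log_p = math.lgamma(a) - math.lgamma(a + n)
    for x, al in zip(counts, alpha):
        log_p += math.lgamma(al + x) - math.lgamma(al)
    return log_p

def multinomial_seq_log_prob(counts, theta):
    """Log-probability of the same sequence under a fixed multinomial."""
    return sum(x * math.log(p) for x, p in zip(counts, theta) if x > 0)

def beta_log_pdf(t, a, b):
    """Log-density of Beta(a, b) at a timestamp t normalized to (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t)

# Illustrative values only (hypothetical, not taken from the paper).
alpha = [0.1, 0.1, 0.1, 0.1]      # small DCM parameters, uniform mean
theta = [0.25, 0.25, 0.25, 0.25]  # multinomial with the same mean
bursty = [4, 0, 0, 0]             # one word repeated four times
spread = [1, 1, 1, 1]             # four different words, once each

print("multinomial:", multinomial_seq_log_prob(bursty, theta),
      multinomial_seq_log_prob(spread, theta))
print("DCM:        ", dcm_seq_log_prob(bursty, alpha),
      dcm_seq_log_prob(spread, alpha))
print("Beta(2, 5) log-density at t=0.2:", beta_log_pdf(0.2, 2.0, 5.0))
```

With these assumed values the multinomial scores both documents identically, whereas the DCM assigns the bursty document a much higher probability than the spread one, which is the behavior the term "word burstiness" refers to.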

Key words

topic model / Latent Dirichlet Allocation (LDA) / word burstiness / Dirichlet Compound Multinomial (DCM) / Gibbs sampling / fixed-point iteration method

Classification

Information Technology and Security Science

Cite this article

Liu Liangxuan, Huang Mengxing. A Continuous-time Topic Model for Word Burstiness[J]. Computer Engineering, 2016, 42(11): 195-201.

Funding

National Natural Science Foundation of China (61462022).

Computer Engineering

Open access; indexed in Peking University Core Journals (北大核心), CSCD, CSTPCD

ISSN 1000-3428
