Computer Engineering and Applications, 2016, Vol. 52, Issue 6: 61-66, 6. DOI: 10.3778/j.issn.1002-8331.1405-0113
Microblog hot topic detection based on positive point mutual information and probabilistic topic model
Abstract
To address the feature sparsity of short text messages in microblog hot topic detection, this paper proposes a hot topic detection method that combines term mutual information with a probabilistic topic model. Symmetric Nonnegative Matrix Factorization (sNMF) is performed on the word co-occurrence matrix built from word mutual information, and the term-topic matrix is then inferred. A Probabilistic Latent Semantic Analysis (pLSA) model is then adopted to model the topic-microblog relationship. Finally, the hotness of each topic is analyzed and the topics are ranked. Experiments show that this method can effectively cluster and detect hot topics.
Key words
term co-occurrence matrix/symmetric nonnegative matrix factorization/probabilistic latent semantic analysis/microblog hot topic detection
Classification
Information Technology and Security Science
Cite this article
SUN Yuexin (孙曰昕), MA Huifang (马慧芳), YAO Wei (姚伟), ZHANG Zhichang (张志昌). Microblog hot topic detection based on positive point mutual information and probabilistic topic model [J]. Computer Engineering and Applications, 2016, 52(6): 61-66, 6.
Funding
National Natural Science Foundation of China (No.61163039, No.61363058); Project of the Education Department of Gansu Province (No.2013A-016).
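Illustrative sketch
The first two steps named in the abstract (a word co-occurrence matrix weighted by positive mutual information, followed by symmetric nonnegative matrix factorization) can be sketched roughly as below. This is only a minimal illustration under stated assumptions, not the authors' implementation: co-occurrence is counted at the level of whole microblogs, the factorization uses a common damped multiplicative update for A ≈ HHᵀ, and the names `ppmi_matrix` and `symmetric_nmf` are hypothetical. The pLSA step and the hotness ranking are not covered.

```python
import numpy as np


def ppmi_matrix(docs):
    """Return (vocab, A), where A is a symmetric positive-PMI matrix.

    Assumption: two words co-occur when they appear in the same document.
    """
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_docs = len(docs)

    # Binary document-term matrix: X[d, i] = 1 if word i occurs in document d.
    X = np.zeros((n_docs, len(vocab)))
    for d, doc in enumerate(docs):
        for w in set(doc):
            X[d, index[w]] = 1.0

    co = X.T @ X                    # co[i, j] = documents containing both i and j
    p_ij = co / n_docs              # joint occurrence probability
    p_i = np.diag(co) / n_docs      # marginal occurrence probability

    # PMI = log(p_ij / (p_i * p_j)); pairs that never co-occur give -inf.
    with np.errstate(divide="ignore"):
        pmi = np.log(p_ij / np.outer(p_i, p_i))

    # Keep only positive associations (this also maps -inf to 0).
    return vocab, np.maximum(pmi, 0.0)


def symmetric_nmf(A, k, n_iter=200, beta=0.5, seed=0, eps=1e-10):
    """Approximate a symmetric nonnegative A by H @ H.T with H >= 0.

    Uses a damped multiplicative update; row i of H can be read as the
    (unnormalized) affinity of word i to each of the k topics.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps   # eps guards against division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H
```

For a list of tokenized microblogs `docs`, calling `vocab, A = ppmi_matrix(docs)` followed by `H = symmetric_nmf(A, k)` yields one row per word; the largest entry in a row suggests the topic that word is most associated with. The number of topics `k` and the iteration count are free parameters chosen here for illustration.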