
A Short Text Classification Method Based on Supervised Biterm Topic Model (Open Access)

Abstract

To address the semantic sparsity and ambiguity of short texts, this paper proposes a supervised biterm topic model (Su-BTM) and applies it to short text classification. Building on the BTM topic model, it introduces a topic-category distribution parameter that captures topic-category semantic information and establishes an accurate mapping between topics and categories. It also proposes a Su-BTM-Gibbs topic sampling method that samples the latent topic of each word. Comparative experiments on two datasets of Chinese and English short texts show that the method achieves better classification performance than classical models.
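As a rough illustration of the base model the abstract builds on, the sketch below extracts biterms (unordered word pairs co-occurring in one short text) and runs collapsed Gibbs sampling for plain, unsupervised BTM. It is not the paper's Su-BTM-Gibbs procedure: this page does not give the form of the topic-category distribution parameter, so that supervised part is left out. The function names `extract_biterms` and `btm_gibbs`, the hyperparameter defaults, and the toy documents are all assumptions made for illustration.

```python
import random
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def btm_gibbs(docs, num_topics, alpha=1.0, beta=0.01, sweeps=50, seed=0):
    """Collapsed Gibbs sampling for plain (unsupervised) BTM.

    Assumed, illustrative interface; returns
    (assignments, topic_counts, word_topic_counts).
    """
    rng = random.Random(seed)
    biterms = [b for doc in docs for b in extract_biterms(doc)]
    vocab_size = len({w for doc in docs for w in doc})

    topic_counts = [0] * num_topics                  # biterms assigned to each topic
    word_topic = [dict() for _ in range(num_topics)]  # word counts within each topic

    def update(k, biterm, delta):
        """Add or remove one biterm from topic k's counts."""
        topic_counts[k] += delta
        for w in biterm:
            word_topic[k][w] = word_topic[k].get(w, 0) + delta

    # Random initial topic for every biterm in the corpus.
    assignments = []
    for b in biterms:
        k = rng.randrange(num_topics)
        assignments.append(k)
        update(k, b, +1)

    for _ in range(sweeps):
        for i, (w1, w2) in enumerate(biterms):
            update(assignments[i], (w1, w2), -1)  # exclude the current biterm
            weights = []
            for k in range(num_topics):
                # Each biterm contributes two words to its topic.
                total = 2 * topic_counts[k] + vocab_size * beta
                weights.append(
                    (topic_counts[k] + alpha)
                    * (word_topic[k].get(w1, 0) + beta)
                    * (word_topic[k].get(w2, 0) + beta)
                    / (total * (total + 1))
                )
            k = rng.choices(range(num_topics), weights=weights)[0]
            assignments[i] = k
            update(k, (w1, w2), +1)
    return assignments, topic_counts, word_topic


# Toy usage with made-up, pre-tokenized short texts.
docs = [["apple", "banana", "fruit"], ["goal", "match", "team"], ["fruit", "juice"]]
assignments, topic_counts, word_topic = btm_gibbs(docs, num_topics=2, sweeps=5)
```

Because topics are sampled over corpus-level biterms rather than per-document word counts, the sparsity of any single short text matters less. A supervised variant would presumably add a category-dependent factor to the sampling weight, but its exact form would have to come from the paper itself.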

Wei Hongmin (卫红敏)

Shandong Huayu University of Technology, Dezhou, Shandong 253034

Computer Science and Automation

Keywords: semantic sparsity; BTM topic model; latent topic; short text classification

Modern Information Technology (现代信息科技), 2024, No. 10

Pages 56-59 (4 pages)

Funding: 2022 Science and Technology Plan Project of Shandong Huayu University of Technology (2022KJ13)

DOI: 10.19850/j.cnki.2096-4706.2024.10.012
