Application Research of Computers (计算机应用研究), 2025, Vol. 42, Issue 9: 2644-2652, 9. DOI: 10.19734/j.issn.1001-3695.2025.02.0035
多维度损失函数下引导词协同sinkhorn的监督式主题模型
Guided words and collaborative sinkhorn in supervised topic model under multi-dimensional loss function
Abstract
Topic models analyze large volumes of data to uncover latent thematic structures and semantic relationships, categorizing and generalizing data by theme and thereby improving information-processing efficiency. To address problems in existing topic models, such as the collapse of semantic coherence between generated topics, the lack of supervised labels, and the absence of guidance from seed words, this paper proposes a novel supervised topic model that combines a multidimensional loss function with seed-word-guided Sinkhorn optimization. The multidimensional loss function combines adaptive reconstruction loss, supervision loss, and conditional variational autoencoder loss as cooperating terms, and introduces regularization and normalization to mitigate topic semantic collapse. Furthermore, the model integrates seed words with the Sinkhorn algorithm in the loss function, which addresses the topic coverage problem, keeps the model focused on specific topics during generation, and enhances interpretability. Experimental results show that the model effectively addresses topic collapse and biased topic coverage, generating diverse and coherent topics and high-quality document-topic distributions, and consistently surpasses state-of-the-art baselines.
Key words
data mining/supervised learning/topic model/guide word/adaptive reconstruction
Classification
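Information Technology and Security Science
Cite this article
徐贞顺, 王振彪, 郑顺国, 苏梦瑶, 张文豪, 唐增金. Guided words and collaborative sinkhorn in supervised topic model under multi-dimensional loss function [J]. Application Research of Computers, 2025, 42(9): 2644-2652, 9.
Funding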
Ningxia Natural Science Foundation project (2021AAC03217)
Ningxia Key Research and Development Program (Key) project (2023BDE02001)
Yinchuan University-Enterprise Joint Innovation project (2022XQZD009)