Sensitive Word Recognition Model Based on Improved Weighted LDA Model

Zeng Ling (曾玲) 1, Lin Tianyu (林天余) 1, He Qiuxia (何秋霞) 1, Chen Ying (陈莹) 1, Hu Juanjuan (胡娟娟) 1

兵工自动化 (Ordnance Industry Automation), 2025, Vol. 44, Issue 6: 28-32, 5. DOI: 10.7690/bgzdh.2025.06.007

Author information

  • 1. Hainan Power Grid Co., Ltd., China Southern Power Grid Co., Ltd., Haikou 570203, China

Abstract

To address the complex data and low prediction accuracy of current Internet topic recognition, this paper proposes a sensitive word recognition model based on an improved weighted latent Dirichlet allocation (LDA) model. First, a corpus of sensitive words in a specific field is established. Second, to improve the efficiency of identifying sensitive information topics, a coarse-grained text classification is applied to the corpus. Third, a weighting model is proposed that increases the distribution weight of words with low co-occurrence frequency but obvious sensitive characteristics, so that more words with low-frequency implicit relations can be found. The proposed model is verified on data crawled from mainstream news websites. The results show that it can identify and extract more detailed sensitive information topics from each text category, and the simulation results further verify its effectiveness and accuracy.
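The abstract does not give the weighting formula, so the following is only a minimal sketch of one common way to weight words in LDA, not the authors' method. It assumes a per-word multiplicative weight applied to the token counts inside a collapsed Gibbs sampler, so that a rare but sensitive word contributes more than 1 to the topic statistics. The names `weighted_lda` and `word_weight`, the hyperparameter values, and the toy documents are all hypothetical.

```python
import random


def weighted_lda(docs, num_topics, word_weight, iters=200,
                 alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA with weighted token counts.

    Assumption (not from the paper): each token of word w adds
    word_weight.get(w, 1.0) to the count tables instead of 1.
    Returns (vocab, phi, n_kw), where phi[k][i] is the estimated
    probability of vocab[i] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    weights = [word_weight.get(w, 1.0) for w in vocab]
    size = len(vocab)

    n_dk = [[0.0] * num_topics for _ in docs]      # document-topic mass
    n_kw = [[0.0] * size for _ in range(num_topics)]  # topic-word mass
    n_k = [0.0] * num_topics                       # total mass per topic
    assign = []                                    # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            i, k = index[w], rng.randrange(num_topics)
            n_dk[d][k] += weights[i]
            n_kw[k][i] += weights[i]
            n_k[k] += weights[i]
            row.append(k)
        assign.append(row)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for pos, w in enumerate(doc):
                i, k = index[w], assign[d][pos]
                wt = weights[i]
                # Remove the token's weighted contribution.
                n_dk[d][k] -= wt
                n_kw[k][i] -= wt
                n_k[k] -= wt
                # Conditional topic distribution for this token.
                probs = [
                    (n_dk[d][t] + alpha) * (n_kw[t][i] + beta)
                    / (n_k[t] + size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=probs)[0]
                # Add the contribution back under the sampled topic.
                assign[d][pos] = k
                n_dk[d][k] += wt
                n_kw[k][i] += wt
                n_k[k] += wt

    phi = [
        [(n_kw[k][i] + beta) / (n_k[k] + size * beta) for i in range(size)]
        for k in range(num_topics)
    ]
    return vocab, phi, n_kw


if __name__ == "__main__":
    # Toy corpus; "leak" is treated as a low-frequency sensitive word.
    toy_docs = [
        ["leak", "password", "server"],
        ["password", "server", "login"],
        ["match", "goal", "team"],
        ["team", "goal", "leak"],
    ]
    vocab, phi, _ = weighted_lda(toy_docs, num_topics=2,
                                 word_weight={"leak": 3.0})
    for k, row in enumerate(phi):
        top = sorted(zip(row, vocab), reverse=True)[:3]
        print(k, [w for _, w in top])
```

With a weight above 1, the boosted word carries more mass in every topic it is assigned to, which is one plausible reading of "increasing the distribution weight" of low-frequency sensitive words. How the weights themselves are chosen is left open here.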

Key words

topic identification/sensitive words/natural language processing/latent Dirichlet allocation

Classification

Information technology and security science

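Cite this article

Zeng Ling, Lin Tianyu, He Qiuxia, Chen Ying, Hu Juanjuan. Sensitive Word Recognition Model Based on Improved Weighted LDA Model[J]. 兵工自动化, 2025, 44(6): 28-32, 5.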

兵工自动化 (Ordnance Industry Automation)

Open access; Peking University Core Journal (北大核心)

ISSN 1006-1576
