Journal of Information Security Research (信息安全研究), 2024, Vol. 10, Issue 3: 233-240, 8. DOI: 10.12379/j.issn.2096-1057.2024.03.06
A Chinese Spam Detection Method Based on Content and ERNIE3.0-CapsNet
Abstract
To address the inadequate word vector representation and limited feature extraction richness of current deep-learning-based Chinese spam recognition methods, this paper proposes an improved recognition model that integrates the ERNIE3.0 pre-training model with a capsule neural network, referred to as ERNIE3.0-CapsNet. For the text content of Chinese spam, we leverage ERNIE3.0 to generate a word vector matrix with rich semantics, drawing on the model's memory and reasoning capabilities. Subsequently, we employ the capsule neural network for feature extraction and classification. We enhance the structure of the capsule neural network by adopting GELU as the activation function of its dynamic routing, and we conduct comparative experiments on five groups of similar models and four groups of activation functions. On the open-source TREC06C Chinese email dataset, the proposed ERNIE3.0-CapsNet model exhibits strong overall performance, achieving an accuracy of 99.45%. The experimental results show that ERNIE3.0-CapsNet outperforms methods such as ERNIE3.0-TextCNN and ERNIE3.0-RNN, confirming the model's effectiveness in Chinese spam recognition.
Key words
Chinese spam/ERNIE3.0/capsule neural network/activation function/text classification
Classification
General industrial technology
Cite this article
单晨棱, 张新有, 邢焕来, 冯力. A Chinese Spam Detection Method Based on Content and ERNIE3.0-CapsNet [J]. Journal of Information Security Research, 2024, 10(3): 233-240, 8.
Funding
National Natural Science Foundation of China (62172342)
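Illustrative sketch
The abstract mentions a capsule network whose dynamic routing uses GELU as its activation function, but it does not say how GELU is wired in. The NumPy sketch below is therefore only one possible reading, not the authors' implementation: it runs standard routing-by-agreement and lets the caller swap the usual squash nonlinearity for an element-wise GELU. The function names, tensor shapes, iteration count, and random inputs are all assumptions made for illustration; in the paper's pipeline the prediction vectors would be derived from ERNIE3.0 word vectors rather than random numbers.

```python
import numpy as np


def gelu(x):
    """GELU, tanh approximation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def squash(s, eps=1e-9):
    """Standard capsule squash: keeps direction, maps vector length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


def dynamic_routing(u_hat, num_iters=3, activation=squash):
    """Routing-by-agreement over prediction vectors.

    u_hat: shape (num_in, num_out, dim_out), one prediction per
           (input capsule, output capsule) pair.
    Returns output capsules of shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                 # routing logits
    for _ in range(num_iters):
        c = softmax(b, axis=1)                      # coupling coefficients per input capsule
        s = np.einsum("ij,ijd->jd", c, u_hat)       # weighted sum for each output capsule
        v = activation(s)                           # squash (standard) or GELU (assumed variant)
        b = b + np.einsum("ijd,jd->ij", u_hat, v)   # raise logits where prediction agrees with output
    return v


# Hypothetical toy input: 6 input capsules, 2 output classes (spam / non-spam), 4-dim capsules.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 2, 4))

v_squash = dynamic_routing(u_hat, activation=squash)
v_gelu = dynamic_routing(u_hat, activation=gelu)
print(v_squash.shape, v_gelu.shape)
```

With squash, each output capsule's length stays below 1 and can be read as a class score. An element-wise GELU does not bound the length, so a model using it this way would need some other mapping from capsule outputs to class scores; how the paper handles this is not stated in the abstract.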