Journal of Information Security Research (信息安全研究), 2024, Vol. 10, Issue 3: 233-240 (8 pages). DOI: 10.12379/j.issn.2096-1057.2024.03.06

A Chinese Spam Detection Method Based on Content and ERNIE3.0-CapsNet

Shan Chenling (单晨棱)¹, Zhang Xinyou (张新有)², Xing Huanlai (邢焕来)², Feng Li (冯力)³

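Author information

1. Tangshan Institute, Southwest Jiaotong University, Tangshan, Hebei 063000
2. Tangshan Institute, Southwest Jiaotong University, Tangshan, Hebei 063000; School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756
3. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756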

Abstract

To address the inadequate word vector representation and limited feature extraction of current deep-learning-based Chinese spam recognition methods, this paper proposes an improved recognition model, ERNIE3.0-CapsNet, which integrates the ERNIE3.0 pre-trained model with a capsule neural network. For the text content of Chinese spam, we use ERNIE3.0 to generate a semantically rich word vector matrix that benefits from the model's strong memory and reasoning capabilities. We then employ the capsule neural network for feature extraction and classification. We improve the structure of the capsule neural network by adopting GELU as the activation function in its dynamic routing, and conduct comparative experiments across five groups of similar models and four groups of activation functions. On the open-source TREC06C Chinese email dataset, the proposed ERNIE3.0-CapsNet model shows strong overall performance, achieving an accuracy of 99.45%. The experimental results show that ERNIE3.0-CapsNet outperforms methods such as ERNIE3.0-TextCNN and ERNIE3.0-RNN, confirming the model's effectiveness in Chinese spam recognition.
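The abstract does not say exactly how GELU enters the dynamic routing, so the following is only a minimal sketch under stated assumptions: it implements the standard routing-by-agreement loop with a pluggable activation, and assumes GELU is applied to each component of the output capsule vector in place of the usual squash function. The function names (`dynamic_routing`, `gelu_vec`, `squash`) and the toy prediction vectors are hypothetical and are not taken from the paper.

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def squash(s):
    """Standard capsule nonlinearity: keeps direction, maps length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def gelu_vec(s):
    """ASSUMPTION: GELU applied to each component of the capsule vector."""
    return [gelu(x) for x in s]

def dynamic_routing(u_hat, activation, iterations=3):
    """Routing-by-agreement with a pluggable activation.

    u_hat[i][j] is the prediction vector from input capsule i
    for output capsule j. Returns one vector per output capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(iterations):
        # Coupling coefficients: softmax over output capsules for each input.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of the predictions routed to output capsule j.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(activation(s_j))
        # Agreement update: raise logits where prediction and output align.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v

# Toy input: 2 input capsules, 2 output capsules (e.g. spam / ham), dimension 2.
u_hat = [
    [[1.0, 0.5], [0.1, -0.2]],
    [[0.9, 0.6], [-0.1, 0.2]],
]
v_gelu = dynamic_routing(u_hat, gelu_vec)
v_squash = dynamic_routing(u_hat, squash)
```

Passing the activation as an argument makes it straightforward to compare alternatives in the same routing loop, in the spirit of the activation-function comparison the abstract mentions. In the proposed model the prediction vectors would be derived from ERNIE3.0 features rather than written by hand.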

Key words

Chinese spam; ERNIE3.0; capsule neural network; activation function; text classification

Classification

General industrial technology

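Cite this article

Shan Chenling, Zhang Xinyou, Xing Huanlai, Feng Li. A Chinese Spam Detection Method Based on Content and ERNIE3.0-CapsNet[J]. Journal of Information Security Research, 2024, 10(3): 233-240, 8.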

Funding

National Natural Science Foundation of China (62172342)

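Journal of Information Security Research (信息安全研究)

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD

ISSN: 2096-1057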
