Forgery Domain Name Detection with Multi-Scale Convolution and Attention Mechanism
Existing malicious domain name detection models mainly build classifiers on character and word features, which easily leads to false negatives on newly emerging or newly mutated forged domain names. To address this, a forged domain name detection algorithm based on multi-scale convolution and an attention mechanism is proposed. First, a Transformer encoder improved with a long short-term memory (LSTM) network captures fine-grained multi-scale features of the domain name string. Then, an attention mechanism fuses the multi-scale features to extract deeper feature information from the string along both the spatial and the time-sequence dimensions. Finally, a reinforcement learning algorithm is introduced to optimize the model end to end. Experiments on several open-source forged domain name datasets show that, in the binary classification of legitimate versus forged domain names, the proposed model achieves 98.03% accuracy, 97.91% precision, a 2.01% false positive rate (FPR), a 1.55% false negative rate (FNR), and a 98.18% F1-score. It also performs well in the multi-class classification of forged domain names by family.
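The five reported metrics all derive from the same confusion matrix, so they constrain one another. The following is a minimal Python sketch of the standard definitions, not code from the paper. It assumes that forged domain names are the positive class, which the abstract does not state explicitly, and the function names and example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Assumption: "forged" is the positive class, so tp counts forged
    domains flagged as forged and fp counts legitimate domains
    flagged as forged.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall equals 1 - FNR
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "fpr": fp / (fp + tn),  # legitimate domains wrongly flagged
        "fnr": fn / (fn + tp),  # forged domains missed
        "f1": 2 * precision * recall / (precision + recall),
    }


def f1_from_precision_and_fnr(precision, fnr):
    """F1 is the harmonic mean of precision and recall, with recall = 1 - FNR."""
    recall = 1.0 - fnr
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the paper's data).
    for name, value in classification_metrics(tp=90, fp=10, tn=80, fn=20).items():
        print(f"{name}: {value:.4f}")

    # Cross-check of the figures quoted in the abstract.
    print(f"F1 from reported precision and FNR: "
          f"{f1_from_precision_and_fnr(0.9791, 0.0155):.4f}")
```

Under that assumption, the reported precision (97.91%) and FNR (1.55%) give an F1-score of about 98.18%, which agrees with the value stated in the abstract.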
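MA Wei (马伟); XIE Liping (谢莉萍); HUI Qiaojuan (惠巧娟)
Department of Information and Computer Science, Xinhua College of Ningxia University, Yinchuan, Ningxia 750021; School of Information Engineering, Yinchuan University of Science and Technology, Yinchuan, Ningxia 750021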
Computer Science and Automation
forgery domain name; detection model; multi-scale convolution; attention mechanism; Transformer encoder
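Chinese Journal of Electron Devices (《电子器件》), 2024, No. 4
pp. 922-928 (7 pages)
Natural Science Foundation of Ningxia (2022AAC03642, 2023AAC03388); Industry-Education Integration Project of the Education Department of Ningxia (18SFZY29); Scientific Research Project of Ningxia Higher Education Institutions (NYG2024288)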