Modern Electronics Technique (现代电子技术), 2026, Vol. 49, Issue 5: 83-88, 96, 7. DOI: 10.16652/j.issn.1004-373x.2026.05.013
CharacterBERT-based malicious URL detection model
Abstract
Traditional URL detection methods that rely on blacklists and heuristic rules struggle with new URL variants. Although the BERT (bidirectional encoder representations from transformers) model has been introduced into malicious URL detection, it still suffers from vocabulary dependence, poor handling of out-of-vocabulary terms, and insufficient semantic granularity. To address these issues, this paper introduces a malicious URL detection model that integrates CharacterBERT with URL structural features. In the model, a character-level convolutional neural network (CharacterCNN) eliminates the dependency on a predefined vocabulary, and deformable convolution kernels extract finer-grained semantic information. Additionally, a gated fusion network unit integrates structural features such as subdomain count, sensitive words, and URL length to improve the identification of malicious URLs. Experimental results on the Grambeddings and kaggle_1 datasets demonstrate the model's superior performance, with F1-scores of 97.88% and 99.83%, respectively. Overall, the proposed model shows strong detection performance and high application value in practical security scenarios.
Key words
CharacterBERT/feature fusion/malicious URL detection/cybersecurity/CharacterCNN/pyramid attention
Classification
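Information Technology and Security Science
Cite this article
王旭, 李松朔, 姜久雷, 乐德广. CharacterBERT-based malicious URL detection model [J]. Modern Electronics Technique, 2026, 49(5): 83-88, 96, 7.
Funding
National Natural Science Foundation of China, Regional Science Fund Project (61762002)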
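Illustrative note
The abstract names three structural features (subdomain count, sensitive words, URL length) and a gated fusion unit, but this page gives no formulas for either. The Python sketch below is only an illustration of the general idea, not the authors' implementation. The keyword list `SENSITIVE_WORDS`, the function names, the "last two labels form the registered domain" rule, and the gate form g = sigmoid(W[h; s] + b) are all assumptions introduced here. The gate also assumes the structural vector has already been projected to the same length as the semantic vector.

```python
import math
from urllib.parse import urlsplit

# Hypothetical keyword list; the paper's actual sensitive-word list is not given here.
SENSITIVE_WORDS = ("login", "verify", "account", "secure", "update", "bank")


def structural_features(url: str) -> list[float]:
    """Return [subdomain count, sensitive-word count, URL length] for one URL."""
    # urlsplit only fills in the host when a scheme separator is present.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    labels = [label for label in host.split(".") if label]
    if labels and all(label.isdigit() for label in labels):
        # Dotted IPv4 host: it has no subdomains.
        subdomain_count = 0
    else:
        # Naive assumption: the last two labels are the registered domain.
        # This is wrong for suffixes such as "co.uk"; a public-suffix list would fix it.
        subdomain_count = max(len(labels) - 2, 0)
    lowered = url.lower()
    sensitive_count = sum(1 for word in SENSITIVE_WORDS if word in lowered)
    return [float(subdomain_count), float(sensitive_count), float(len(url))]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(semantic, structural, gate_w, gate_b):
    """Generic element-wise gate (assumed form, not taken from the paper).

    g_i   = sigmoid(gate_w[i] . [semantic; structural] + gate_b[i])
    out_i = g_i * semantic_i + (1 - g_i) * structural_i
    """
    if len(semantic) != len(structural):
        raise ValueError("semantic and structural vectors must have equal length")
    joint = list(semantic) + list(structural)
    fused = []
    for i, (h, s) in enumerate(zip(semantic, structural)):
        g = sigmoid(sum(w * x for w, x in zip(gate_w[i], joint)) + gate_b[i])
        fused.append(g * h + (1.0 - g) * s)
    return fused
```

Under these assumptions, `structural_features("http://secure.login.example.com/verify?id=1")` counts two subdomains and three keyword hits. With all gate weights and biases set to zero, every gate equals 0.5, so `gated_fusion` returns the element-wise mean of its two inputs. In a trained model the gate parameters would be learned, and the raw features would normally be scaled before fusion.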