Modern Electronics Technique (现代电子技术), 2024, Vol. 47, Issue 9: 97-103, 7. DOI: 10.16652/j.issn.1004-373x.2024.09.018
Research on illegal website identification based on textual-visual multi-feature fusion
Abstract
Illegal websites are highly concealed and potentially very harmful, and identification methods that rely on a single feature cannot cope with this complexity effectively. To address this, an illegal website identification method based on textual-visual multi-feature fusion is proposed. First, a visual feature extraction model based on ResNet-18 and a textual feature extraction model based on BERT-CNN are built. Next, a fusion algorithm based on logistic regression (LR) integrates the classification results of the two models. The final identification model is obtained through multiple training iterations. Experimental results show that the accuracy of the fusion model is 4% and 11% higher than that of the models relying only on textual features or only on visual features, respectively, so the proposed model identifies illegal websites more accurately.
Key words
illegal website identification / multi-feature fusion / BERT / ResNet / CNN / deep learning
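Classification
Information technology and security science
Cite this article
熊凌龙, 何月顺, 陈杰, 杜萍, 韩鑫豪. Research on illegal website identification based on textual-visual multi-feature fusion [J]. Modern Electronics Technique, 2024, 47(9): 97-103, 7.
Funding
Key Research and Development Program of Jiangxi Province (20224BBC41001)
Open Fund of the Jiangxi Provincial Key Laboratory of Intelligent Perception for Cyberspace Security (JKLGIP202206)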
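Illustrative sketch
The abstract states that a logistic regression (LR) fusion step integrates the classification results of the textual and visual models. The following is a minimal sketch of that kind of late fusion, not the authors' implementation. It assumes each base model outputs one probability that a site is illegal. The function names (`fit_fusion`, `fuse`), the hyperparameters, and all scores and labels are hypothetical and invented for illustration. The BERT-CNN and ResNet-18 models themselves are not shown.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_fusion(scores, labels, lr=0.5, epochs=2000):
    """Fit LR weights on per-model scores with batch gradient descent.

    scores: list of (p_text, p_visual) tuples from the two base models.
    labels: list of 0/1 ground-truth labels (1 = illegal).
    """
    n_features = len(scores[0])
    n_samples = len(scores)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(scores, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def fuse(weights, bias, x):
    """Return the fused probability that a site is illegal."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical (p_text, p_visual) scores; not data from the paper.
train_scores = [(0.9, 0.8), (0.8, 0.3), (0.7, 0.9), (0.6, 0.7),
                (0.2, 0.1), (0.3, 0.6), (0.1, 0.3), (0.4, 0.2)]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_fusion(train_scores, train_labels)
print("fusion weights:", weights, "bias:", bias)
print("fused score for (0.9, 0.9):", fuse(weights, bias, (0.9, 0.9)))
print("fused score for (0.1, 0.1):", fuse(weights, bias, (0.1, 0.1)))
```

Fusing at the decision level in this way keeps the two base models independent, so either one could be retrained or replaced without changing the fusion code. Only the small LR layer would need refitting.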