信息安全研究 (Journal of Information Security Research), 2025, Vol. 11, Issue 5: 412-419 (8 pages). DOI: 10.12379/j.issn.2096-1057.2025.05.03
Research on Malware Detection Based on Word Embedding and Feature Fusion
Abstract
Traditional methods are limited in feature extraction and representation: they cannot capture the spatial and temporal features of API sequences simultaneously, and they miss the key features that determine the target task. To address these limitations, a malware detection method based on word embedding and feature fusion is proposed. First, word embedding techniques from natural language processing are used to encode API sequences and obtain their semantic feature representations. Then, multiple convolutional networks and Bi-LSTM networks extract the n-gram local spatial features and the temporal features of the API sequences, respectively. Finally, a self-attention mechanism deeply fuses the captured features at critical positions, so that classification is performed on a representation of deep malicious behavior features. Experimental results show that in binary classification tasks the method reaches an accuracy of 94.79%, an average improvement of 12.37% over traditional machine learning algorithms and of 5.78% over deep learning algorithms. In multi-class classification tasks the model reaches an accuracy of 91.95%, effectively improving malware detection accuracy.
Keywords
malware detection/software call sequence/multiple convolutional networks/long short-term memory network/feature fusion
Classification
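Information Technology and Security Science
Cite this article
师智斌, 孙文琦, 窦建民, 于孟洋. Research on Malware Detection Based on Word Embedding and Feature Fusion [J]. 信息安全研究, 2025, 11(5): 412-419.
Funding
Open Project of the Key Laboratory of Information Network Security, Ministry of Public Security (Third Research Institute of the Ministry of Public Security), grant C23600-06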
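The pipeline outlined in the abstract (embed API calls, extract n-gram features, fuse with self-attention) can be illustrated with a minimal pure-Python sketch. This is not the authors' model. It assumes random embeddings in place of trained ones, window averaging as a stand-in for learned convolution filters, and self-attention with Q = K = V and no learned projections. It omits the Bi-LSTM branch and the classifier entirely. The API names, function names, and dimensions are all hypothetical.

```python
import math
import random

def build_vocab(sequences):
    """Map each distinct API name to an integer id (0 is reserved for padding)."""
    vocab = {"<pad>": 0}
    for seq in sequences:
        for api in seq:
            vocab.setdefault(api, len(vocab))
    return vocab

def embed(seq, vocab, table):
    """Look up one embedding vector per API call."""
    return [table[vocab[api]] for api in seq]

def ngram_features(vectors, n):
    """Average each window of n consecutive vectors.
    Assumption: a stand-in for a convolution with kernel size n (no learned weights)."""
    dim = len(vectors[0])
    return [
        [sum(v[d] for v in vectors[i:i + n]) / n for d in range(dim)]
        for i in range(len(vectors) - n + 1)
    ]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors (simplified)."""
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)])
    return out

def fuse(seq, vocab, table, kernel_sizes=(2, 3)):
    """Embed the sequence, gather n-gram features per kernel size, attend, mean-pool."""
    emb = embed(seq, vocab, table)
    feats = [f for n in kernel_sizes for f in ngram_features(emb, n)]
    attended = self_attention(feats)
    dim = len(attended[0])
    return [sum(v[d] for v in attended) / len(attended) for d in range(dim)]

# Illustrative input: one short API call sequence (hypothetical sample).
rng = random.Random(0)
seqs = [["CreateFileW", "WriteFile", "CloseHandle", "RegSetValueExW"]]
vocab = build_vocab(seqs)
table = [[rng.uniform(-1, 1) for _ in range(4)] for _ in vocab]  # random 4-d embeddings
vec = fuse(seqs[0], vocab, table)  # fixed-size vector a classifier could consume
```

In a trained model the embedding table, convolution filters, and attention projections would be learned jointly, and the recurrent branch would supply the temporal features that this sketch leaves out.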