Automated deobfuscation and family classification system for Excel 4.0 macros

Li Chenguang, Yang Xiuzhang, Peng Guojun

Chinese Journal of Network and Information Security, 2024, Vol. 10, Issue (3): 66-80, 15. DOI: 10.11959/issn.2096-109x.2024040

Automated deobfuscation and family classification system for Excel 4.0 macros

Li Chenguang¹, Yang Xiuzhang¹, Peng Guojun¹

Author information

1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China

Abstract

In recent years, there has been a surge in cyber-attacks that leverage malicious Excel 4.0 macros (XLM) embedded in documents. Malicious XLM code often undergoes complex obfuscation, which makes it a substantial challenge for conventional analysis methods and detection systems to discern the actual functionality across a vast array of samples. Consequently, XLMRevealer, an automated system for deobfuscating XLM and extracting key Indicators of Compromise (IOCs), was developed to counter the diverse obfuscation strategies employed in malicious samples. XLMRevealer was built on abstract syntax trees and execution simulation, encompassing 138 comprehensive macro function handlers. On that basis, Word and Token features tailored to the peculiarities of XLM code were extracted, capturing multi-level, fine-grained features through feature fusion. XLMRevealer incorporated a CNN-BiLSTM model to discern familial correlations across dimensions, facilitating family classification. Finally, a dataset comprising 2346 samples from five distinct sources was constructed for both deobfuscation and family classification experiments. Results indicated that XLMRevealer achieved a 71.3% deobfuscation success rate, outperforming XLMMacroDeobfuscator and SYMBEXCEL by 20.8% and 15.8%, respectively. Its efficiency was stable, with an average processing time of only 0.512 seconds. The family classification accuracy for deobfuscated XLM codes stood at 94.88%, surpassing all baseline models and underscoring the efficacy of Word and Token feature integration. Furthermore, to assess the impact of deobfuscation on family classification and account for variability in obfuscation techniques across families, experiments were conducted on both the original and uniformly obfuscated XLM codes. The accuracies were 89.58% and 53.61%, respectively, demonstrating the model's capability to learn obfuscation features and confirming the significant enhancement deobfuscation provides for family classification.
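To make the idea of deobfuscation through an abstract syntax tree plus execution simulation more concrete, the following is a minimal illustrative sketch in Python. It is not XLMRevealer's implementation: the names `HANDLERS`, `tokenize`, `parse`, `evaluate`, and `deobfuscate` are hypothetical, only three function handlers are included (the abstract reports 138), and only string concatenation with `&` and simple function calls are supported.

```python
import re

# Hypothetical handler table: each XLM function name maps to a Python callable
# that simulates it. These three are illustrative assumptions only.
HANDLERS = {
    "CHAR": lambda n: chr(int(n)),
    "MID": lambda s, start, length: str(s)[int(start) - 1:int(start) - 1 + int(length)],
    "LEN": lambda s: len(str(s)),
}

TOKEN_RE = re.compile(
    r'\s*(?:(?P<num>\d+)|"(?P<str>(?:[^"]|"")*)"|(?P<name>[A-Za-z_.]+)|(?P<op>[&(),]))'
)

def tokenize(formula):
    """Split a formula such as =CHAR(104)&"x" into (kind, value) tokens."""
    text = formula.lstrip("=").rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "str":
            value = value.replace('""', '"')  # Excel escapes quotes by doubling
        tokens.append((kind, value))
        pos = m.end()
    return tokens

def parse(tokens):
    """Build a small AST: ('num', n), ('str', s), ('call', name, args), ('concat', a, b)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def term():
        kind, value = take()
        if kind == "num":
            return ("num", int(value))
        if kind == "str":
            return ("str", value)
        if kind == "name" and take() == ("op", "("):
            args = []
            if peek() != ("op", ")"):
                args.append(expr())
                while peek() == ("op", ","):
                    take()
                    args.append(expr())
            if take() != ("op", ")"):
                raise ValueError("expected ')'")
            return ("call", value.upper(), args)
        raise ValueError(f"unexpected token {value!r}")

    def expr():
        node = term()
        while peek() == ("op", "&"):
            take()
            node = ("concat", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return tree

def evaluate(node):
    """Simulate execution of the AST by dispatching calls to the handler table."""
    kind = node[0]
    if kind in ("num", "str"):
        return node[1]
    if kind == "concat":
        return str(evaluate(node[1])) + str(evaluate(node[2]))
    name, args = node[1], node[2]
    if name not in HANDLERS:
        raise NotImplementedError(f"no handler for {name}")
    return HANDLERS[name](*[evaluate(arg) for arg in args])

def deobfuscate(formula):
    return evaluate(parse(tokenize(formula)))

print(deobfuscate('=CHAR(104)&CHAR(116)&"tp"&MID("xx://yy",3,3)'))  # prints http://
```

In this sketch a function without a handler raises `NotImplementedError`, which hints at why handler coverage would matter for a deobfuscation success rate; a real system would also need cell references, control flow, and many more functions.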

Key words

malicious macro document / Excel 4.0 macro / deobfuscation / family classification

Classification

Information technology and security science

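Cite this article

Li Chenguang, Yang Xiuzhang, Peng Guojun. Automated deobfuscation and family classification system for Excel 4.0 macros[J]. Chinese Journal of Network and Information Security, 2024, 10(3): 66-80, 15.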

Funding

The National Natural Science Foundation of China (62172308, U1636107, 61972297, 62172144); Cyber Security School Student Innovation Funding Program of the Office of the Central Cyberspace Affairs Commission

Chinese Journal of Network and Information Security

OA / CSTPCD

ISSN: 2096-109X
