Chinese Journal of Network and Information Security (网络与信息安全学报), 2024, Vol. 10, Issue (3): 66-80, 15. DOI: 10.11959/issn.2096-109x.2024040
Automated deobfuscation and family classification system for Excel 4.0 macros
Abstract
In recent years, there has been a surge in cyber-attacks that leverage malicious Excel 4.0 macros (XLM) within documents. Malicious XLM code often undergoes complex obfuscation, making it difficult for conventional analysis methods and detection systems to discern the actual functionality across a vast array of samples. Consequently, an automated system for deobfuscating XLM and extracting key Indicators of Compromise (IOCs), named XLMRevealer, was developed to counter the diverse obfuscation strategies employed in malicious samples. XLMRevealer was built upon abstract syntax trees and execution simulation, encompassing 138 comprehensive macro function handlers. On this basis, Word and Token features tailored to the peculiarities of XLM code were extracted, capturing multi-level, fine-grained features through feature fusion. XLMRevealer incorporated a CNN-BiLSTM model to discern familial correlations across dimensions, facilitating family classification. Finally, a dataset comprising 2346 samples from five distinct sources was constructed for both deobfuscation and family classification experiments. Results indicated that XLMRevealer achieved a 71.3% deobfuscation success rate, outperforming XLMMacroDeobfuscator and SYMBEXCEL by 20.8% and 15.8%, respectively. Its efficiency was stable, with an average processing time of only 0.512 seconds. The family classification accuracy for deobfuscated XLM codes stood at 94.88%, surpassing all baseline models and underscoring the efficacy of Word and Token feature integration. Furthermore, to assess the impact of deobfuscation on family classification and account for variability in obfuscation techniques across families, experiments were conducted on both the original and uniformly obfuscated XLM codes. The accuracies were 89.58% and 53.61%, respectively, demonstrating the model's capability to learn obfuscation features and confirming the significant enhancement deobfuscation provides for family classification.
Keywords
malicious macro document / Excel 4.0 macro / deobfuscation / family classification
Classification
Information technology and security science
Cite this article
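LI Chenguang (李晨光), YANG Xiuzhang (杨秀璋), PENG Guojun (彭国军). Automated deobfuscation and family classification system for Excel 4.0 macros[J]. Chinese Journal of Network and Information Security, 2024, 10(3): 66-80, 15.
Funding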
The National Natural Science Foundation of China (62172308, U1636107, 61972297, 62172144); Cyber Security School Student Innovation Funding Program of the Office of the Central Cyberspace Affairs Commission