Computer Technology and Development, 2024, Vol. 34, Issue 7: 62-68, 7. DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0103
Cross-architecture Binary Code Similarity Analysis Based on Data Dependencies
Abstract
Binary Code Similarity Detection (BCSD) plays a pivotal role in applications such as reverse engineering, vulnerability detection, malware analysis, software plagiarism detection, and patch analysis. Most existing research has focused on control-flow embeddings of binary functions and on low-level code embedding techniques based on Natural Language Processing (NLP). However, during actual execution a function carries not only control-flow information but also data-flow semantic information, so a comprehensive abstraction of a function's semantic features is crucial. To this end, we introduce BS-DD, a framework for assessing binary function similarity that integrates both control-flow and data dependency relationships. We extract semantic information by simulating the execution of binary code, apply a simplification algorithm to construct a data dependency graph, and then use graph neural networks for similarity assessment. We compile seven widely used open-source software packages in various combinations and design three distinct task scenarios, including a real-world vulnerability detection experiment, to compare BS-DD with the latest data-flow-based BCSD methods. Experimental results show significant improvements in recall and Mean Reciprocal Rank (MRR) for BS-DD, and in the real-world vulnerability detection scenario BS-DD consistently outperforms the other methods.
Keywords
binary / data dependency / similarity detection / graph neural network / semantic information / vulnerability detection
Classification
Information Technology and Security Science
Cite this article
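Zhang Jican, Yao Kunbin, Xue Lei, Wang Chen, Nie Liming. Cross-architecture Binary Code Similarity Analysis Based on Data Dependencies [J]. Computer Technology and Development, 2024, 34(7): 62-68, 7.
Funding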
National Natural Science Foundation of China (62002306, 61972359)
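
Note on the evaluation metrics: the abstract reports recall and Mean Reciprocal Rank (MRR) without defining them. The sketch below shows how these two metrics are commonly computed for a function-search task. It is an illustration only, not the authors' evaluation code: the function names, the toy candidate lists, and the assumption of exactly one correct match per query are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Each query is (ranked candidate names, name of the one true match).
# Assumption: exactly one correct match per query, so recall@k is a hit rate.
Query = Tuple[Sequence[str], str]


def reciprocal_rank(ranked: Sequence[str], target: str) -> float:
    """Return 1/rank of target in ranked (1-based), or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries: List[Query]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(r, t) for r, t in queries) / len(queries)


def recall_at_k(queries: List[Query], k: int) -> float:
    """Fraction of queries whose true match is among the top-k candidates."""
    hits = sum(1 for r, t in queries if t in r[:k])
    return hits / len(queries)


# Hypothetical toy data: three queries against a small candidate pool.
queries: List[Query] = [
    (["f_arm", "g_arm", "h_arm"], "f_arm"),  # true match ranked 1st
    (["g_arm", "f_arm", "h_arm"], "f_arm"),  # true match ranked 2nd
    (["g_arm", "h_arm", "k_arm"], "f_arm"),  # true match not retrieved
]

mrr = mean_reciprocal_rank(queries)
recall_1 = recall_at_k(queries, 1)
recall_2 = recall_at_k(queries, 2)
```

With this toy data the reciprocal ranks are 1, 1/2, and 0, so the MRR is 0.5, while recall@1 is 1/3 and recall@2 is 2/3.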