
Cross-architecture Binary Code Similarity Analysis Based on Data Dependencies (基于数据依赖的跨架构二进制代码相似性分析)

张际灿 姚锟彬 薛磊 王晨 聂黎明

Computer Technology and Development (计算机技术与发展), 2024, Vol. 34, Issue 7: 62-68 (7 pages). DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0103

Cross-architecture Binary Code Similarity Analysis Based on Data Dependencies

张际灿 (1), 姚锟彬 (2), 薛磊 (2), 王晨 (1), 聂黎明 (3)

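Author information

  • 1. Wuhan Research Institute of Posts and Telecommunications, Wuhan, Hubei 430074
  • 2. Sun Yat-sen University (Shenzhen), Shenzhen, Guangdong 518107
  • 3. Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018; Nanyang Technological University, Singapore 699010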

Abstract

Binary code similarity detection (BCSD) plays a pivotal role in applications such as reverse engineering, vulnerability detection, malware analysis, software plagiarism detection, and patch analysis. Most prior research has focused on embedding the control flow of binary functions and on low-level code embedding techniques that draw on natural language processing (NLP). However, a function carries not only control-flow information but also data-flow semantics during its actual execution, so a comprehensive abstraction of a function's semantic features is crucial. We therefore introduce BS-DD, a framework for assessing binary function similarity that integrates both control-flow and data dependency relationships. We extract semantic information by simulating the execution of binary code, employ a simplification algorithm to construct a data dependency graph, and then use graph neural networks for similarity assessment. We compile seven widely used open-source software packages in various combinations and design three distinct task scenarios, including real-world vulnerability detection experiments, to compare BS-DD with the latest data-flow-based BCSD methods. Experimental results show significant improvements in recall and Mean Reciprocal Rank (MRR) for BS-DD, and in real-world vulnerability detection scenarios BS-DD consistently outperforms the other methods.
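The two metrics named in the abstract, recall and MRR, have standard definitions for retrieval-style function matching. The sketch below is illustrative only and is not taken from the paper: the function names, the `QUERIES` data, and the use of recall at a cutoff `k` are assumptions made for the example. Each query is a ranked candidate list (most similar first) paired with the one function that is the true match.

```python
from typing import List, Sequence, Tuple

# A query is (ranked candidate names, true match). Hypothetical structure.
Query = Tuple[Sequence[str], str]


def reciprocal_rank(ranked: Sequence[str], target: str) -> float:
    """Return 1/rank of `target` in `ranked` (1-based), or 0.0 if absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries: List[Query]) -> float:
    """Average the reciprocal rank of the true match over all queries."""
    return sum(reciprocal_rank(r, t) for r, t in queries) / len(queries)


def recall_at_k(queries: List[Query], k: int) -> float:
    """Fraction of queries whose true match appears in the top k candidates."""
    hits = sum(1 for ranked, target in queries if target in ranked[:k])
    return hits / len(queries)


# Made-up rankings: the true match sits at rank 1, 2, and 4 respectively.
QUERIES: List[Query] = [
    (["f_arm", "g_arm", "h_arm"], "f_arm"),
    (["g_arm", "f_arm", "h_arm"], "f_arm"),
    (["g_arm", "h_arm", "k_arm", "f_arm"], "f_arm"),
]

if __name__ == "__main__":
    print("MRR:", mean_reciprocal_rank(QUERIES))
    print("Recall@1:", recall_at_k(QUERIES, 1))
    print("Recall@2:", recall_at_k(QUERIES, 2))
```

MRR rewards placing the true match near the top of the ranking, while recall at a cutoff only asks whether it appears within the first `k` results, which is why the two are often reported together.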

Key words

binary/data dependency/similarity detection/graph neural network/semantic information/vulnerability detection

Classification

Information Technology and Security Science

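Cite this article

张际灿, 姚锟彬, 薛磊, 王晨, 聂黎明. 基于数据依赖的跨架构二进制代码相似性分析 (Cross-architecture Binary Code Similarity Analysis Based on Data Dependencies) [J]. 计算机技术与发展 (Computer Technology and Development), 2024, 34(7): 62-68, 7.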

Funding

National Natural Science Foundation of China (62002306, 61972359)

Computer Technology and Development (计算机技术与发展)

OA, CSTPCD

ISSN 1673-629X
