Journal of North University of China (Natural Science Edition), 2025, Vol. 46, Issue 1: 105-115, 11. DOI: 10.62756/jnuc.issn.1673-3193.2023.10.0030
Code Defect Detection Based on Program Semantics and Metrics
Abstract
Code defects seriously affect the experience and security of software users. Traditional code defect detection methods suffer from low accuracy, while existing deep-learning-based methods detect at a coarse granularity and perform less well than desired. To address this, the paper proposes a code defect detection method based on program semantics and metrics. A point-of-interest detection algorithm for code defects is designed on top of LLVM IR; it uses SymPas, a lightweight symbolic program slicing tool, to obtain the program slices related to defect points of interest. The sliced code fragments are transformed into vector representations by a pre-trained model, and an instruction-level slicing metric, the cognitive complexity metric, is fused with them to capture the relationships and features among the sliced statements. A hybrid model, ResCNN-GRU, is then constructed and trained to fuse and learn the extracted features. The experimental results show three things. First, symbolic program slicing refines the granularity of vulnerability detection. Second, the fused semantic and metric information under the LLVM IR intermediate representation better represents the relationships and features among defective code statements. Third, the hybrid model alleviates, to a certain extent, both the time-series problem and the problem of unbalanced sample counts. Compared with several advanced methods, the proposed method reaches an accuracy of 94.1%.
Keywords
pre-trained model / program slicing / slice cognitive domain / residual network / convolutional neural network / gated neural network
Classification
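Information Technology and Security Science
Cite this article
卢跃, 嵇友晴, 周礼亮, 吕青, 张迎周. Code defect detection based on program semantics and metrics [J]. Journal of North University of China (Natural Science Edition), 2025, 46(1): 105-115, 11.
Funding
National Natural Science Foundation of China (62272214)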
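Illustrative note
The abstract describes extracting program slices around a defect point of interest in LLVM IR. The following is only a minimal sketch of that general idea, not the paper's algorithm and not SymPas. It assumes straight-line SSA code in a single basic block and follows register data dependences only, ignoring control and memory dependences. The instruction list, `defs_and_uses`, and `backward_slice` are hypothetical names introduced here for illustration.

```python
import re

# Toy LLVM-IR-like instructions (hypothetical example, not taken from the paper).
IR = [
    "%n = call i32 @read_len()",
    "%buf = alloca [16 x i8]",
    "%k = add i32 %n, 1",
    "%unused = mul i32 7, 6",
    "%p = getelementptr [16 x i8], [16 x i8]* %buf, i32 0, i32 %k",
    "store i8 0, i8* %p",  # assumed point of interest: a memory write
]

REG = re.compile(r"%[\w.]+")


def defs_and_uses(instr):
    """Return (defined register or None, set of registers the instruction uses)."""
    lhs, sep, rhs = instr.partition(" = ")
    if sep:
        return lhs.strip(), set(REG.findall(rhs))
    return None, set(REG.findall(instr))


def backward_slice(instrs, poi_index):
    """Indices of the instructions the point of interest data-depends on."""
    _, wanted = defs_and_uses(instrs[poi_index])
    kept = {poi_index}
    # Walk backwards; keep every instruction that defines a register still needed.
    for i in range(poi_index - 1, -1, -1):
        defined, used = defs_and_uses(instrs[i])
        if defined in wanted:
            kept.add(i)
            wanted.discard(defined)  # SSA: each register has a single definition
            wanted |= used
    return sorted(kept)


slice_indices = backward_slice(IR, len(IR) - 1)
for i in slice_indices:
    print(IR[i])
```

In this toy input the `%unused` multiplication does not feed the store, so it is left out of the slice. A real slicer would also need to handle branches, phi nodes, and memory aliasing.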