A Software Vulnerability Detection Method Based on Path Representations and Pretrained Models

LU Lu, WAN Tong

Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue 5: 56-65, 10. DOI: 10.12141/j.issn.1000-565X.240324

A Method for Software Vulnerability Detection via Path Representations and Pretrained Model

LU Lu (1), WAN Tong (2)

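Author information

  • 1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; Pengcheng Laboratory, Shenzhen 518000, Guangdong, China
  • 2. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China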

Abstract

Software vulnerabilities are critical weaknesses that compromise the security of computer systems: they expose systems to attacks that may lead to data breaches, system crashes, or even more severe security incidents. Accurately and efficiently detecting software vulnerabilities has therefore become a central research focus in computer security. Although contemporary deep learning-based vulnerability detection approaches have made progress, they are often limited to a single code representation and fail to fully capture the complementary nature of code semantics and structural information. This research introduces a method for software vulnerability detection, termed VDPPM (Vulnerability Detection via Path Representations and Pretrained Model), which enhances code semantic analysis and vulnerability detection accuracy. VDPPM integrates path representations extracted from the abstract syntax tree, control flow graph, and program dependency graph, and leverages SimCodeBERT, a model optimized through the contrastive learning framework SimCSE, to improve the capture of vulnerability features. In the experiments, three types of code representations are first extracted from the source code, and the path representations derived from them form a corpus for training a Doc2vec model; this yields a general-purpose embedding model that converts path sequences into vector representations. Subsequently, a pretrained CodeBERT model is integrated; after training under the contrastive learning framework, it captures deep semantic features of the code with greater precision. Finally, the vector embeddings from Doc2vec and SimCodeBERT are combined to construct high-quality code representations for vulnerability detection. Experimental results demonstrate that, across multiple publicly available benchmark datasets for vulnerability detection, VDPPM outperforms existing mainstream methods, with significant improvements in several performance metrics. This validates the effectiveness and superiority of the proposed method.
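
The three stages named in the abstract (path extraction, contrastive training, and embedding fusion) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not define what a "path" is, so root-to-leaf node-type sequences are assumed; Python's built-in `ast` module stands in for whatever parser the paper uses; the loss is the standard in-batch InfoNCE objective from SimCSE with an assumed temperature of 0.05; and fusion is assumed to be plain concatenation. The function names `ast_paths`, `simcse_loss`, and `fuse` are hypothetical, and the Doc2vec and CodeBERT models themselves are omitted (plain lists stand in for their output vectors).

```python
import ast
import math


def ast_paths(source):
    """Return root-to-leaf paths of AST node type names.

    Assumption: a "path" is a root-to-leaf sequence; the abstract
    does not specify the exact definition.
    """
    paths = []

    def walk(node, prefix):
        prefix = prefix + [type(node).__name__]
        children = list(ast.iter_child_nodes(node))
        if not children:
            paths.append(prefix)  # leaf reached: record the full path
        for child in children:
            walk(child, prefix)

    walk(ast.parse(source), [])
    return paths


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def simcse_loss(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives (SimCSE-style).

    positives[i] is the positive for anchors[i]; every other row of
    positives acts as a negative. The temperature is an assumed value.
    """
    total = 0.0
    for i, h in enumerate(anchors):
        logits = [cosine(h, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def fuse(path_vec, semantic_vec):
    """Combine two embeddings by concatenation (an assumed fusion rule)."""
    return list(path_vec) + list(semantic_vec)


if __name__ == "__main__":
    for path in ast_paths("x = 1"):
        print(" > ".join(path))
    print(fuse([0.1, 0.2], [0.3]))
```

In a fuller pipeline, each path would be joined into a token sequence and used as a training document for Doc2vec, while `simcse_loss` would be minimized over encoder outputs rather than fixed vectors. The loss is small when each anchor is closest to its own positive and large when it is closer to another row, which is the property contrastive training exploits.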

Key words

software vulnerability/vulnerability detection/path representation/pre-training/contrastive learning

Classification

Information Technology and Security Science

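Cite this article

LU Lu, WAN Tong. A Method for Software Vulnerability Detection via Path Representations and Pretrained Model [J]. Journal of South China University of Technology (Natural Science Edition), 2025, 53(5): 56-65, 10.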

Funding

Supported by the Key Field Research and Development Plan of Guangdong Province (2022B0101070001) and the Natural Science Foundation of Guangdong Province (2024A1515010204)

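Journal of South China University of Technology (Natural Science Edition)

Open access; Peking University Core Journal

ISSN 1000-565X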
