A Method for Extracting Vulnerable Entities in Small Sample Semantic Analysis (小样本语义分析的漏洞实体抽取方法)

Journal of Information Security Research (信息安全研究), 2025, Vol. 11, Issue 3: 265-274, 10. DOI: 10.12379/j.issn.2096-1057.2025.03.09

A Method for Extracting Vulnerable Entities in Small Sample Semantic Analysis

Ding Quan (丁全) 1, Zhang Lei (张磊) 2, Huang Shuai (黄帅) 3, Zha Zhengpeng (查正朋) 3, Tao Tao (陶陶) 4

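Author information

  • 1. Electric Power Research Institute, State Grid Anhui Electric Power Co., Ltd., Hefei 230601
  • 2. School of Information Science and Technology, University of Science and Technology of China, Hefei 230026
  • 3. Institute of Advanced Technology, University of Science and Technology of China, Hefei 230031
  • 4. School of Computer Science and Technology, Anhui University of Technology, Ma'anshan, Anhui 243032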

Abstract

At present, information security vulnerability databases follow different standards, emphasize different aspects of vulnerability data, and remain largely independent of one another. This makes it difficult to obtain high-value vulnerability information quickly and comprehensively, so a unified vulnerability entity standard is needed. This paper therefore studies entity extraction for vulnerability data. Most vulnerability data is unstructured natural language that mixes Chinese and English. Rule-based methods generalize poorly, while deep-learning-based methods consume substantial resources and depend on large amounts of annotated data. To address these issues, this paper presents a vulnerability entity extraction method based on small-sample semantic analysis. The method pre-trains BERT on vulnerability data to obtain a pre-trained model for the cybersecurity vulnerability domain, which improves understanding of cybersecurity vulnerability data and reduces reliance on large annotated datasets. In addition, a self-supervised incremental learning approach improves model performance with very limited annotated data (1785 samples). The model extracts 12 types of cybersecurity vulnerability entities, and experimental results show that it outperforms other models in recognizing and extracting these entities, with an F1 value of 0.8643.
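
The abstract reports an F1 value but does not describe the evaluation protocol. As a rough illustration only, the following Python sketch shows one common way to score entity extraction: exact-match, entity-level precision, recall, and F1 over BIO-tagged sequences. The BIO scheme, the function names `bio_to_spans` and `entity_f1`, and the labels `PRODUCT` and `VERSION` are assumptions made for this example; they are not taken from the paper, whose 12 entity types are not listed here.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing sentinel closes an open entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 with exact span and type match."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g_spans, p_spans = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g_spans & p_spans)
        n_gold += len(g_spans)
        n_pred += len(p_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    gold = [["B-PRODUCT", "I-PRODUCT", "O", "B-VERSION"]]
    pred = [["B-PRODUCT", "I-PRODUCT", "O", "O"]]
    print(entity_f1(gold, pred))
```

In this toy case the prediction recovers one of the two gold entities and makes no spurious prediction, so precision is 1.0 and recall is 0.5. A token-level or partial-match score would give different numbers, so the choice of matching rule matters when comparing reported F1 values.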

Key words

small sample/semantic analysis/vulnerability entity extraction/BERT/CRF

Classification

Computer Science and Automation

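Cite this article

Ding Quan, Zhang Lei, Huang Shuai, Zha Zhengpeng, Tao Tao. A Method for Extracting Vulnerable Entities in Small Sample Semantic Analysis [J]. Journal of Information Security Research (信息安全研究), 2025, 11(3): 265-274, 10.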

Funding

University Synergy Innovation Program of Anhui Province (安徽省高校协同创新项目) (GXXT-2023-021)

Journal of Information Security Research (信息安全研究)

Open access (OA); Peking University Core Journal (北大核心)

ISSN 2096-1057
