An Efficient Hard-Label Adversarial Attack Method Against Text Classification Models

Journal of University of Electronic Science and Technology of China (电子科技大学学报), 2026, Vol. 55, Issue (1): 116-128, 13. DOI: 10.12178/1001-0548.2024295

Efficient hard-label adversarial attacks against natural language processing models

邱士林¹, 刘启和¹, 周世杰¹, 曾逸¹

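Author information

1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054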

Abstract

Verifying the robustness of natural language processing models against adversarial attacks is necessary in real-world application scenarios, so black-box adversarial attack techniques under the hard-label setting have attracted increasing attention. However, because textual data is discrete, the victim model returns limited feedback, and practical applications constrain the number of queries, existing hard-label adversarial attack methods usually issue excessive queries to the victim model and generate adversarial texts with low semantic consistency, which makes them inadequate for real-world use. To address this, an efficient hard-label adversarial attack method is proposed. The method introduces an attention mechanism in the adversarial text initialization stage, and proposes two strategies for the adversarial text semantic optimization stage: a semantic clustering-based synonym search and a semantic gradient-based dynamic expansion synonym search. Experimental results demonstrate that the proposed method efficiently generates high-quality adversarial text with high semantic consistency and natural fluency using a small number of queries.
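To make the hard-label setting concrete, the following is a minimal sketch of a generic two-stage attack loop: an initialization stage that substitutes synonyms until the predicted label flips, followed by an optimization stage that restores original words wherever the text remains adversarial. It is not the method proposed in the paper; the attention-based initialization and the clustering-based and gradient-based synonym searches are not reproduced here. The victim classifier `toy_sentiment`, the `SYNONYMS` table, and the `HardLabelOracle` wrapper are hypothetical names introduced purely for illustration.

```python
from typing import Callable, Dict, List, Optional


class HardLabelOracle:
    """Exposes only the predicted label of a classifier and counts queries."""

    def __init__(self, predict: Callable[[List[str]], int]) -> None:
        self._predict = predict
        self.queries = 0

    def label(self, tokens: List[str]) -> int:
        self.queries += 1
        return self._predict(tokens)


def toy_sentiment(tokens: List[str]) -> int:
    """Hypothetical victim: returns 1 if positive cue words outnumber negative ones."""
    positive = {"good", "great", "enjoyable"}
    negative = {"bad", "dull"}
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return 1 if score > 0 else 0


# Hypothetical synonym table; a real attack would use an embedding or lexical resource.
SYNONYMS: Dict[str, List[str]] = {
    "good": ["fine", "decent"],
    "great": ["notable", "grand"],
    "movie": ["film"],
}


def hard_label_attack(
    tokens: List[str],
    oracle: HardLabelOracle,
    synonyms: Dict[str, List[str]],
) -> Optional[List[str]]:
    """Return an adversarial token list, or None if no label flip is found."""
    original_label = oracle.label(tokens)
    adv = list(tokens)

    # Stage 1 (initialization): substitute synonyms left to right until the label flips.
    flipped = False
    for i, tok in enumerate(tokens):
        if tok in synonyms:
            adv[i] = synonyms[tok][0]
            if oracle.label(adv) != original_label:
                flipped = True
                break
    if not flipped:
        return None

    # Stage 2 (optimization): restore each original word if the text stays adversarial,
    # which keeps the result closer to the original text.
    for i, tok in enumerate(tokens):
        if adv[i] != tok:
            candidate = list(adv)
            candidate[i] = tok
            if oracle.label(candidate) != original_label:
                adv = candidate
    return adv


if __name__ == "__main__":
    oracle = HardLabelOracle(toy_sentiment)
    sentence = "a good movie with great acting".split()
    print(hard_label_attack(sentence, oracle, SYNONYMS), oracle.queries)
```

Every call to `oracle.label` costs one query, so the counter makes the query budget explicit. Reducing that count while keeping the adversarial text semantically close to the original is the trade-off the abstract describes.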

Keywords

adversarial attack; adversarial example; robustness; natural language processing; artificial intelligence

Classification

Information Technology and Security Science

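Cite this article

邱士林, 刘启和, 周世杰, 曾逸. Efficient hard-label adversarial attacks against natural language processing models [J]. Journal of University of Electronic Science and Technology of China, 2026, 55(1): 116-128, 13.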

Funding

National Natural Science Foundation of China (62272089)

Journal of University of Electronic Science and Technology of China

ISSN: 1001-0548
