Journal of Nanjing University (Natural Science), 2024, Vol. 60, Issue 6: 900-907 (8 pages). DOI: 10.13232/j.cnki.jnju.2024.06.002
Text adversarial attack capability enhancement method based on automatic dilution
Abstract
Training on adversarial examples enhances the robustness of deep neural networks, so improving the success rate of adversarial attacks is an important goal in adversarial example research. Diluting original samples can bring them closer to the decision boundary of the model, thereby increasing the success rate of adversarial attacks. However, existing dilution algorithms rely on manually generated dilution pools and dilute only a single kind of target. This paper proposes a method to enhance the capability of text adversarial attacks based on automatic dilution, called the Automatic Multi-positional Dilution Preprocessing (AMDP) algorithm. AMDP eliminates the reliance on manual assistance in the dilution process and generates different dilution pools for different datasets and target models. Additionally, AMDP extends the set of words targeted for dilution, broadening the search space of dilution operations. As an input transformation method, AMDP can be combined with other adversarial attack algorithms to further enhance attack performance. Experimental results demonstrate that AMDP increases the success rate by approximately 10% on average on BERT, WordCNN, and WordLSTM classification models, while reducing the average modification rate of original samples and the average number of accesses to the target model.
Keywords
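adversarial machine learning / adversarial samples / text dilution / classification boundaries / natural language processing
Classification
Astronomy and Earth Sciences
Cite this article
Fang Yushen (房钰深), Chen Zhenhua (陈振华), He Kun (何琨). Text adversarial attack capability enhancement method based on automatic dilution [J]. Journal of Nanjing University (Natural Science), 2024, 60(6): 900-907.
Funding
National Natural Science Foundation of China (62076105, U22B2017)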
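Illustrative sketch

The abstract describes dilution only at a high level, so the following is a minimal toy sketch of the general idea rather than the AMDP algorithm itself. It assumes a hypothetical lexicon-based classifier (`WEIGHTS`, `score`, `predict`, `margin`) as a stand-in for a target model, builds a dilution pool automatically by querying that model (`build_pool`), and then inserts pool words at any position as long as the predicted label is preserved and the sample moves closer to the decision boundary (`dilute`). All names, the scoring rule, and the greedy search are assumptions made for illustration and do not come from the paper.

```python
from typing import List

# Hypothetical stand-in for a target model: a tiny sentiment lexicon.
WEIGHTS = {"great": 2.0, "good": 1.0, "fine": 0.3, "okay": 0.2,
           "dull": -0.4, "bad": -1.0, "awful": -2.0}

def score(tokens: List[str]) -> float:
    """Signed model score; the decision boundary is at 0."""
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)

def predict(tokens: List[str]) -> int:
    """1 = positive, 0 = negative."""
    return 1 if score(tokens) > 0 else 0

def margin(tokens: List[str]) -> float:
    """Distance of the sample from the decision boundary."""
    return abs(score(tokens))

def build_pool(vocab: List[str], samples: List[List[str]], k: int = 2) -> List[str]:
    """Automatically pick the k words that, on average, shrink the margin
    the most without flipping the model's prediction (no manual pool)."""
    gains = []
    for word in vocab:
        total = 0.0
        for s in samples:
            diluted = s + [word]
            if predict(diluted) == predict(s):  # label must be preserved
                total += margin(s) - margin(diluted)
        gains.append((total / len(samples), word))
    gains.sort(reverse=True)
    return [word for gain, word in gains[:k] if gain > 0]

def dilute(tokens: List[str], pool: List[str], max_inserts: int = 2) -> List[str]:
    """Greedily insert pool words at any position while the label is kept
    and the margin strictly decreases. Position is irrelevant for this
    bag-of-words toy model, but would matter for a real classifier."""
    label = predict(tokens)
    current = list(tokens)
    for _ in range(max_inserts):
        best = None
        for pos in range(len(current) + 1):
            for word in pool:
                cand = current[:pos] + [word] + current[pos:]
                if predict(cand) != label:
                    continue
                if best is None or margin(cand) < margin(best):
                    best = cand
        if best is None or margin(best) >= margin(current):
            break  # no label-preserving insert brings the sample closer
        current = best
    return current

# Usage: build a pool from positive samples, then dilute one of them.
samples = [["a", "great", "movie"], ["good", "plot"]]
pool = build_pool(list(WEIGHTS), samples, k=2)
original = samples[0]
diluted = dilute(original, pool)
```

In this sketch the diluted sample keeps its original label but sits closer to the boundary, so a downstream attack (word substitution, for example) would in principle need fewer changes to flip the prediction. A real setting would replace the toy scorer with queries to the actual target model and would need to check that inserted words preserve the meaning of the text.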