信息安全研究 2025, Vol. 11, Issue (1): 21-27, 7. DOI: 10.12379/j.issn.2096-1057.2025.01.04
基于可解释性的不可见后门攻击研究
Research of Invisible Backdoor Attack Based on Interpretability
Abstract
Deep learning has achieved remarkable success on a variety of critical tasks. However, recent work has shown that deep neural networks are vulnerable to backdoor attacks, in which attackers release backdoored models that behave normally on benign samples but misclassify any sample stamped with the trigger into the target label. Unlike adversarial examples, backdoor attacks are mainly implemented in the model training phase, perturbing samples with triggers and injecting the backdoor into the model. This paper proposes an invisible backdoor attack based on interpretability algorithms. Different from existing works that set the trigger mask arbitrarily, this paper carefully designs an interpretability-based method for determining the trigger mask and adopts random pixel perturbation as the trigger style, so that the triggered samples look natural, evade inspection by the human eye, and bypass defense strategies against backdoor attacks. Extensive comparative experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets demonstrate the effectiveness and superiority of the attack. The SSIM index is also used to measure the similarity between the backdoor samples designed in this paper and the corresponding benign samples, yielding scores close to 0.99, which shows that the generated backdoor samples are not identifiable under visual inspection. Finally, this paper also shows that the proposed attack can evade existing backdoor defense methods.
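As a concrete illustration of the pipeline the abstract describes, the following is a minimal sketch, not the authors' exact method: it uses plain gradient saliency as a stand-in for the paper's interpretability algorithm to pick the trigger mask, stamps a fixed random pixel perturbation inside that mask, and checks near-unity SSIM against the benign sample. The function names (`saliency_mask`, `apply_trigger`) and parameter choices (`k`, `eps`) are hypothetical; PyTorch and scikit-image are assumed.

```python
# Sketch of interpretability-guided trigger placement (illustrative only):
# gradient saliency selects the mask region, a bounded random pixel
# perturbation forms the trigger, and SSIM verifies that the poisoned
# sample stays visually indistinguishable from the benign one.
import torch
import torch.nn.functional as F
from skimage.metrics import structural_similarity as ssim

def saliency_mask(model, x, y, k=64):
    """Return a {0,1} mask over the k most salient pixels of x (1,C,H,W)."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    sal = x.grad.abs().sum(dim=1, keepdim=True)   # per-pixel saliency (1,1,H,W)
    thresh = sal.flatten().topk(k).values.min()   # keep the top-k pixels
    return (sal >= thresh).float()                # broadcasts over channels

def apply_trigger(x, mask, eps=8 / 255, seed=0):
    """Stamp a fixed random pixel perturbation inside the mask region."""
    g = torch.Generator().manual_seed(seed)       # fixed seed = fixed trigger style
    delta = (torch.rand(x.shape, generator=g) * 2 - 1) * eps
    return (x + mask * delta).clamp(0, 1)

# Usage: poison one sample and measure SSIM, as evaluated in the paper.
# model, x (1,3,32,32 in [0,1]), and label y are assumed to be given.
# mask = saliency_mask(model, x, y)
# x_bd = apply_trigger(x, mask)
# score = ssim(x[0].permute(1, 2, 0).numpy(),
#              x_bd[0].permute(1, 2, 0).numpy(),
#              channel_axis=2, data_range=1.0)
```

In an actual attack, the poisoned pairs (the triggered sample together with the target label) would be mixed into the training set; that is the training-phase backdoor injection the abstract refers to.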
Keywords: deep learning; deep neural network; backdoor attack; trigger; interpretability; backdoor sample
Classification: Computer and Automation
Citation: 郑嘉熙, 陈伟, 尹萍, 张怡婷. 基于可解释性的不可见后门攻击研究[J]. 信息安全研究, 2025, 11(1): 21-27, 7.
Funding: Jiangsu Provincial Key Research and Development Program (BE2022065-5); Jiangsu Key Laboratory of Network and Information Security Project (BM2003201)