Adversarial Attack Algorithm Based on Multi-model Scheduling Optimization
Adversarial examples can be generated either with a single model or with a model ensemble, and ensemble-generated examples usually achieve higher attack success rates. However, ensemble attacks remain under-studied: most existing ensemble methods use all models simultaneously in every iteration and do not properly account for the differences between models, which lowers the success rate of the resulting adversarial examples. To further improve the success rate of ensemble attacks, this paper proposes an adversarial attack algorithm based on multi-model scheduling optimization. First, models are scheduled by computing the differences between their loss gradients, and in each iteration the optimal model combination is selected for the ensemble attack to obtain the optimal gradient. Second, the momentum term from the previous stage is used to update the current data point, and the optimized gradient is computed on the updated point with the model combination of the current stage. The optimized gradient is then combined with the transformed gradient to adjust the final gradient direction. Extensive experiments on the ImageNet dataset show that the proposed ensemble algorithm achieves a higher black-box attack success rate with smaller perturbations. Compared with the mainstream full-model ensemble method, the average black-box success rates against normally trained models and adversarially trained models are improved by 3.4% and 12%, respectively, and the generated adversarial examples have better visual quality.
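A minimal PyTorch sketch of the attack loop described above: per-iteration model scheduling based on loss-gradient differences, a look-ahead point built from the previous momentum, and a momentum-based sign update. This is not the authors' code; the concrete selection rule, step sizes, and all function names are assumptions for illustration.

```python
# Sketch of a scheduled ensemble attack (assumed details, not the paper's implementation).
import torch
import torch.nn.functional as F


def loss_gradient(model, x, y):
    """Input gradient of the cross-entropy loss for one model."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    return torch.autograd.grad(loss, x)[0]


def schedule_models(models, x, y, k):
    """Keep the k models whose loss gradients deviate least from the
    ensemble-average gradient (one plausible reading of 'scheduling by
    loss-gradient difference')."""
    grads = [loss_gradient(m, x, y) for m in models]
    mean_g = torch.stack(grads).mean(dim=0)
    order = sorted(range(len(models)),
                   key=lambda i: (grads[i] - mean_g).norm().item())
    return [models[i] for i in order[:k]]


def scheduled_ensemble_attack(models, x, y, eps=16 / 255, steps=10, mu=1.0, k=3):
    """Untargeted L_inf attack on a batch x (in [0,1]) with labels y."""
    alpha = eps / steps
    x_adv = x.clone().detach()
    momentum = torch.zeros_like(x)
    for _ in range(steps):
        subset = schedule_models(models, x_adv, y, k)       # model scheduling
        x_nes = x_adv + alpha * mu * momentum               # look-ahead with previous momentum
        x_nes = x_nes.clone().detach().requires_grad_(True)
        loss = sum(F.cross_entropy(m(x_nes), y) for m in subset) / len(subset)
        g = torch.autograd.grad(loss, x_nes)[0]             # gradient at the look-ahead point
        g = g / g.abs().mean(dim=(1, 2, 3), keepdim=True)   # L1-normalize per image
        momentum = mu * momentum + g                        # accumulate momentum
        x_adv = x_adv + alpha * momentum.sign()             # sign step
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1).detach()
    return x_adv
```

With pretrained ImageNet classifiers wrapped to include their own input normalization, `scheduled_ensemble_attack(models, images, labels)` returns the adversarial batch; the transformed-gradient term mentioned in the abstract (e.g., gradients from input transformations) would be folded in where `g` is computed.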
Wang Yong; Liu Yi
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
Computer and Automation
adversarial examples; neural network; deep learning; black-box attack; model ensemble
《信息安全研究》 2024 (005)
403-410 / 8
Key-Area Research and Development Program of Guangdong Province (2021B0101200002)