网络与信息安全学报 (Chinese Journal of Network and Information Security), 2025, Vol. 11, Issue 5: 50-67, 18. DOI: 10.11959/j.issn.2096-109x.2025060
Membership inference attack based on imitation learning for deep reinforcement learning
Abstract
Deep reinforcement learning (DRL), owing to its remarkable performance advantages, has been widely applied in fields such as game AI, robotic control, and autonomous driving. The technology typically relies on large amounts of training data, and the agent improves its decision-making ability through continual trial and error. Previous studies have indicated that DRL faces privacy risks from potential training data leakage; however, existing attack methods are generally more effective against overfitted agents. To address this limitation, a trajectory-informed privacy attack strategy (TIPAS) based on imitation learning is proposed for DRL-trained agents with strong generalization ability. The method analyzes the agent's behavioral trajectories and constructs a shadow agent model using behavior cloning. The behavioral policy features generated by the shadow agent in multiple environments are then integrated through ensemble learning to train an attack model, which enables accurate inference of private information from the target agent under black-box access to the training policy. Experiments were conducted across four network topology scenarios and four algorithmic models. The proposed method was compared against two baselines: a support vector machine using reward mean and variance features, and a single multilayer perceptron model without ensemble learning. The comparison of inference accuracy demonstrates the superiority of the proposed approach in attack performance. Experimental results show that the proposed method achieves excellent attack performance across four classic game environments on the Gym platform, even under defenses that employ differential privacy and regularization techniques.
Keywords
deep reinforcement learning / imitation learning / membership inference attack / privacy stealing attack
Classification
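Computer science and automation
Cite this article
CHEN Jinyin, QU Kangyun, ZHENG Haibin. Membership inference attack based on imitation learning for deep reinforcement learning[J]. Chinese Journal of Network and Information Security, 2025, 11(5): 50-67, 18.
Funding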
The National Natural Science Foundation of China (62072406, 62406286)
The Natural Science Foundation of Zhejiang Province (LDQ23F020001)
Zhejiang Provincial Key R&D Program (2022C01018)
The National Key R&D Program of China (2018AAA0100801)