Optimal Secure Tracking Control in Multi-UAVs Based on Online Reinforcement Learning
In unmanned aerial vehicle (UAV) formation tracking missions, false data injection (FDI) attackers can inject misleading data into the control commands, preventing the UAVs from forming the specified formation configuration, so a secure formation tracking controller is needed. To this end, the attack-defense process is modeled as a zero-sum graphical game in which the FDI attacker and the secure controller are the players: the attacker aims to maximize a given cost function, while the secure controller aims to minimize it. Solving the game and obtaining the optimal secure control policy rely on solving the Hamilton-Jacobi-Isaacs (HJI) equation. Because the HJI equation is a coupled partial differential equation that is difficult to solve directly, a finite-time convergent online reinforcement learning algorithm combined with an experience replay mechanism is introduced, and a critic-only neural network is designed to approximate the value function and obtain the optimal secure control policy. Finally, a numerical simulation verifies the effectiveness of the proposed algorithm.
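As a rough illustration of the critic-only idea, the sketch below fits a single critic weight by minimizing the HJI residual over a fixed replay buffer. It is not the paper's algorithm: it assumes a hypothetical scalar single-agent plant x' = a*x + b*u + d*w with a quadratic value guess V(x) = p*x^2, and it uses plain gradient descent rather than a finite-time convergent update law. All numerical values and function names are illustrative assumptions.

```python
import math

# Hypothetical scalar plant: x' = A*x + B*u + D*w
# (u: secure control input, w: injected attack signal). Values are assumptions.
A, B, D = -1.0, 1.0, 0.5
Q, R, GAMMA = 1.0, 1.0, 2.0  # running cost: Q*x^2 + R*u^2 - GAMMA^2 * w^2


def hji_residual(p, x):
    """HJI residual for V(x) = p*x^2 with the saddle-point policies substituted."""
    grad_v = 2.0 * p * x
    u = -B * grad_v / (2.0 * R)           # minimizing player (controller)
    w = D * grad_v / (2.0 * GAMMA ** 2)   # maximizing player (attacker)
    running_cost = Q * x * x + R * u * u - GAMMA ** 2 * w * w
    return running_cost + grad_v * (A * x + B * u + D * w)


def residual_grad(p, x):
    """Derivative of the HJI residual with respect to the critic weight p."""
    c = B * B / R - D * D / GAMMA ** 2
    return x * x * (2.0 * A - 2.0 * c * p)


def train_critic(buffer, p0=0.0, lr=0.5, steps=200):
    """Gradient descent on the mean squared HJI residual over replayed states."""
    p = p0
    for _ in range(steps):
        grad = sum(hji_residual(p, x) * residual_grad(p, x) for x in buffer)
        p -= lr * grad / len(buffer)
    return p


def riccati_solution():
    """Positive root of the scalar game Riccati equation, used as a reference."""
    c = B * B / R - D * D / GAMMA ** 2
    return (A + math.sqrt(A * A + c * Q)) / c


# Replay buffer: a fixed grid of previously "visited" states in [-1, 1].
replay_buffer = [-1.0 + 0.1 * k for k in range(21)]
p_learned = train_critic(replay_buffer)
p_exact = riccati_solution()
```

In this toy setting the learned weight can be compared against the closed-form Riccati root; the multi-UAV graphical game in the paper couples such equations across neighboring agents, which this sketch does not attempt to capture.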
GONG Zhenyu (弓镇宇); YANG Feisheng (杨飞生)
Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
FDI attack; multi-UAVs; online reinforcement learning; optimal control; zero-sum graphical game
Aeronautical Science & Technology (《航空科学技术》), 2024 (004)
Pages 25-30 (6 pages)
National Natural Science Foundation of China (62073269); Aeronautical Science Foundation of China (2020Z034053002); Key Research and Development Program of Shaanxi (2022GY-244); Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX0963); Guangdong Basic and Applied Basic Research Foundation (2023A1515011220)