上海航天(中英文), 2024, Vol. 41, Issue 6: 39-45, 7. DOI: 10.19328/j.cnki.2096-8655.2024.06.005
跨域拦截弹数据驱动的在线强化学习姿态控制方法
Data-driven Online Reinforcement Learning Attitude Control Method for Cross-domain Interceptors
Abstract
Dynamic modeling is difficult for cross-domain interceptors that fly over wide speed ranges and large spatial domains, and no suitable models are available. To address this problem, a data-driven online reinforcement learning attitude control method is proposed. First, inspired by the zero-sum game, the disturbance is treated as part of the system input when designing the performance index function: the actual control input of the interceptor aims to minimize the performance index function and improve system performance, while the disturbance has the opposite effect. Then, a critic network is constructed and the corresponding approximate solution is obtained through online learning, with uncertainty handled by dynamically updating the network weights. Unlike traditional model-based online reinforcement learning methods, the data-driven reinforcement learning (RL) method does not require the dynamic model of the interceptor system; it uses only the input and output data of the system to drive the online learning and weight updates of the network. Finally, the effectiveness of the proposed method is verified by simulation.
Keywords
cross-domain interceptor / online reinforcement learning / data-driven / zero-sum game / uncertainty
Classification
Aerospace
Cite this article
赵琛钰, 胥彪, 宋勋, 赵启伦, 李爽. 跨域拦截弹数据驱动的在线强化学习姿态控制方法[J]. 上海航天(中英文), 2024, 41(6): 39-45, 7.
Funding
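Supported by the fund of the Key Laboratory of Space Photoelectric Detection and Perception, Ministry of Industry and Information Technology (NJ2022025-05)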
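Illustrative note: to make the zero-sum formulation summarized in the abstract concrete, the following is a generic sketch of such a performance index. It is not the paper's own equation. All symbols are assumptions introduced here for illustration: x is the attitude tracking state, u the control input, d the disturbance, Q and R are positive semidefinite and positive definite weighting matrices, and γ > 0 is a disturbance attenuation level.

```latex
J(x_0, u, d) = \int_0^{\infty} \left( x^\top Q x + u^\top R u - \gamma^2 d^\top d \right) \mathrm{d}t,
\qquad
V^*(x_0) = \min_{u} \max_{d} J(x_0, u, d)
```

In this form the control u lowers J while the disturbance d raises it, which matches the opposing roles described in the abstract. A critic network would then approximate V^* online from measured input and output data.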