Abstract
Hypersonic vehicles must simultaneously satisfy physical constraints such as heat flux, dynamic pressure, and overload limits, as well as mission constraints including no-fly zone avoidance and interceptor evasion. Conventional trajectory planning methods struggle to plan and make decisions in real time, within millisecond-level timeframes, under such multi-constraint conditions. To overcome these challenges, this paper introduces a dynamic trajectory planning method based on the twin delayed deep deterministic policy gradient (TD3) algorithm. The reentry trajectory planning task is modeled as a Markov decision process in which the state space, the action space, and a composite reward function incorporating multiple constraints guide the agent to autonomously learn an optimal control policy. The reward function balances path constraints (e.g., heat flux, dynamic pressure, and overload) against mission objectives (e.g., no-fly zone avoidance and interceptor evasion). Simulation results show that the proposed approach satisfies all path constraints while accurately reaching the target position, significantly enhancing both the robustness and the autonomous decision-making capability of the reentry vehicle. Compared with the soft actor-critic (SAC) algorithm, TD3 produces smoother control commands, trains more efficiently, and converges to a policy more stably, offering a practical and effective solution for intelligent guidance of hypersonic vehicles in complex interception scenarios.
Key words
hypersonic vehicle/reentry/trajectory planning/deep reinforcement learning/TD3 algorithm/multi-constraint optimization/penetration
Classification
Military science and technology
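For readers unfamiliar with the TD3 algorithm named in the abstract, the following is a minimal sketch of its learning target for a single transition: clipped double-Q learning combined with target policy smoothing. It is not the authors' implementation. The function name `td3_target`, its parameters, and the default values (common TD3 defaults, not taken from this paper) are assumptions made for illustration, and the critics are passed in as plain callables.

```python
import random


def td3_target(reward, done, next_state, next_action, q1_target, q2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Return the TD3 learning target for one transition (illustrative sketch).

    next_action is the target actor's output at next_state (a list of floats).
    q1_target and q2_target are hypothetical callables (state, action) -> float
    standing in for the two target critics.
    """
    smoothed = []
    for a in next_action:
        # Target policy smoothing: add clipped Gaussian noise to the action.
        noise = rng.gauss(0.0, noise_std)
        noise = max(-noise_clip, min(noise_clip, noise))
        # Keep the perturbed action inside the valid action range.
        smoothed.append(max(action_low, min(action_high, a + noise)))

    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = min(q1_target(next_state, smoothed),
                q2_target(next_state, smoothed))

    # No bootstrapping from terminal transitions.
    return reward + gamma * (1.0 - float(done)) * q_min
```

Taking the minimum of two critics counteracts value overestimation, and the smoothing noise discourages the policy from exploiting sharp peaks in the value estimate. Both are standard TD3 mechanisms and are plausible contributors to the smoother commands and more stable convergence the abstract reports, although the abstract itself does not attribute the results to them.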