
Dynamic Trajectory Planning for Hypersonic Reentry Based on TD3

Zhang Han (张瀚) [1], Wang Zhengqiang (王正强) [2], Wang Lifeng (王立峰) [2], Wang Dehao (王德昊) [1], Zhang Xun (张勋) [1]

弹道学报 (Journal of Ballistics), 2025, Vol. 37, Issue 4: 48-56, 9. DOI: 10.12115/ddxb.2025.10013

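Author information

  • 1. College of Intelligent Systems Science and Engineering, Harbin Engineering University, Qingdao, Shandong 266500
  • 2. Yunnan Key Laboratory of Infrared Detection Technology Application, Kunming, Yunnan 650217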

Abstract

Hypersonic vehicles must simultaneously satisfy physical constraints such as heat flux, dynamic pressure, and overload limits, as well as mission constraints including no-fly zone avoidance and interceptor evasion. Conventional trajectory planning methods struggle to plan and make decisions in real time, within millisecond-level timeframes, under such multi-constraint conditions. To overcome these challenges, this paper introduces a dynamic trajectory planning method based on the twin delayed deep deterministic policy gradient (TD3) algorithm. The reentry trajectory planning task is modeled as a Markov decision process, in which the state space, the action space, and a composite reward function incorporating multiple constraints guide the agent to autonomously learn an optimal control policy. The reward function balances path constraints (e.g., heat flux, dynamic pressure, and overload) against mission objectives (e.g., no-fly zone avoidance and interceptor evasion). Simulation results show that the proposed approach satisfies all path constraints while accurately reaching the target position, significantly enhancing both the robustness and the autonomous decision-making capability of the reentry vehicle. Compared with the soft actor-critic algorithm, TD3 produces smoother control commands, trains more efficiently, and converges to a more stable policy, offering a practical and effective solution for intelligent guidance of hypersonic vehicles in complex interception scenarios.
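To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The first function shows one possible shape for a composite reward that penalizes path-constraint violations and no-fly zone intrusion; its name, the `limits` and `weights` dictionaries, and the linear penalty form are all hypothetical, since the abstract does not give the actual reward terms. The remaining two functions show the standard TD3 target computation (target policy smoothing and the clipped double-Q minimum), with commonly used default hyperparameters assumed.

```python
import numpy as np


def composite_reward(heat_flux, dyn_pressure, overload,
                     dist_to_target, dist_to_nfz, limits, weights):
    """Hypothetical composite reward: progress term minus constraint penalties."""
    # Ratio of each path quantity to its limit; only the excess above 1 is penalized.
    ratios = np.array([heat_flux, dyn_pressure, overload]) / np.array(
        [limits["heat_flux"], limits["dyn_pressure"], limits["overload"]])
    path_penalty = float(np.sum(np.maximum(ratios - 1.0, 0.0)))
    # Normalized intrusion depth into an assumed circular no-fly zone (0 outside it).
    nfz_penalty = max(limits["nfz_radius"] - dist_to_nfz, 0.0) / limits["nfz_radius"]
    return (-weights["target"] * dist_to_target
            - weights["path"] * path_penalty
            - weights["nfz"] * nfz_penalty)


def smoothed_target_action(target_action, rng, sigma=0.2,
                           noise_clip=0.5, max_action=1.0):
    """TD3 target policy smoothing: add clipped Gaussian noise, then clip to bounds."""
    noise = rng.normal(0.0, sigma, size=np.shape(target_action))
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(target_action + noise, -max_action, max_action)


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3 critic target: bootstrap from the smaller of the two target critics."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    limits = {"heat_flux": 1000.0, "dyn_pressure": 100.0,
              "overload": 5.0, "nfz_radius": 50.0}
    weights = {"target": 1.0, "path": 10.0, "nfz": 5.0}
    r = composite_reward(2000.0, 50.0, 2.0, 3.0, 100.0, limits, weights)
    a_next = smoothed_target_action(np.zeros(2), np.random.default_rng(0))
    print(r, a_next, td3_target(r, 0.0, q1_next=2.0, q2_next=3.0))
```

In a full TD3 agent, `q1_next` and `q2_next` would come from the two target critics evaluated at the smoothed action, and the actor would be updated less often than the critics (the "delayed" part of the algorithm).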

Keywords

hypersonic vehicle / reentry / trajectory planning / deep reinforcement learning / TD3 algorithm / multi-constraint optimization / penetration

Classification

Military science and technology

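Cite this article

Zhang Han, Wang Zhengqiang, Wang Lifeng, Wang Dehao, Zhang Xun. Dynamic Trajectory Planning for Hypersonic Reentry Based on TD3 [J]. 弹道学报 (Journal of Ballistics), 2025, 37(4): 48-56, 9.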

弹道学报 (Journal of Ballistics): open access, Peking University core journal list, ISSN 1004-499X
