
基于MADDPG的再入飞行器协同制导方法

王嘉磊 郭建国

弹道学报, 2025, Vol.37, Issue(4): 30-37,47,9. DOI: 10.12115/ddxb.2025.10006


Cooperative Guidance Method for Reentry Vehicles Based on MADDPG

王嘉磊¹, 郭建国¹

Author information

  • 1. School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China


Abstract

Cooperative guidance for multiple vehicles during the near-space reentry phase is challenged by strong aerodynamic coupling, pronounced nonlinear dynamics, and stringent mission and threat constraints. Traditional guidance methods, which typically rely on analytical formulations or single-agent optimization strategies, exhibit limitations in real-time decision-making, constraint handling, and cooperative capability, making them insufficient for future high-dynamic swarm engagement scenarios. To address these issues, a master-slave cooperative guidance framework based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm was proposed in this paper. Firstly, a relative dynamics model between the master and slave vehicles was constructed in the line-of-sight (LOS) coordinate system, providing a theoretical foundation for modeling cooperative formation control among multiple vehicles. Secondly, to enhance policy learning under multi-constraint conditions, a composite reward function was designed using the LOS rate, relative distance-keeping error, and formation deviation as core indicators. A radar-threat penalty term was incorporated to achieve a unified representation of formation maintenance, terminal mission requirements, and threat avoidance. Furthermore, a centralized training-decentralized execution paradigm was adopted, in which a residual network architecture was incorporated to facilitate policy learning and training for the master-slave vehicles, thereby enabling effective learning of cooperative strategies and achieving multi-vehicle coordinated control. Simulation results demonstrate that the proposed method significantly outperforms traditional guidance strategies in control accuracy, stability, and computational efficiency. The learned policies maintain reliable formation-following of the slave vehicles with respect to the master vehicle under highly dynamic reentry conditions, substantially reducing relative distance errors and line-of-sight jitter while effectively avoiding radar threat zones. In summary, the proposed approach provides a scalable, intelligent, and highly reliable solution for cooperative guidance of multiple vehicles in near-space reentry missions, enhancing the overall stability and decision-making capability of multi-vehicle coordination.
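The composite reward described above — LOS rate, relative distance-keeping error, and formation deviation as core indicators, plus a radar-threat penalty — can be sketched as a weighted sum. The weights, the spherical radar-threat model, and the function name below are illustrative assumptions, not the formulation or values used in the paper.

```python
import numpy as np

def composite_reward(los_rate, dist_err, form_dev, pos, radar_centers,
                     radar_radius, w_los=1.0, w_dist=1.0, w_form=0.5,
                     w_threat=10.0):
    """Illustrative composite reward; weights and threat model are assumptions.

    los_rate:      scalar line-of-sight rate between master and slave
    dist_err:      relative distance-keeping error from the desired range
    form_dev:      3-vector deviation from the assigned formation slot
    pos:           slave vehicle position (3-vector)
    radar_centers: list of radar site positions (3-vectors)
    radar_radius:  common detection radius assumed for every radar site
    """
    # Shaping terms: penalize LOS jitter, range error, and formation deviation.
    r = -(w_los * abs(los_rate)
          + w_dist * abs(dist_err)
          + w_form * float(np.linalg.norm(form_dev)))
    # Radar-threat penalty: a fixed cost each step the vehicle is inside
    # any threat sphere (a simple stand-in for the paper's penalty term).
    for center in radar_centers:
        if np.linalg.norm(pos - center) < radar_radius:
            r -= w_threat
    return r
```

With all three shaping terms negative and the threat term a step penalty, the maximum achievable reward is zero, which keeps the learned policy pulled toward formation keeping while making radar incursions strictly dominated.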


Key words

multi-aircraft formation/MADDPG algorithm/reentry phase/cooperative guidance

Classification

Military science and technology

Citation

王嘉磊, 郭建国. 基于MADDPG的再入飞行器协同制导方法[J]. 弹道学报, 2025, 37(4): 30-37,47,9.

Funding

National Natural Science Foundation of China (52472419)

弹道学报 (Journal of Ballistics), ISSN 1004-499X, open access, PKU Core Journal
