Flight Control & Detection, 2025, Vol. 8, Issue 2: 9-17, 9. DOI: 10.20249/j.cnki.2096-5974.2025.02.002
Reinforcement-Learning-Based Multi-UAV Formation Control with Collision Avoidance
Abstract
This paper proposes a reinforcement-learning-based method for the flight control problem of multi-UAV formations. The approach integrates curriculum learning, the leader-follower model, and the proximal policy optimization (PPO) method. First, following curriculum learning, the complex formation control task is decomposed into two learning stages. In the first stage, the leader UAV is trained with PPO to fly along a preset trajectory. In the second stage, the follower UAVs are trained: the leader's control policy is fixed as the neural network obtained in the first stage, and the delayed position of the leader serves as the tracking target for the followers. The reward function, which relies on partially observable information, is designed to guide the UAVs to maintain a stable linear formation during flight. To validate the method, we simulated a four-UAV formation flying a complex figure-eight ("8"-shaped) pattern in three-dimensional space using the Unity software. The results demonstrate that, compared to traditional control methods, our approach enables agents to learn effective strategies through interaction with the environment via a relatively simple training process, without the need to establish precise mathematical models. The method reduces the complexity of formation control and offers a new solution for UAV formation control.
Keywords
multi-agent reinforcement learning / formation control / proximal policy optimization / curriculum learning / leader-follower model / UAV control
Classification
Information technology and security science
Cite this article
Wu Ting, Ye Linqi, Yang Jun, Lu Weining. Reinforcement-learning-based multi-UAV formation control with collision avoidance [J]. Flight Control & Detection, 2025, 8(2): 9-17, 9.
Funding
National Natural Science Foundation of China (62225308, 62003188)
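The abstract states that each follower tracks a delayed position of the leader, but it gives no delay length, trajectory equation, or reward formula. The Python sketch below is therefore only an illustration of that idea under stated assumptions: the figure-eight curve (`figure_eight`), the per-follower lag of `k * delay_steps`, the class `DelayedLeaderTargets`, and the distance-based `tracking_reward` are all hypothetical choices, not the authors' implementation.

```python
import math
from collections import deque


def figure_eight(t, a=10.0, h=5.0):
    """Hypothetical 3-D figure-eight at constant height h (assumed, not from the paper)."""
    return (a * math.sin(t), a * math.sin(t) * math.cos(t), h)


class DelayedLeaderTargets:
    """Keeps recent leader positions and gives follower k the position k*delay_steps steps old."""

    def __init__(self, n_followers, delay_steps):
        self.delay_steps = delay_steps
        # Enough history to serve the follower with the largest lag.
        self.history = deque(maxlen=n_followers * delay_steps + 1)

    def push(self, leader_pos):
        self.history.append(leader_pos)

    def target(self, k):
        """Tracking target for follower k (1-based); early on, falls back to the oldest sample."""
        lag = k * self.delay_steps
        idx = max(len(self.history) - 1 - lag, 0)
        return self.history[idx]


def tracking_reward(follower_pos, target_pos):
    """Assumed reward shape: negative Euclidean distance to the delayed target."""
    return -math.dist(follower_pos, target_pos)


# Usage: one leader and three followers (a four-UAV formation), sampled every dt seconds.
dt = 0.05
targets = DelayedLeaderTargets(n_followers=3, delay_steps=20)
for step in range(100):
    targets.push(figure_eight(step * dt))

follower_targets = [targets.target(k) for k in (1, 2, 3)]
```

Because every follower chases a point the leader has already passed, the followers trail one another along the leader's path, which is one plausible way to obtain the line formation the abstract mentions. In a training loop, a reward such as `tracking_reward` would be evaluated per follower at each step while the leader's stage-one policy stays frozen.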