
Reinforcement-Learning-Based Multi-UAV Formation Control with Collision Avoidance

Flight Control & Detection (飞控与探测), 2025, Vol. 8, Issue 2: 9-17 (9 pages). DOI: 10.20249/j.cnki.2096-5974.2025.02.002

Wu Ting (吴婷)¹, Ye Linqi (叶林奇)¹, Yang Jun (杨君)², Lu Weining (芦维宁)³

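Author information

1. School of Future Technology (Institute of Artificial Intelligence), Shanghai University, Shanghai 200444
2. Department of Automation, Tsinghua University, Beijing 100084
3. Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084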

Abstract

This paper proposes a reinforcement-learning-based method for multi-UAV formation flight control. The approach integrates curriculum learning, the leader-follower model, and proximal policy optimization (PPO). First, following the curriculum-learning idea, the complex formation control task is decomposed into two learning stages. In the first stage, the leader UAV is trained with PPO to fly along a preset trajectory. In the second stage, the follower UAVs are trained: the leader's control policy is fixed to the neural network obtained in the first stage, and the leader's delayed position serves as the tracking target for the followers. The reward function, which relies only on partially observable information, is designed to guide the UAVs to maintain a stable linear formation during flight. To validate the method, a four-UAV formation flying a complex figure-eight ("8"-shaped) pattern in three-dimensional space was simulated in Unity. The results show that, compared with traditional control methods, the approach lets agents learn effective strategies through interaction with the environment using a relatively simple training process, without requiring a precise mathematical model. The method reduces the complexity of formation control and offers a new solution for UAV formation control.
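
The delayed-leader tracking idea in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the figure-eight parameterization, the per-follower lag of `follower_idx * delay_steps`, the class name `DelayedLeaderTargets`, and the distance-based reward are all hypothetical.

```python
import math
from collections import deque


def figure_eight(t, a=10.0, b=5.0, h=5.0):
    """Hypothetical figure-eight reference path at constant altitude h."""
    return (a * math.sin(t), b * math.sin(2.0 * t), h)


class DelayedLeaderTargets:
    """Buffers leader positions; follower i tracks the position from
    i * delay_steps steps ago (an assumed delay scheme)."""

    def __init__(self, n_followers, delay_steps):
        self.delay_steps = delay_steps
        # Keep just enough history to serve the most delayed follower.
        self.history = deque(maxlen=n_followers * delay_steps + 1)

    def update(self, leader_pos):
        """Record the leader's latest position."""
        self.history.append(leader_pos)

    def target(self, follower_idx):
        """Return the tracking target for a 1-based follower index.

        Early in an episode the buffer is short, so the lag is clipped
        and the oldest recorded leader position is returned instead.
        """
        lag = min(follower_idx * self.delay_steps, len(self.history) - 1)
        return self.history[-1 - lag]


def tracking_reward(pos, target):
    """Illustrative reward: negative Euclidean distance to the target."""
    return -math.dist(pos, target)
```

In a training loop under these assumptions, the leader's position would be passed to `update` once per simulation step, and each follower's reward would be computed from `target(i)`. Because every follower chases a point the leader has already visited, the followers string out behind the leader along its path.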

Key words

multi-agent reinforcement learning / formation control / proximal policy optimization / curriculum learning / leader-follower model / UAV control

Classification

Information Technology and Safety Science

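Cite this article

Wu Ting, Ye Linqi, Yang Jun, Lu Weining. Reinforcement-Learning-Based Multi-UAV Formation Control with Collision Avoidance[J]. Flight Control & Detection (飞控与探测), 2025, 8(2): 9-17.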

Funding

National Natural Science Foundation of China (62225308, 62003188)

Flight Control & Detection (飞控与探测)

ISSN: 2096-5974
