Computer Engineering (计算机工程), 2025, Vol. 51, Issue 5: 73-82, 10. DOI: 10.19678/j.issn.1000-3428.0069850
基于融合课程思想MADDPG的无人机编队控制
Unmanned Aerial Vehicle Formation Control Based on MADDPG with Integrated Curriculum Learning
Abstract
The Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is an extension of the Deep Deterministic Policy Gradient (DDPG) algorithm, specifically designed for multi-agent environments. In the MADDPG algorithm, each agent considers not only its own observations and actions but also the strategies of other agents to make more accurate collective decisions. This design significantly improves performance and stability in complex and changing environments. Based on the MADDPG algorithm framework, this study addressed the problem of Unmanned Aerial Vehicle (UAV) formation control. To overcome the convergence difficulty of multi-agent algorithms, a curriculum reinforcement learning approach was employed to train tasks in a stagewise manner. Progressively enhanced reward functions were designed for the tasks of each stage, and dense rewards were devised using the artificial potential field concept to significantly reduce the training difficulty. The effectiveness and stability of the MADDPG algorithm in multi-agent environments were demonstrated through ablation and control experiments performed in a self-built Software-in-the-Loop (SITL) simulation environment. Furthermore, real-world experiments were conducted to verify the practicality of the designed algorithm.
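The abstract does not give the reward formulas, so the following is only a minimal sketch of how a dense, potential-field-style reward could be enlarged stage by stage in a curriculum. Every name, gain, and the three-stage layout (reach a goal, then hold a formation slot, then avoid obstacles) is an assumption made for illustration, and the repulsive term uses the classic artificial potential field form rather than anything confirmed by the paper.

```python
import math


def attractive_reward(pos, target, k_att=1.0):
    """Dense attractive term: less negative as the UAV approaches the target."""
    return -k_att * math.dist(pos, target)


def repulsive_penalty(pos, obstacle, d0=2.0, k_rep=1.0):
    """Classic potential-field repulsion, nonzero only inside the radius d0."""
    d = max(math.dist(pos, obstacle), 1e-6)  # guard against division by zero
    if d >= d0:
        return 0.0
    return -0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2


def staged_reward(stage, pos, goal, slot, obstacles):
    """Per-agent reward that gains terms as the (hypothetical) stage increases.

    stage 1: reach the goal
    stage 2: additionally hold the assigned formation slot
    stage 3: additionally keep clear of obstacles
    """
    reward = attractive_reward(pos, goal)
    if stage >= 2:
        reward += attractive_reward(pos, slot, k_att=0.5)
    if stage >= 3:
        reward += sum(repulsive_penalty(pos, o) for o in obstacles)
    return reward
```

Because each term returns a value at every time step instead of only on task completion, the agent receives a learning signal throughout an episode, which is the usual motivation for dense rewards. Calling `staged_reward` with the same state and an increasing `stage` shows how later stages add terms without changing the earlier ones.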
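Key words
Unmanned Aerial Vehicle (UAV) formation / deep reinforcement learning / Multi-Agent Deep Deterministic Policy Gradient (MADDPG) / curriculum learning / neural network
Classification
Computer and Automation
Cite this article
WU Kaifeng (吴凯峰), LIU Lei (刘磊), LIU Chen (刘晨), LIANG Chengqing (梁成庆). Unmanned Aerial Vehicle Formation Control Based on MADDPG with Integrated Curriculum Learning [J]. Computer Engineering, 2025, 51(5): 73-82, 10.
Funding
General Program of the Natural Science Foundation of Hebei Province (A2023209002).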