Journal of Nantong University (Natural Science Edition), 2025, Vol. 24, Issue 4: 1-9, 20, 10. DOI: 10.12194/j.ntu.20241112003
基于CNN-LSTM融合网络的深度强化学习多无人机自主避碰
Multi-UAV autonomous collision avoidance via deep reinforcement learning with CNN-LSTM fusion networks
Abstract
To address autonomous collision avoidance for multiple unmanned aerial vehicles (multi-UAVs) in complex environments, the proximal policy optimization (PPO) algorithm is adopted to learn a multi-UAV collision avoidance strategy. First, to address poor coordination during multi-UAV collision avoidance, a CNN-LSTM (CL) fusion network is designed that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network, where the LSTM provides memory functionality. The fusion network exploits the feature-extraction capability of the CNN and the strength of the LSTM in processing sequential data, thereby improving the robustness of UAV decision-making. Second, a novel reward function is designed using reward shaping based on the artificial potential field. Combining a main reward with several auxiliary rewards effectively avoids reward sparsity. The main reward guides the UAV toward the predetermined target, while the auxiliary rewards encourage the UAV to take more flexible and robust actions, thus achieving more effective autonomous collision avoidance in complex environments. Finally, the algorithm is validated through simulations in both obstacle-free and obstacle environments. Simulation results demonstrate that the CLPPO algorithm, which is based on the CL fusion network, effectively improves algorithm performance. In addition, comparative experiments and robustness tests in high-density UAV environments are conducted. The test results show that the algorithm performs well in terms of return and collision avoidance success rate.
Key words
multi-UAV system / deep reinforcement learning / CNN-LSTM fusion network / autonomous collision avoidance strategy / reward shaping
Classification
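Information Technology and Security Science
Cite this article
LIANG Chengqing, LI Lei, LIU Lei. Multi-UAV autonomous collision avoidance via deep reinforcement learning with CNN-LSTM fusion networks[J]. Journal of Nantong University (Natural Science Edition), 2025, 24(4): 1-9, 20, 10.
Funding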
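Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX24-0836)
Open Fund Project of the Key Laboratory of the Ministry of Education (Scip20240111)
Fundamental Research Funds for the Central Universities (B240203012)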
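
Illustrative sketch
The abstract describes a main reward combined with auxiliary rewards obtained from artificial potential field reward shaping, but this page does not give the actual reward function. The following is only a minimal sketch of the standard potential-based shaping form, r' = r + γΦ(s') − Φ(s), with an attractive term toward the target and a repulsive term near obstacles. The function names, the gains `k_att` and `k_rep`, the influence radius `d0`, and the specific potential are all assumptions made for illustration and are not taken from the paper.

```python
import math


def potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Hypothetical potential Phi(s): higher is better.

    Attractive part: negative distance to the goal.
    Repulsive part: classic APF penalty, active only within radius d0
    of an obstacle (or of another UAV treated as an obstacle).
    """
    phi = -k_att * math.dist(pos, goal)
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            phi -= 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return phi


def shaped_reward(main_reward, pos, next_pos, goal, obstacles, gamma=0.99):
    """Main reward plus a potential-based auxiliary shaping term."""
    aux = gamma * potential(next_pos, goal, obstacles) - potential(pos, goal, obstacles)
    return main_reward + aux


if __name__ == "__main__":
    goal = (10.0, 0.0)
    # One step toward the goal in free space versus toward a nearby obstacle.
    print(shaped_reward(0.0, (0.0, 0.0), (1.0, 0.0), goal, []))
    print(shaped_reward(0.0, (0.0, 0.0), (1.0, 0.0), goal, [(1.5, 0.0)]))
```

In this sketch the shaping term gives dense feedback on every step even when the main reward is zero, which is one common way to reduce reward sparsity. A step that approaches an obstacle receives a smaller shaped reward than the same step in free space.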