Multi-UAV autonomous collision avoidance via deep reinforcement learning with CNN-LSTM fusion networks

Journal of Nantong University (Natural Science Edition), 2025, Vol. 24, Issue 4: 1-9, 20, 10. DOI: 10.12194/j.ntu.20241112003

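Liang Chengqing (梁成庆)¹, Li Lei (李蕾)², Liu Lei (刘磊)²

Author information

1. College of Artificial Intelligence and Automation, Hohai University, Changzhou, Jiangsu 213200; Key Laboratory of Maritime Intelligent Network Information Technology, Ministry of Education, Hohai University, Changzhou, Jiangsu 213200
2. School of Mathematics, Hohai University, Nanjing, Jiangsu 211100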

Abstract

To address autonomous collision avoidance for multiple unmanned aerial vehicles (multi-UAVs) in complex environments, this paper studies a collision avoidance strategy based on the proximal policy optimization (PPO) algorithm. First, to address poor coordination among UAVs during collision avoidance, a CNN-LSTM (CL) fusion network is designed that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network, where the LSTM provides memory. The fusion network exploits the feature-extraction capability of the CNN and the strength of the LSTM in processing sequential data, thereby improving the robustness of UAV decision-making. Second, a novel reward function is designed using reward shaping based on the artificial potential field. Combining a main reward with several auxiliary rewards effectively avoids reward sparsity: the main reward guides the UAV toward the predetermined target, while the auxiliary rewards encourage more flexible and robust actions, yielding more effective autonomous collision avoidance in complex environments. Finally, the algorithm is validated through simulations in both obstacle-free and obstacle-laden environments. The simulation results show that the CLPPO algorithm, that is, PPO built on the CL fusion network, effectively improves performance. Comparative experiments and robustness tests in high-density UAV environments are also conducted, and the results show that the algorithm performs well in terms of return and collision avoidance success rate.
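
The abstract does not give the reward function itself, so the following is only a minimal sketch of how a sparse main reward can be combined with a dense auxiliary reward derived from an artificial potential field. It uses the standard potential-based shaping form (discounted next potential minus current potential). The gains, the influence radius, the reward magnitudes, and the function names `potential` and `shaped_reward` are all illustrative assumptions, not values or identifiers from the paper.

```python
import math

# All constants and names below are illustrative assumptions,
# not values or identifiers taken from the paper.
K_ATT = 1.0    # attractive gain toward the target
K_REP = 0.5    # repulsive gain away from obstacles or other UAVs
D_SAFE = 2.0   # influence radius of the repulsive field
GAMMA = 0.99   # discount factor used in the shaping term


def potential(pos, goal, obstacles):
    """Artificial potential: higher means closer to the goal and farther from obstacles."""
    phi = -K_ATT * math.dist(pos, goal)
    for obs in obstacles:
        d = math.dist(pos, obs)
        if d < D_SAFE:
            # The penalty grows as the UAV moves deeper inside the influence radius.
            phi -= K_REP * (D_SAFE - d) ** 2
    return phi


def shaped_reward(pos, next_pos, goal, obstacles, reached=False, collided=False):
    """Sparse main reward plus a dense potential-based auxiliary reward."""
    main = 10.0 if reached else (-10.0 if collided else 0.0)
    aux = GAMMA * potential(next_pos, goal, obstacles) - potential(pos, goal, obstacles)
    return main + aux
```

In this sketch, a step toward the target in free space earns a positive auxiliary reward, and the same step earns less when it brings the UAV closer to an obstacle, so the agent receives a learning signal on every step rather than only at the target or on collision.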

Key words

multi-UAV system/deep reinforcement learning/CNN-LSTM fusion network/autonomous collision avoidance strategy/reward shaping

Classification

Information technology and security science

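Cite this article

Liang Chengqing, Li Lei, Liu Lei. Multi-UAV autonomous collision avoidance via deep reinforcement learning with CNN-LSTM fusion networks[J]. Journal of Nantong University (Natural Science Edition), 2025, 24(4): 1-9, 20, 10.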

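Funding

Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX24-0836)

Open Fund of the Key Laboratory of the Ministry of Education (Scip20240111)

Fundamental Research Funds for the Central Universities (B240203012)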

Journal of Nantong University (Natural Science Edition)

ISSN: 1673-2340
