Software Guide (软件导刊), 2025, Vol. 24, Issue 10: 47-55, 9. DOI: 10.11907/rjdk.251048
Deep Inverse Reinforcement Learning Based Real-time Dynamic Obstacle Avoidance Path Planning Algorithm for UAVs (基于深度逆强化学习的无人机实时动态避障航路规划算法)
Abstract
To address the high sensitivity of traditional reinforcement learning algorithms to reward functions and the shortage of training data when deep inverse reinforcement learning is applied to UAV obstacle avoidance route planning, a real-time dynamic UAV obstacle avoidance route planning algorithm combining expert demonstrations with deep inverse reinforcement learning is proposed. Firstly, a policy network is trained under the maximum entropy inverse reinforcement learning framework to learn safe and efficient flight strategies from expert demonstrations, and entropy regularization is introduced to construct an adversarial learning system consisting of a reward network and a policy network. Secondly, to handle continuous path points, a Long Short-Term Memory (LSTM) layer is combined with a traditional policy network to better model the relationships between continuous path points. Finally, an environment that simulates the motion of dynamic obstacles is constructed, and the Informed-RRT*-DWA algorithm is combined with a feature-aware expert behavior simulation method to generate expert demonstration trajectories. Experiments show that the algorithm generates safe routes in real time in dynamic obstacle avoidance scenarios, with path lengths similar to those of the expert paths. Compared with the D3QN algorithm based on deep reinforcement learning, it reduces path length by 13.3% and increases the success rate by 17% in low-density obstacle environments, and reduces path length by 5.1% and increases the success rate by 20% in high-density environments.
Key words
UAV/path planning/deep learning/inverse reinforcement learning
Classification
Social sciences
Cite this article
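张俊驰, 冼彦宁. Deep Inverse Reinforcement Learning Based Real-time Dynamic Obstacle Avoidance Path Planning Algorithm for UAVs [J]. Software Guide (软件导刊), 2025, 24(10): 47-55, 9.
Funding
Young Scientists Fund of the National Natural Science Foundation of China (62106179)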
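Note on the method
The abstract names maximum entropy inverse reinforcement learning as the basis of the first step. The sketch below illustrates only that principle and is not the authors' implementation. It assumes a linear reward over a small, finite set of candidate trajectories, whereas the paper trains a reward network adversarially against a policy network. The feature choice (path length, minimum obstacle clearance), the candidate values, and the names `trajectory_probs` and `maxent_irl` are hypothetical.

```python
import math


def trajectory_probs(features, weights):
    """MaxEnt distribution over a finite candidate set: P(tau) ~ exp(w . f(tau))."""
    rewards = [sum(w * f for w, f in zip(weights, feat)) for feat in features]
    top = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp(r - top) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def maxent_irl(features, expert_indices, lr=0.1, steps=500):
    """Fit linear reward weights by gradient ascent on the expert log-likelihood."""
    dim = len(features[0])
    weights = [0.0] * dim
    # Empirical feature expectation of the expert demonstrations.
    expert_mean = [
        sum(features[i][k] for i in expert_indices) / len(expert_indices)
        for k in range(dim)
    ]
    for _ in range(steps):
        probs = trajectory_probs(features, weights)
        # Feature expectation under the current MaxEnt distribution.
        model_mean = [
            sum(p * feat[k] for p, feat in zip(probs, features))
            for k in range(dim)
        ]
        # Log-likelihood gradient: expert expectation minus model expectation.
        weights = [
            w + lr * (e - m)
            for w, e, m in zip(weights, expert_mean, model_mean)
        ]
    return weights


# Hypothetical candidates: (path length, minimum obstacle clearance).
CANDIDATES = [
    (10.0, 0.2),  # shortest, but passes very close to an obstacle
    (11.0, 1.5),  # slightly longer, comfortable clearance
    (16.0, 2.0),  # long detour, largest clearance
]
# Hypothetical expert demonstrations, as indices into CANDIDATES.
EXPERT_DEMOS = [1, 1, 1, 2]

weights = maxent_irl(CANDIDATES, EXPERT_DEMOS)
probs = trajectory_probs(CANDIDATES, weights)
```

With these toy inputs the learned weights penalize path length and reward clearance, so the trajectory the expert never chose ends up with the lowest probability. No hand-written reward function is needed, which is the sensitivity problem the abstract sets out to avoid.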