A deep deterministic policy gradient method for collision avoidance of autonomous ships [OA] [CSTPCD]
This study addresses collision-avoidance decision-making for autonomous ships under different encounter situations. Building on the Deep Deterministic Policy Gradient (DDPG) algorithm, the reward functions are designed with the International Regulations for Preventing Collisions at Sea (COLREGS) as the baseline, and potential-based reward shaping is introduced to guide the agent toward the optimal policy, so that the agent can avoid obstacles and reach the navigation goal while complying with the rules. Finally, the authors carried out simulations of obstacle avoidance for two-ship and multi-ship cases under different encounter scenarios and compared the results with the TD3 algorithm. The results show that the proposed algorithm converges quickly and trains stably; the generated…
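The potential-based reward shaping idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's potential function, so everything here is an assumption for illustration only: the potential `potential` (negative Euclidean distance to the goal), the discount factor `GAMMA`, and the function `shaped_reward` are hypothetical names and choices, not the authors' design. The sketch only shows the standard shaping term F(s, s') = γΦ(s') − Φ(s) added to a base reward.

```python
import math

GAMMA = 0.99  # discount factor (assumed value, not taken from the paper)


def potential(pos, goal):
    """Hypothetical potential: negative Euclidean distance to the goal."""
    return -math.dist(pos, goal)


def shaped_reward(base_reward, pos, next_pos, goal, gamma=GAMMA):
    """Add the shaping term F = gamma * phi(s') - phi(s) to the base reward."""
    return base_reward + gamma * potential(next_pos, goal) - potential(pos, goal)


goal = (0.0, 0.0)
# One step toward the goal versus one step away from it, with zero base reward.
toward = shaped_reward(0.0, (10.0, 0.0), (9.0, 0.0), goal)
away = shaped_reward(0.0, (10.0, 0.0), (11.0, 0.0), goal)
```

Under these assumptions, a step that reduces the distance to the goal receives a larger shaped reward than a step that increases it. A COLREGS-based term would enter through `base_reward`, which this sketch leaves at zero.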
HU Zhengyang (胡正阳); WANG Yong (王勇)
Jiangsu Automation Research Institute, Lianyungang, Jiangsu 222061, China (both authors)
unmanned surface vehicle; autonomous navigation obstacle avoidance; deep reinforcement learning; COLREGS
Command Control & Simulation (《指挥控制与仿真》), 2024 (5)
pp. 37-44 (8 pages)