A Collision Avoidance Method for Autonomous Ship Navigation Based on Deep Deterministic Policy Gradient

HU Zhengyang (胡正阳), WANG Yong (王勇)

Command Control & Simulation (指挥控制与仿真), 2024, Vol. 46, Issue (5): 37-44, 8. DOI: 10.3969/j.issn.1673-3819.2024.05.006

A deep deterministic policy gradient method for collision avoidance of autonomous ship

HU Zhengyang 1, WANG Yong 1

Author information

  • 1. Jiangsu Automation Research Institute, Lianyungang 222061, Jiangsu

Abstract

This research addresses the crucial problem of collision avoidance decision-making for autonomous ships under diverse encounter situations. Building upon the Deep Deterministic Policy Gradient (DDPG) algorithm, reward functions based on the International Regulations for Preventing Collisions at Sea (COLREGS) are designed to guide the agent toward an optimal strategy. By incorporating the concept of potential-based reward shaping, the proposed approach avoids obstacles efficiently while complying with these rules. Simulations validate the algorithm's collision avoidance performance in both two-ship and multi-ship scenarios under varying encounter situations, and the results are compared with those of the TD3 algorithm. The results show that the proposed algorithm converges rapidly and trains stably, and that the resulting models achieve collision-free navigation in compliance with the COLREGS. In particular, in two-ship encounter situations, the proposed algorithm plans shorter and more efficient trajectories than the TD3 algorithm.
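
To make the reward shaping idea concrete, the following is a minimal Python sketch of a potential-based shaping term of the general form F = γΦ(s') − Φ(s). It is an illustration only, not the authors' reward design: the abstract does not give the paper's potential function, so the choice of Φ as the negative distance to the goal, the discount factor of 0.99, and the names `potential` and `shaped_reward` are all assumptions.

```python
import math

GAMMA = 0.99  # discount factor; an assumed value, not taken from the paper


def potential(ship_xy, goal_xy):
    """Hypothetical potential: negative Euclidean distance from ship to goal."""
    return -math.dist(ship_xy, goal_xy)


def shaped_reward(base_reward, ship_xy, next_ship_xy, goal_xy, gamma=GAMMA):
    """Return base_reward plus the shaping term gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_ship_xy, goal_xy) - potential(ship_xy, goal_xy)
    return base_reward + shaping


if __name__ == "__main__":
    goal = (0.0, 0.0)
    # A step toward the goal earns a positive shaping bonus,
    # a step away from it earns a negative one.
    print(shaped_reward(0.0, (10.0, 0.0), (9.0, 0.0), goal))
    print(shaped_reward(0.0, (10.0, 0.0), (11.0, 0.0), goal))
```

In a DDPG training loop, a term like this would be added to the environment reward (for example, a hypothetical COLREGS compliance reward) before the transition is stored in the replay buffer. Shaping terms of this form are commonly used because they give denser feedback without changing which policies are optimal.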

Key words

unmanned surface vehicle/autonomous navigation obstacle avoidance/deep reinforcement learning/COLREGS

Classification

Military science and technology

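Cite this article

HU Zhengyang, WANG Yong. A deep deterministic policy gradient method for collision avoidance of autonomous ship [J]. Command Control & Simulation, 2024, 46(5): 37-44, 8.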

Command Control & Simulation (指挥控制与仿真), ISSN 1673-3819, OA, CSTPCD
