Reinforcement Learning Navigation Method Based on Advantage Hindsight Experience Replay

Wang Shaotong, Kuang Liqun, Han Huiyan, Xiong Fengguang, Xue Hongxin

Computer Engineering, 2024, Vol. 50, Issue 1: 313-319, 7. DOI: 10.19678/j.issn.1000-3428.0066193

Reinforcement Learning Navigation Method Based on Advantage Hindsight Experience Replay

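Wang Shaotong 1, Kuang Liqun 1, Han Huiyan 1, Xiong Fengguang 1, Xue Hongxin 1

Author information

  • 1. School of Computer Science and Technology, North University of China, Taiyuan 030051, Shanxi, China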

Abstract

Reinforcement learning shows significant potential for mobile robots. By combining reinforcement learning algorithms with robot navigation, a robot can navigate autonomously without prior knowledge. However, reinforcement learning for robots suffers from low sample utilization and poor generalization. To address this, this paper proposes an advantage hindsight experience replay algorithm, built on the D3QN algorithm, for replaying experience samples. First, the advantage function value is calculated for each trajectory point in a trajectory sample, and the point with the maximum advantage value is selected as the target point. The trajectory sample is then relabeled, and both the original and the relabeled trajectory samples are placed in the experience pool to increase the diversity of experience samples, allowing the agent to learn to navigate to the target point more efficiently from failed experience samples. To assess the validity of the proposed approach, different experimental environments are built on the Gazebo platform, and a TurtleBot3 robot is used for navigation training and transfer tests in simulation. The results show that the navigation success rate in the training environment is higher than that of current mainstream algorithms, and that the maximum navigation success rate in the transfer test environment reaches 86.33%. The improved algorithm increases the utilization of navigation samples, reduces the difficulty of learning navigation strategies, and enhances the robot's autonomous navigation ability and its transfer generalization ability across different environments.
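The relabeling step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `Transition`, `sparse_reward`, `relabel_with_best_advantage`, and `store_episode`, the 0.2 distance tolerance, the 2D position used as state, and the toy advantage table are all hypothetical. The abstract does not say how the advantage value is computed or how rewards are defined, so the sketch takes the advantage as a caller-supplied function (in a D3QN setting it would plausibly come from the dueling network's advantage stream) and assumes a sparse goal-reaching reward.

```python
import math
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Transition:
    """One step of a navigation episode (hypothetical layout)."""
    state: Point       # robot position before the action
    action: int        # discrete action index
    reward: float
    next_state: Point  # robot position after the action
    goal: Point        # goal the agent was pursuing
    done: bool


def sparse_reward(position: Point, goal: Point, tol: float = 0.2) -> float:
    """Assumed reward: 1.0 when within `tol` of the goal, otherwise 0.0."""
    return 1.0 if math.dist(position, goal) <= tol else 0.0


def relabel_with_best_advantage(
    trajectory: List[Transition],
    advantage_fn: Callable[[Point, int], float],
) -> List[Transition]:
    """Relabel a trajectory so that its highest-advantage point is the goal."""
    if not trajectory:
        return []
    scores = [advantage_fn(t.state, t.action) for t in trajectory]
    best = max(range(len(scores)), key=scores.__getitem__)
    # Assumption: the position reached by the best-scoring step is the new goal.
    new_goal = trajectory[best].next_state
    relabeled = []
    for i, t in enumerate(trajectory[: best + 1]):
        relabeled.append(
            replace(
                t,
                goal=new_goal,
                reward=sparse_reward(t.next_state, new_goal),
                done=(i == best),  # the relabeled episode ends at the new goal
            )
        )
    return relabeled


def store_episode(
    buffer: List[Transition],
    trajectory: List[Transition],
    advantage_fn: Callable[[Point, int], float],
) -> None:
    """Put both the original and the relabeled samples into the experience pool."""
    buffer.extend(trajectory)
    buffer.extend(relabel_with_best_advantage(trajectory, advantage_fn))


# Toy failed episode: the robot never reaches the original goal (5, 5).
failed = [
    Transition((0.0, 0.0), 0, 0.0, (1.0, 0.0), (5.0, 5.0), False),
    Transition((1.0, 0.0), 1, 0.0, (1.0, 1.0), (5.0, 5.0), False),
    Transition((1.0, 1.0), 0, 0.0, (2.0, 1.0), (5.0, 5.0), True),
]
# Made-up advantage values standing in for a learned network's output.
toy_advantage = {
    ((0.0, 0.0), 0): 0.1,
    ((1.0, 0.0), 1): 0.7,
    ((1.0, 1.0), 0): -0.2,
}
buffer: List[Transition] = []
store_episode(buffer, failed, lambda s, a: toy_advantage[(s, a)])
```

In this toy run the second step has the highest advantage, so the position it reaches, (1.0, 1.0), becomes the substitute goal. The buffer ends up holding the three original transitions plus two relabeled ones, the last of which carries a positive reward. Truncating the relabeled trajectory at the selected point is one possible design choice; the paper may handle the remaining steps differently.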

Key words

reinforcement learning/mobile robots/hindsight experience replay/neural network/sample utilization

Classification

Information Technology and Security Science

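Cite this article

Wang Shaotong, Kuang Liqun, Han Huiyan, Xiong Fengguang, Xue Hongxin. Reinforcement Learning Navigation Method Based on Advantage Hindsight Experience Replay[J]. Computer Engineering, 2024, 50(1): 313-319, 7.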

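Funding

National Natural Science Foundation of China (62106238)

Shanxi Province Research Project for Returned Overseas Scholars (2020-113)

Shanxi Province Special Project for Guiding the Transformation of Scientific Achievements (202104021301055)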

Computer Engineering

Open Access; indexed in the Peking University Core Journals list and CSTPCD

ISSN 1000-3428
