A Driverless Decision Model Using Improved Behavioral Cloning and DDPG (改进行为克隆与DDPG的无人驾驶决策模型)

Li Weidong (李伟东), Huang Zhenzhu (黄振柱), He Jingwu (何精武), Ma Caoyuan (马草原), Ge Cheng (葛程)

Computer Engineering and Applications (计算机工程与应用), 2024, Vol. 60, Issue 14: 86-95, 10. DOI: 10.3778/j.issn.1002-8331.2304-0158

Improved Behavioral Cloning and DDPG's Driverless Decision Model

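Li Weidong 1, Huang Zhenzhu 1, He Jingwu 1, Ma Caoyuan 1, Ge Cheng 1

Author information

  • 1. School of Automotive Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China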

Abstract

The key to driverless technology is that the decision-making layer issues accurate commands based on the information supplied by the perception module. Reinforcement learning and imitation learning are better suited to complex scenarios than traditional rule-based methods. However, imitation learning, as represented by behavioral cloning, suffers from compounding errors. This paper uses prioritized experience replay to improve behavioral cloning and thereby strengthen the model's ability to fit the demonstration dataset. The original DDPG (deep deterministic policy gradient) algorithm suffers from low exploration efficiency, so experience pool separation and random network distillation (RND) are used to improve its training efficiency. The improved algorithms are then trained jointly to reduce useless exploration in the early stage of DDPG training. Validated on the TORCS (The Open Racing Car Simulator) simulation platform, the experimental results show that, within the same number of training iterations, the proposed method learns more stable lane keeping, speed keeping, and obstacle avoidance.

Key words

driverless driving/reinforcement learning/imitation learning/decision algorithm/The Open Racing Car Simulator (TORCS)

Classification

Information Technology and Security Science

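Cite this article

Li Weidong, Huang Zhenzhu, He Jingwu, Ma Caoyuan, Ge Cheng. Improved Behavioral Cloning and DDPG's Driverless Decision Model [J]. Computer Engineering and Applications, 2024, 60(14): 86-95, 10.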

Funding

Major Science and Technology Innovation Project of Liaoning Province (ZX20220560).

Computer Engineering and Applications

OA / Peking University Core Journal / CSTPCD

ISSN 1002-8331
