Application of an Improved DDPG Algorithm to Autonomous Driving

Zhang Bin, He Ming, Chen Xiliang, Wu Chunxiao, Liu Bin, Zhou Bo

Computer Engineering and Applications, 2019, Vol. 55, Issue 10: 264-270, 7.

DOI: 10.3778/j.issn.1002-8331.1806-0324

Self-Driving Via Improved DDPG Algorithm

Zhang Bin¹, He Ming¹, Chen Xiliang², Wu Chunxiao¹, Liu Bin¹, Zhou Bo¹

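Author information

1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210002
2. Institute of Network Information, Institute of Systems Engineering, Academy of Military Sciences, Beijing 100071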

Abstract

As a classic deep reinforcement learning algorithm, Deep Deterministic Policy Gradient (DDPG) is well suited to continuous control problems and has been applied to self-driving. To address low training efficiency and the large number of illegal driving policies, an improved algorithm called failure experience correction DDPG is proposed. The algorithm divides the experience pool into a success experience pool and a failure experience pool, selects failure experiences according to driving performance, controls the brake pedal and the acceleration pedal with a single neural network output, and explores unknown policies using normally distributed noise. Simulation experiments on the TORCS platform show that the proposed algorithm significantly improves training efficiency and reduces the number of illegal driving policies to zero.
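
The three ingredients named in the abstract (separate success and failure pools, one network output driving both pedals, and normally distributed exploration noise) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the failure criterion, the sampling ratio between pools, or the sign convention of the pedal output, so `SplitReplayBuffer`, `failure_fraction`, `split_pedal`, `explore`, and `sigma` are all hypothetical names and values chosen for illustration.

```python
import random
from collections import deque


class SplitReplayBuffer:
    """Two FIFO pools: one for successful transitions, one for failures."""

    def __init__(self, capacity=10000, failure_fraction=0.5):
        self.success = deque(maxlen=capacity)
        self.failure = deque(maxlen=capacity)
        # Assumed mixing ratio; the abstract does not state one.
        self.failure_fraction = failure_fraction

    def add(self, transition, failed):
        # `failed` is supplied by the caller; how driving performance
        # is judged is not specified in the abstract.
        (self.failure if failed else self.success).append(transition)

    def sample(self, batch_size, rng=random):
        # Draw from both pools, capped by what each pool currently holds.
        n_fail = min(int(batch_size * self.failure_fraction), len(self.failure))
        n_succ = min(batch_size - n_fail, len(self.success))
        return (rng.sample(list(self.failure), n_fail)
                + rng.sample(list(self.success), n_succ))


def split_pedal(u):
    """Map one actor output u in [-1, 1] to (throttle, brake), never both.

    Assumed convention: positive means accelerate, negative means brake.
    """
    u = max(-1.0, min(1.0, u))
    return (u, 0.0) if u >= 0 else (0.0, -u)


def explore(u, sigma=0.1, rng=random):
    """Add zero-mean Gaussian noise to an action, then clip to [-1, 1]."""
    return max(-1.0, min(1.0, u + rng.gauss(0.0, sigma)))
```

Under these assumptions, a single signed output makes simultaneous braking and accelerating impossible by construction, which is one plausible way to rule out that class of illegal action.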

Key words

deep reinforcement learning/self-driving/DDPG algorithm/experience pool dividing/TORCS

Classification

Information Technology and Security Science

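Cite this article

Zhang Bin, He Ming, Chen Xiliang, Wu Chunxiao, Liu Bin, Zhou Bo. Self-Driving Via Improved DDPG Algorithm [J]. Computer Engineering and Applications, 2019, 55(10): 264-270, 7.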

Funding

National Key Research and Development Program of China (No.2016YFC0800606, No.2016YFC0800310)

Key Consulting Project of the Chinese Academy of Engineering (No.2017-XZ-05)

Natural Science Foundation of Jiangsu Province (No.BK20150721, No.BK20161469)

China Postdoctoral Science Foundation (No.2015M582786, No.2016T91017)

Key Research and Development Program of Jiangsu Province (No.BE2015728, No.BE2016904)

Jiangsu Province Science and Technology Infrastructure Construction Program (No.BM2014391)

Computer Engineering and Applications

OA / PKU Core / CSCD / CSTPCD

ISSN: 1002-8331
