Computer Engineering and Applications, 2019, Vol. 55, Issue 10: 264-270, 7. DOI: 10.3778/j.issn.1002-8331.1806-0324
Application of an Improved DDPG Algorithm to Self-Driving
Self-Driving Via Improved DDPG Algorithm
Abstract
As a classic deep reinforcement learning algorithm, Deep Deterministic Policy Gradient (DDPG) is well suited to continuous control problems and has been applied to self-driving. To address low training efficiency and the large number of illegal driving policies, an improved algorithm called failure experience correction DDPG is proposed. The algorithm divides the experience pool into a success experience pool and a failure experience pool, selects failure experience according to driving performance, controls the brake pedal and the accelerator pedal with a single neural network output, and explores unknown policies with normally distributed noise. Simulation on the TORCS platform shows that the proposed algorithm significantly improves training efficiency and reduces the number of illegal driving policies to zero.
Key words
deep reinforcement learning/self-driving/DDPG algorithm/experience pool dividing/TORCS
Classification
Information technology and safety science
Cite this article
Zhang Bin, He Ming, Chen Xiliang, Wu Chunxiao, Liu Bin, Zhou Bo. Self-Driving Via Improved DDPG Algorithm[J]. Computer Engineering and Applications, 2019, 55(10): 264-270, 7.
Funding
National Key Research and Development Program of China (No.2016YFC0800606, No.2016YFC0800310)
Key Consulting Project of the Chinese Academy of Engineering (No.2017-XZ-05)
Natural Science Foundation of Jiangsu Province (No.BK20150721, No.BK20161469)
China Postdoctoral Science Foundation (No.2015M582786, No.2016T91017)
Key Research and Development Program of Jiangsu Province (No.BE2015728, No.BE2016904)
Jiangsu Province Science and Technology Infrastructure Construction Program (No.BM2014391)
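
The abstract names three mechanisms: separate success and failure experience pools, a single network output that drives both pedals, and exploration with normally distributed noise. The sketch below illustrates how these pieces might fit together. It is not the authors' implementation. The class and function names, the `failure_fraction` mixing ratio, the sign convention for the pedal output, and the caller-supplied `failed` flag (standing in for the paper's driving-performance criterion) are all assumptions made for illustration.

```python
import random
from collections import deque


class SplitReplayBuffer:
    """Two FIFO pools: one for successful transitions, one for failures."""

    def __init__(self, capacity=10000, failure_fraction=0.25, seed=0):
        self.success = deque(maxlen=capacity)
        self.failure = deque(maxlen=capacity)
        # Assumed share of each batch drawn from the failure pool.
        self.failure_fraction = failure_fraction
        self.rng = random.Random(seed)

    def add(self, transition, failed):
        # `failed` is decided by the caller (hypothetical stand-in for the
        # driving-performance test mentioned in the abstract).
        (self.failure if failed else self.success).append(transition)

    def sample(self, batch_size):
        # Take up to the requested share from the failure pool and fill
        # the rest from the success pool; the batch may be short if a
        # pool does not hold enough transitions yet.
        n_fail = min(int(batch_size * self.failure_fraction), len(self.failure))
        n_succ = min(batch_size - n_fail, len(self.success))
        batch = self.rng.sample(list(self.failure), n_fail)
        batch += self.rng.sample(list(self.success), n_succ)
        self.rng.shuffle(batch)
        return batch


def split_pedals(u):
    """Map one output u in [-1, 1] to (accelerator, brake).

    Assumed convention: positive values accelerate, negative values brake,
    so both pedals are never pressed at the same time.
    """
    u = max(-1.0, min(1.0, u))
    return (u, 0.0) if u >= 0 else (0.0, -u)


def explore(u, sigma, rng):
    """Add zero-mean Gaussian noise to an action and clip it to [-1, 1]."""
    return max(-1.0, min(1.0, u + rng.gauss(0.0, sigma)))
```

In a training loop, each transition would be stored with `add(transition, failed=...)`, mini-batches would come from `sample(batch_size)`, and the actor's pedal output would pass through `explore` and then `split_pedals` before being sent to the simulator. Driving both pedals from one signed value is one simple way to rule out pressing the accelerator and brake together.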