Computer Engineering and Applications, 2024, Vol. 60, Issue (14): 86-95, 10. DOI: 10.3778/j.issn.1002-8331.2304-0158
改进行为克隆与DDPG的无人驾驶决策模型
Improved Behavioral Cloning and DDPG's Driverless Decision Model
Abstract
The key to driverless technology is that the decision-making layer issues accurate commands based on the information supplied by the perception stage. Reinforcement learning and imitation learning are better suited to complex scenarios than traditional rule-based methods. However, imitation learning, as represented by behavioral cloning, suffers from compounding errors, so this paper uses the prioritized experience replay algorithm to improve behavioral cloning and strengthen the model's ability to fit the demonstration dataset. The original DDPG (deep deterministic policy gradient) algorithm explores inefficiently, so experience pool separation and random network distillation (RND) are used to improve the training efficiency of DDPG. The improved algorithms are then trained jointly to reduce useless exploration in the early stage of DDPG training. Verified on the TORCS (The Open Racing Car Simulator) simulation platform, the experimental results show that, within the same number of training iterations, the proposed method learns more stable road-keeping, speed-keeping, and obstacle-avoidance abilities.
Keywords
unmanned driving/reinforcement learning/imitation learning/decision algorithm/The Open Racing Car Simulator (TORCS)
Classification
Information Technology and Security Science
Cite this article
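Li Weidong, Huang Zhenzhu, He Jingwu, Ma Caoyuan, Ge Cheng. Improved Behavioral Cloning and DDPG's Driverless Decision Model [J]. Computer Engineering and Applications, 2024, 60(14): 86-95, 10.
Funding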
Liaoning Province Major Science and Technology Innovation Project (ZX20220560).