Decision making based on continual reinforcement learning for autonomous racing

Niu Jingyu (牛京玉), Hu Yu (胡瑜), Li Wei (李玮), Han Yinhe (韩银和)

Chinese High Technology Letters (高技术通讯), 2024, Vol. 34, Issue 1: 1-14 (14 pages). DOI: 10.3772/j.issn.1002-0470.2024.01.001

Niu Jingyu¹, Hu Yu¹, Li Wei², Han Yinhe¹

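Author information

  • 1. Research Center for Intelligent Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049
  • 2. Research Center for Intelligent Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190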

Abstract

The variety of road shapes and materials presents a serious decision-making challenge for high-speed autonomous racing. To address the dynamics gap between different roads, a decision-making algorithm based on continual reinforcement learning (CRL) is proposed. These roads are treated as different tasks. The first training stage of the algorithm extracts low-dimensional task features that characterize the vehicle dynamics on different roads. These features are used to compute the task similarity. The second training stage of the algorithm provides two CRL constraints for policy learning. One is the weight regularization constraint, which restricts the updates of policy weights that are important for old tasks. This restriction is adaptively regulated by task similarity. The other is the reward constraint, which encourages no performance degradation on old tasks while the policy is learning a new task. Racing experiments with different task sequences and CRL metrics are set up to evaluate the algorithm. The results show that the proposed algorithm outperforms the baselines without storing old tasks' data or expanding the policy network size.
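
The weight regularization constraint described in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration only: the use of cosine similarity between task-feature vectors, the quadratic form of the penalty, the rule that the penalty strength is `base_strength * (1 - similarity)`, and the names `cosine_similarity`, `regularization_penalty`, `importance`, and `base_strength`. It is not the authors' method, only one plausible shape for a similarity-regulated weight penalty.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two task-feature vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def regularization_penalty(weights, old_weights, importance, similarity,
                           base_strength=1.0):
    """Quadratic penalty on changing weights that mattered for an old task.

    Assumption (not from the paper): the penalty strength shrinks linearly
    as the new task becomes more similar to the old one.
    """
    strength = base_strength * (1.0 - similarity)
    return strength * sum(
        imp * (w - w_old) ** 2
        for w, w_old, imp in zip(weights, old_weights, importance)
    )


# Hypothetical task features for two roads and a two-weight policy.
sim = cosine_similarity([1.0, 0.0], [1.0, 0.0])
penalty = regularization_penalty(
    weights=[1.0, 2.0],
    old_weights=[0.0, 2.0],
    importance=[2.0, 5.0],
    similarity=0.5,
)
```

In a training loop, a term of this kind would be added to the policy loss for the new task; the direction in which similarity should scale the penalty is a design choice that the abstract does not specify.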

Key words

reinforcement learning (RL) / continual learning / decision making / autonomous racing / dynamics feature extraction

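Cite this article

Niu Jingyu, Hu Yu, Li Wei, Han Yinhe. Decision making based on continual reinforcement learning for autonomous racing [J]. Chinese High Technology Letters, 2024, 34(1): 1-14.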

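Funding

Supported by the National Natural Science Foundation of China (62176250, 62003323) and the Innovation Project of the State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences (CARCH5203, CARCH5406).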

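Chinese High Technology Letters (高技术通讯)

Open access; indexed in the Peking University Core Journals list and CSTPCD

ISSN 1002-0470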
