基于强化学习的自动驾驶车辆路上突发障碍物换道避障控制算法

姚恩建 陈卓利 郝赫 陈荣升 杨扬

Journal of Beijing Jiaotong University (北京交通大学学报), 2025, Vol. 49, Issue 5: 82-93, 12. DOI: 10.11860/j.issn.1673-0291.20250149

Lane-change obstacle-avoidance control for autonomous vehicles under unexpected on-road obstacles based on reinforcement learning

姚恩建¹, 陈卓利¹, 郝赫¹, 陈荣升¹, 杨扬¹

Author information

  • 1. School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China

Abstract

This study addresses lane-change obstacle avoidance for autonomous vehicles under sudden road hazards and proposes SafeLC-DelayDDPG, a vehicle control algorithm based on Deep Reinforcement Learning (DRL). The task is formulated as a Markov Decision Process (MDP), and a structured hybrid state space is constructed by integrating local observations, lane-level semantic information, and the ego vehicle's global states to enhance environmental perception and risk sensitivity. The action space consists of continuous front-wheel steering angle and longitudinal acceleration. The reward function is centered on a two-dimensional time-to-collision (2D-TTC) metric, balancing safety, efficiency, comfort, and traffic-rule compliance, and employs a TTC-conditioned dynamic weighting mechanism that prioritizes safety under high risk and efficiency under low risk. Furthermore, delayed policy updates and target policy smoothing are introduced, and the Critic network loss is refined to mitigate the training instability and Q-value overestimation issues inherent in Deep Deterministic Policy Gradient (DDPG). The proposed method is validated through traffic simulations across diverse scenarios. Experimental results show that, compared with multiple baseline algorithms, SafeLC-DelayDDPG achieves superior safety and efficiency: during training, the first-attempt and consecutive obstacle-avoidance success rates improve by up to 17.9% and 60.5%, respectively; the safety metric by up to 7.6%; and the average speed by up to 2.1%. In cross-scenario tests, the first-attempt and consecutive success rates improve by up to 13.3% and 44.1%, the safety metric by up to 9.8%, and the average speed by up to 0.6%.
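
As a rough illustration of three mechanisms named in the abstract, the Python sketch below shows (a) a TTC-conditioned weighting that shifts reward emphasis from safety toward efficiency as TTC grows, (b) target policy smoothing, and (c) a delayed actor-update schedule, the latter two in the form commonly associated with TD3-style training. The abstract gives no formulas, so everything concrete here is an assumption: the logistic shape of the weighting, the function names `ttc_weights`, `smoothed_target_action`, and `should_update_actor`, the parameters `ttc_crit`, `sharpness`, `sigma`, `noise_clip`, and `policy_delay`, and the normalization of both actions (steering, acceleration) to [-1, 1] are hypothetical and are not taken from the paper.

```python
import numpy as np


def ttc_weights(ttc, ttc_crit=3.0, sharpness=2.0):
    """Return (w_safety, w_efficiency), which always sum to 1.

    Hypothetical logistic schedule: a small TTC (high risk) pushes the
    weight toward safety, a large TTC (low risk) toward efficiency.
    """
    w_eff = 1.0 / (1.0 + np.exp(-sharpness * (ttc - ttc_crit)))
    return 1.0 - w_eff, w_eff


def smoothed_target_action(mu_target, rng, sigma=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0):
    """Target policy smoothing: perturb the target actor's output with
    clipped Gaussian noise, then clip the result to the action bounds."""
    mu_target = np.asarray(mu_target, dtype=float)
    noise = rng.normal(0.0, sigma, size=mu_target.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(mu_target + noise, low, high)


def should_update_actor(step, policy_delay=2):
    """Delayed policy update: refresh the actor only on every
    `policy_delay`-th critic update."""
    return step % policy_delay == 0


rng = np.random.default_rng(0)
w_safe, w_eff = ttc_weights(1.0)          # high-risk case
mu = np.array([0.95, -0.2])               # assumed [steering, acceleration]
a_target = smoothed_target_action(mu, rng)
```

In a full training loop the smoothed action would be fed to the target Critic when forming the bootstrap target, and `should_update_actor` would gate the actor and target-network updates; how the paper refines the Critic loss is not stated in the abstract and is not modeled here.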

Key words

autonomous vehicles/unexpected on-road obstacles/deep reinforcement learning/2D-TTC/lane-change obstacle-avoidance control

Classification

Traffic engineering

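Cite this article

姚恩建, 陈卓利, 郝赫, 陈荣升, 杨扬. Lane-change obstacle-avoidance control for autonomous vehicles under unexpected on-road obstacles based on reinforcement learning [J]. Journal of Beijing Jiaotong University, 2025, 49(5): 82-93, 12.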

Funding

National Key R&D Program of China (2023YFB4302703)

Journal of Beijing Jiaotong University

Open access; Peking University Core Journal

ISSN 1673-0291
