Journal of Chongqing University of Technology, 2025, Vol. 39, Issue (17): 45-52, 8. DOI: 10.3969/j.issn.1674-8425(z).2025.09.006
Multi-agent reinforcement learning SAG-MAPPO in safe and cooperative decision-making in ramp merging scenarios
Abstract
In ramp merging scenarios where connected and autonomous vehicles (CAVs) share the road with human-driven vehicles (HDVs), multi-vehicle cooperative decision-making raises safety and efficiency problems that stem from partial observability and the uncertainty of a dynamic environment. This paper proposes SAG-MAPPO, a multi-agent proximal policy optimization algorithm that incorporates temporal memory and safety constraints. First, the ramp cooperative decision-making scenario is modeled as a decentralized partially observable Markov decision process (Dec-POMDP). A gated recurrent unit (GRU) processes historical vehicle state information to mitigate the policy instability caused by partial observability. Then, a dual-layer safety mechanism is designed that combines hard rule constraints with dynamic behavior prediction and screens out dangerous actions in real time, so that the decision outputs remain safe. Simulation results show that SAG-MAPPO achieves faster convergence, higher cumulative policy rewards, and higher average speeds in ramp merging scenarios of varying traffic density, which demonstrates the effectiveness of the algorithm in complex dynamic environments.
Keywords
ramp merging/autonomous driving/deep reinforcement learning/multi-agent proximal policy optimization/decision-making
Classification
Traffic engineering
Cite this article
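ZHANG Shupei, PANG Ying, SUN Pengju, ZHANG Wei, WANG Lingde. Multi-agent reinforcement learning SAG-MAPPO in safe and cooperative decision-making in ramp merging scenarios [J]. Journal of Chongqing University of Technology, 2025, 39(17): 45-52, 8.
Funding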
National Natural Science Foundation of China (52272366)
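
Illustrative sketch
The abstract describes a dual-layer safety mechanism that combines hard rule constraints with dynamic behavior prediction to screen dangerous actions before they are executed. The sketch below shows one way such a two-stage filter could look. It is not the authors' implementation: the action set, the lane layout, the thresholds (`SPEED_LIMIT`, `MIN_GAP`, `HORIZON`, `DT`), the constant-velocity prediction of neighboring vehicles, and the "decelerate" fallback are all assumptions made for illustration, since the abstract gives none of these details.

```python
from dataclasses import dataclass

# Hypothetical discrete action set; the abstract does not specify the action space.
ACTIONS = ("keep", "accelerate", "decelerate", "merge_left")
# Assumed longitudinal accelerations (m/s^2) for each action.
ACCEL = {"keep": 0.0, "accelerate": 2.0, "decelerate": -3.0, "merge_left": 0.0}

SPEED_LIMIT = 30.0  # m/s, assumed
MIN_GAP = 10.0      # minimum longitudinal gap in metres, assumed
HORIZON = 2.0       # prediction horizon in seconds, assumed
DT = 0.5            # prediction step in seconds, assumed


@dataclass
class Vehicle:
    pos: float    # longitudinal position (m)
    speed: float  # speed (m/s)
    lane: int     # 0 = ramp, 1 = main lane (assumed layout)


def passes_hard_rules(ego, action):
    """Layer 1: static rule checks that need no prediction."""
    if action == "merge_left" and ego.lane != 0:
        return False  # only a ramp vehicle can merge
    if action == "accelerate" and ego.speed >= SPEED_LIMIT:
        return False  # never exceed the speed limit
    return True


def passes_prediction(ego, action, others):
    """Layer 2: roll the scene forward and reject actions that close the gap."""
    target_lane = 1 if action == "merge_left" else ego.lane
    pos, speed = ego.pos, ego.speed
    for k in range(1, int(HORIZON / DT) + 1):
        speed = max(0.0, speed + ACCEL[action] * DT)
        pos += speed * DT
        for other in others:
            if other.lane != target_lane:
                continue
            # Neighbours are predicted with a constant-velocity model (assumption).
            other_pos = other.pos + other.speed * k * DT
            if abs(other_pos - pos) < MIN_GAP:
                return False
    return True


def safe_actions(ego, others):
    """Return the actions that survive both layers, in ACTIONS order."""
    return [a for a in ACTIONS
            if passes_hard_rules(ego, a) and passes_prediction(ego, a, others)]


def select_action(scores, ego, others):
    """Pick the highest-scoring safe action; fall back to braking if none is safe."""
    candidates = safe_actions(ego, others)
    if not candidates:
        return "decelerate"  # assumed conservative fallback
    return max(candidates, key=lambda a: scores[a])
```

In this sketch, `scores` stands in for the policy's per-action preferences (for example, the logits of a MAPPO actor). The filter only removes candidates and leaves the ranking of the remaining actions to the learned policy, which is the usual reason for placing a safety layer between the policy output and the environment.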