A SAG-MAPPO Multi-Agent Reinforcement Learning Method for Safe Cooperative Decision-Making in Ramp Merging Scenarios

Zhang Shupei (张树培), Pang Ying (庞莹), Sun Pengju (孙朋举), Zhang Wei (张玮), Wang Lingde (王玲德)

Journal of Chongqing University of Technology, 2025, Vol. 39, Issue (17): 45-52, 8.

DOI: 10.3969/j.issn.1674-8425(z).2025.09.006

Multi-agent reinforcement learning SAG-MAPPO in safe and cooperative decision-making in ramp merging scenarios

Author information

1. School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013

Abstract

In ramp merging scenarios involving connected and autonomous vehicles (CAVs) and human-driven vehicles (HDVs), the safety and efficiency of multi-vehicle cooperative decision-making are of paramount importance. Both are challenged by partial observability and by uncertainty in the dynamic environment. This paper proposes a multi-agent proximal policy optimization algorithm based on temporal memory and safety constraints, termed SAG-MAPPO. First, a decentralized partially observable Markov decision process (Dec-POMDP) model is built for the ramp cooperative decision-making scenario. A gated recurrent unit (GRU) is incorporated to process historical vehicle state information, mitigating the policy instability caused by partial observability of the environment. Then, a dual-layer safety mechanism is designed that integrates hard rule constraints with dynamic behavior prediction, screening out dangerous actions in real time to ensure the safety of decision outputs. Simulation results show that SAG-MAPPO achieves faster convergence, higher cumulative policy rewards, and higher average speeds in ramp merging scenarios with varied traffic densities, demonstrating the effectiveness of the algorithm in complex dynamic environments.

Key words

ramp merging/autonomous driving/deep reinforcement learning/multi-agent proximal policy optimization/decision-making

Classification

Traffic engineering

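Cite this article

Zhang Shupei, Pang Ying, Sun Pengju, Zhang Wei, Wang Lingde. Multi-agent reinforcement learning SAG-MAPPO in safe and cooperative decision-making in ramp merging scenarios [J]. Journal of Chongqing University of Technology, 2025, 39(17): 45-52, 8.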

Funding

National Natural Science Foundation of China (52272366)

Journal of Chongqing University of Technology

Open access; Peking University Core Journal

ISSN: 1674-8425
