Coordinated Variable Speed Limit Control for Freeway Based on Multi-Agent Deep Reinforcement Learning

余荣杰 徐灵 章锐辞

Journal of Tongji University (Natural Science), 2024, Vol. 52, Issue 7: 1089-1098 (10 pages). DOI: 10.11908/j.issn.0253-374x.22441

Coordinated Variable Speed Limit Control for Freeway Based on Multi-Agent Deep Reinforcement Learning

余荣杰¹, 徐灵², 章锐辞¹

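Author information

  • 1. Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University, Shanghai 201804, China
  • 2. Zhejiang Hangshaoyong Expressway Co., Ltd., Hangzhou 310000, Zhejiang, China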

Abstract

To meet the need for coordinated variable speed limit (VSL) control across multiple freeway segments, and to train efficiently in a high-dimensional parameter space, a multi-agent deep deterministic policy gradient (MADDPG) algorithm is proposed for freeway VSL control. Unlike existing research based on the single-agent deep deterministic policy gradient (DDPG) algorithm, MADDPG abstracts each control unit as an agent with an Actor-Critic reinforcement learning architecture and shares the state and action information of all agents during training. This gives each agent the ability to infer the control strategies of the other agents, thereby realizing coordinated multi-segment control. The proposed control method is verified in a typical freeway traffic jam scenario using the open-source simulation software SUMO. The experimental results show that the proposed MADDPG algorithm reduces the traffic jam duration by 69.23% and the speed standard deviation by 47.96%, which significantly improves traffic efficiency and safety. Compared with the single-agent DDPG algorithm, MADDPG saves 50% of the training time and increases the cumulative return by 7.44%, indicating that the multi-agent algorithm improves the optimization efficiency of the coordinated control strategy. Further, to verify the necessity of sharing information among agents, MADDPG is compared with the independent DDPG (IDDPG) algorithm: MADDPG improves the traffic jam duration and the speed standard deviation by 11.65% and 19.00%, respectively.
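The information flow described in the abstract (one actor per control unit acting on local state, with a critic that sees the states and actions of all agents during training) can be illustrated with a minimal sketch. This is not the authors' implementation: the number of segments, the state size, the speed-limit range, the linear function approximators, and all names (`Actor`, `CentralCritic`, `N_AGENTS`, `STATE_DIM`, `V_MIN`, `V_MAX`) are assumptions chosen only for illustration, and no training step is shown.

```python
import math
import random

random.seed(0)

# All constants below are hypothetical; the abstract does not specify them.
N_AGENTS = 3                 # number of VSL-controlled segments
STATE_DIM = 4                # size of each segment's local observation
V_MIN, V_MAX = 60.0, 120.0   # assumed speed-limit range in km/h


def random_weights(n):
    """Small random weights for an untrained linear model."""
    return [random.gauss(0.0, 0.1) for _ in range(n)]


def dot(xs, ws):
    return sum(x * w for x, w in zip(xs, ws))


class Actor:
    """Decentralized actor: maps one segment's local state to a speed limit."""

    def __init__(self, state_dim):
        self.w = random_weights(state_dim)

    def act(self, local_state):
        squashed = math.tanh(dot(local_state, self.w))  # value in [-1, 1]
        return V_MIN + (squashed + 1.0) * 0.5 * (V_MAX - V_MIN)


class CentralCritic:
    """Centralized critic: scores the joint state and joint action of all agents."""

    def __init__(self, n_agents, state_dim):
        self.w = random_weights(n_agents * (state_dim + 1))

    def q_value(self, all_states, all_actions):
        # Concatenate every agent's state and (scaled) action into one input.
        joint = [x for s in all_states for x in s] + [a / V_MAX for a in all_actions]
        return dot(joint, self.w)


actors = [Actor(STATE_DIM) for _ in range(N_AGENTS)]
critics = [CentralCritic(N_AGENTS, STATE_DIM) for _ in range(N_AGENTS)]

# Each actor sees only its own segment; each critic sees everything.
states = [[random.gauss(0.0, 1.0) for _ in range(STATE_DIM)] for _ in range(N_AGENTS)]
actions = [actor.act(state) for actor, state in zip(actors, states)]
q_values = [critic.q_value(states, actions) for critic in critics]
```

Because each critic's input includes the other agents' actions, its value estimate changes when a neighbouring segment changes its speed limit, which is the property the comparison with independent DDPG (IDDPG) is meant to test. In an independent setup, under the same assumptions, each critic would receive only its own agent's state and action.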

Key words

traffic engineering / coordinated variable speed limit control / multi-agent deep reinforcement learning / traffic jam / freeway / traffic efficiency / traffic safety

Classification

Traffic engineering

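Cite this article

余荣杰, 徐灵, 章锐辞. Coordinated Variable Speed Limit Control for Freeway Based on Multi-Agent Deep Reinforcement Learning[J]. Journal of Tongji University (Natural Science), 2024, 52(7): 1089-1098.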

Funding

Science and Technology Plan Project of the Zhejiang Provincial Department of Transportation (2021047)

Journal of Tongji University (Natural Science), ISSN 0253-374X. Indexing: OA, 北大核心 (Peking University Core Journals), CSTPCD.
