典型匝道控制场景下深度强化学习决策机理解析 [Open Access; PKU Core Journal; CSTPCD]

Understanding Deep Reinforcement Learning Algorithm in Typical Ramp Metering Scenarios

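Abstract (translated from the Chinese): Taking a typical ramp metering scenario as the object of study, this paper uses the state value function, saliency maps, and input perturbation to understand the decision mechanism of deep reinforcement learning models in traffic control. The state value function is used to judge whether the model can recognize changes in the traffic state. Saliency maps are used to analyze which features of the environment state the model perceives, and what patterns its decision actions follow, under specific environment states. Input perturbation is applied to analyze the ramp metering action match rate and control performance after perturbation, and to identify key regions. The results show that the ramp metering model based on deep reinforcement learning can accurately assess how good or bad a traffic state is, perceive the key features of the traffic state, and make reasonable decisions.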

Abstract (English): This paper examines the decision mechanism of deep reinforcement learning (DRL) in a typical ramp metering scenario. The state value function is used to evaluate whether the DRL model can distinguish changes in the traffic state. The saliency map is used to identify the key state features that the DRL model perceives, and its control patterns, under specific traffic states. Using input perturbation, the action match ratio and control performance under perturbed data are analyzed to identify the key areas for control. The results show that the DRL model can evaluate the traffic state accurately, distinguish the key features, and then make reasonable decisions.
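The two perturbation-based analyses named in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not the authors' model or data: the state is a hypothetical list of eight mainline cell densities, `q_values` is a hand-written stand-in for a trained Q-network that only looks at the three cells nearest the merge, and the noise scale is arbitrary. Under those assumptions, `saliency` scores each cell by how much the Q-values move when that cell is perturbed, and `action_match_ratio` reports the share of states whose greedy action survives perturbation of a chosen region.

```python
import random

# Hypothetical toy setup (not from the paper): a state is a list of
# densities (veh/km) for 8 mainline cells; cells 5-7 sit near the merge.
N_CELLS = 8
ACTIONS = [0, 1]  # 0 = red (hold ramp traffic), 1 = green (release)


def q_values(state):
    """Toy stand-in for a trained Q-network (an assumption).

    Only the density near the merge matters: above 30 veh/km the
    red phase scores higher, below it the green phase does.
    """
    merge_density = sum(state[5:]) / 3.0
    return [merge_density - 30.0, 30.0 - merge_density]


def greedy_action(state):
    """Action with the highest Q-value (ties go to the first action)."""
    q = q_values(state)
    return max(ACTIONS, key=lambda a: q[a])


def perturb(state, cells, rng, scale=10.0):
    """Copy of `state` with Gaussian noise added to the given cells."""
    out = list(state)
    for i in cells:
        out[i] = max(0.0, out[i] + rng.gauss(0.0, scale))
    return out


def saliency(state, rng, n_samples=50):
    """Per-cell score: mean squared change in Q-values when that cell is perturbed."""
    base = q_values(state)
    scores = []
    for i in range(N_CELLS):
        total = 0.0
        for _ in range(n_samples):
            q = q_values(perturb(state, [i], rng))
            total += sum((a - b) ** 2 for a, b in zip(q, base))
        scores.append(total / n_samples)
    return scores


def action_match_ratio(states, cells, rng):
    """Share of states whose greedy action is unchanged after perturbing `cells`."""
    same = sum(
        greedy_action(s) == greedy_action(perturb(s, cells, rng)) for s in states
    )
    return same / len(states)


if __name__ == "__main__":
    rng = random.Random(0)
    states = [[rng.uniform(20.0, 40.0) for _ in range(N_CELLS)] for _ in range(200)]
    print("saliency per cell:", [round(v, 1) for v in saliency(states[0], rng)])
    print("match ratio, upstream cells:", action_match_ratio(states, [0, 1, 2], rng))
    print("match ratio, merge cells:", action_match_ratio(states, [5, 6, 7], rng))
```

In this toy, cells the stand-in model ignores receive zero saliency and leave the action unchanged, whereas perturbing the merge cells flips some actions. A region whose perturbation lowers the match ratio is the kind of "key area" the abstract refers to, although the paper's actual procedure may differ in its perturbation scheme and metrics.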

Authors: 刘冰 (Liu Bing); 唐钰 (Tang Yu); 暨育雄 (Ji Yuxiong); 沈煜 (Shen Yu); 杜豫川 (Du Yuchuan)

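Affiliations: Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University, Shanghai 201804, China; Tandon School of Engineering, New York University, New York 11201, USA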

Classification: Transportation

Keywords (Chinese): 交通工程; 深度强化学习; 可解释机器学习; 匝道控制

Keywords (English): traffic engineering; deep reinforcement learning (DRL); explainable machine learning; ramp metering

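Journal: 《同济大学学报（自然科学版）》 (Journal of Tongji University (Natural Science)), 2024, No. 6

Pages / page count: 928-934, 981 / 8

Funding: Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (19DZ1209100); Zhejiang Provincial Key Research and Development Program (2021C01011)

DOI: 10.11908/j.issn.0253-374x.22418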
