Differential Variable Speed Limit Control Strategy Based on Reinforcement Learning
OA; Peking University Core Journal (北大核心); CSTPCD
Merging vehicles affect the traffic flow in each mainline lane of a highway merging area differently. To account for this, the paper studies a lane-level (differential) variable speed limit (DVSL) control strategy based on reinforcement learning (RL). Because the high-dimensional action space makes the DVSL control problem difficult to solve, the action space is defined in terms of speed limit change values, and a state space and a multi-factor reward function are specified. During training, prioritized experience replay (PER) is used to improve training efficiency and model performance, and an inter-lane safety detection mechanism assists the PER-DDQN training so that the resulting lane-level speed limits remain implementable. The control effect of the strategy is tested in the SUMO simulation software. Compared with the scenario without variable speed limit control, the proposed strategy reduces travel time over the whole section by 41.88% and raises average speed by 5.65%; within the merging area, travel time falls by 66.91% and average speed rises by 43.42%. Under DVSL control, congestion time in each lane of the merging area is markedly shorter and speed changes are smoother. The effect of the penetration rate of connected-automated vehicles (CAVs) on the strategy is also tested. DVSL control clearly outperforms the no-control scenario, particularly at penetration rates below 60%. At penetration rates of 20%, 40%, and 60%, average whole-section travel time is reduced by 41.88%, 13.38%, and 7.46%, and average speed is increased by 6.08%, 2.36%, and 1.61%, respectively. At penetration rates of 80% or higher, the CAVs themselves substantially improve traffic flow, so DVSL control yields no significant further improvement.
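The abstract names three mechanisms without giving their details: an action space built from speed limit change values, prioritized experience replay, and an inter-lane safety check. The sketch below illustrates one plausible reading of each, and is not the authors' implementation. The lane count, step size, speed limit bounds, and maximum gap between adjacent lanes are all hypothetical values chosen for illustration, and the safety rule itself (a cap on the difference between neighbouring lanes' limits) is an assumption. The PER functions follow the standard proportional-prioritization formulas; the Q-network is omitted.

```python
from itertools import product

import numpy as np

# Hypothetical settings; the abstract does not give these values.
N_LANES = 3
DELTAS_KMH = (-10, 0, 10)        # allowed change in a lane's limit per control step
MIN_LIMIT, MAX_LIMIT = 40, 120   # assumed bounds on any posted limit (km/h)
MAX_ADJACENT_GAP = 20            # assumed largest allowed gap between neighbouring lanes

# Joint action space: one speed-limit change per lane (3**3 = 27 actions),
# rather than one absolute limit per lane.
ACTIONS = list(product(DELTAS_KMH, repeat=N_LANES))


def apply_action(limits, action_index):
    """Return the per-lane limits after applying the chosen changes, clipped to bounds."""
    deltas = ACTIONS[action_index]
    return tuple(int(np.clip(v + d, MIN_LIMIT, MAX_LIMIT)) for v, d in zip(limits, deltas))


def is_safe(limits):
    """Assumed inter-lane check: neighbouring lanes may not differ by more than the gap."""
    return all(abs(a - b) <= MAX_ADJACENT_GAP for a, b in zip(limits, limits[1:]))


def safe_action_mask(limits):
    """Boolean mask over ACTIONS marking those that lead to a safe set of limits."""
    return np.array([is_safe(apply_action(limits, i)) for i in range(len(ACTIONS))])


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i**alpha / sum_k p_k**alpha, p_i = |td_i| + eps."""
    priorities = (np.abs(np.asarray(td_errors, dtype=float)) + eps) ** alpha
    return priorities / priorities.sum()


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights (N * P(i))**(-beta), scaled so the largest is 1."""
    weights = (len(probs) * probs) ** (-beta)
    return weights / weights.max()
```

In a training loop, an agent could restrict its greedy or exploratory choice to the entries where `safe_action_mask(current_limits)` is true, and draw replay indices with `np.random.default_rng().choice(n, size=batch, p=per_probabilities(td_errors))`.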
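Bai Ruyu (白如玉); Jiao Pengpeng (焦朋朋); Chen Yue (陈越); Zhang Yao (张瑶)
Beijing Laboratory of General Aviation Technology, Beijing University of Civil Engineering and Architecture, Beijing 100044
Transportation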
intelligent traffic; differential variable speed limit control; control strategy; reinforcement learning; highway merging area; mixed traffic flow
Journal of Transport Information and Safety (《交通信息与安全》), 2024 (001)
pp. 105-114 (10 pages)
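Supported by the National Natural Science Foundation of China (52172301), the National Social Science Fund of China (21ZDA029), and the Beijing Social Science Fund (21GLA010).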