Application Research of Computers, 2024, Vol. 41, Issue (11): 3350-3356, 7. DOI: 10.19734/j.issn.1001-3695.2024.03.0090

Control method of transit signal priority based on coalitional bargaining games and reinforcement learning

Qin Hao (秦浩)¹, Zhang Weishi (张维石)¹

Author information

1. School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026

Abstract

This paper proposes CBQL-TSP, an eight-phase transit signal priority (TSP) control method built on a hybrid decision-making approach that combines cooperative game theory with reinforcement learning, to address the low efficiency of transit signal priority control in intelligent transportation systems. The method abstracts the allocation of bus signal priority as a multi-objective decision-making problem over an eight-phase signal sequence, divides the phase sequence into priority and non-priority phases, and constructs a cooperative game model. A hybrid decision-making algorithm, CBQL, solves the multi-objective decision-making problem of eight-phase signal timing: it computes the marginal contribution of each game member by solving the Shapley value function and constructs the state transition probability equation from the Shapley value ratio. Compared with conventional control methods, CBQL-TSP incorporates the fairness principle of cooperative game theory and the adaptive learning ability of Q-learning, so signal timing can be adjusted dynamically according to real-time traffic conditions, which gives priority signal control greater flexibility and efficiency. A simulation test platform was built to compare the average number of waiting vehicles and the average travel time under different demand levels, and thereby evaluate the stability and efficiency of the method. The experimental results show that the proposed method exhibits higher stability, that the overall average travel time is reduced by approximately 24.57%, and that the average travel time for buses is reduced by about 37.40%. CBQL-TSP thus achieves higher control efficiency at bus-priority intersections, significantly reducing both overall and bus average travel times.
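The abstract does not give the characteristic function of the game or the exact form of the state transition probability equation, so the following is only a minimal Python sketch of the general idea: compute exact Shapley values as average marginal contributions, then normalise them into a ratio that can serve as selection probabilities. The phase names, the `demand` figures, the `coalition_value` function, and the helper names are all hypothetical and are not taken from the paper.

```python
import random
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all player orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the players ahead of it.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def shapley_ratio(phi):
    """Normalise Shapley values so that they sum to 1 (assumes a positive total)."""
    total = sum(phi.values())
    return {p: v / total for p, v in phi.items()}


def sample_next_phase(ratios, rng):
    """Draw one phase with probability proportional to its Shapley ratio."""
    phases = list(ratios)
    weights = [ratios[p] for p in phases]
    return rng.choices(phases, weights=weights, k=1)[0]


# Hypothetical toy game with three phases instead of eight, to keep it readable.
demand = {"bus_priority": 6.0, "through": 3.0, "left_turn": 1.0}


def coalition_value(coalition):
    """Assumed payoff: served demand plus a bonus of 1.0 per cooperating pair."""
    k = len(coalition)
    return sum(demand[p] for p in coalition) + 1.0 * k * (k - 1) / 2


phi = shapley_values(list(demand), coalition_value)
ratios = shapley_ratio(phi)
next_phase = sample_next_phase(ratios, random.Random(0))
print(phi)
print(ratios)
print(next_phase)
```

Enumerating all orderings costs n! evaluations, which is still tractable for eight phases (40,320 orderings) but would need sampling for larger games. How these probabilities feed into the Q-learning update in CBQL is not described in the abstract and is left out of the sketch.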

Key words

transit signal priority (TSP) / coalitional bargaining / reinforcement learning / eight-phase priority signal

Classification

Information technology and security science

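Cite this article

Qin Hao, Zhang Weishi. Control method of transit signal priority based on coalitional bargaining games and reinforcement learning[J]. Application Research of Computers, 2024, 41(11): 3350-3356, 7.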

Funding

Key funded project of Dalian Public Transport Group ()

Application Research of Computers

OA / Peking University Core Journal / CSTPCD

ISSN: 1001-3695
