Acta Electronica Sinica, 2025, Vol. 53, Issue 6: 1885-1905 (21 pages). DOI: 10.12263/DZXB.20240980
A Multi-Agent Graph Reinforcement Learning Method for Many-to-Many Communication Routing in SDWN
Abstract
The many-to-many communication routing problem is an NP-hard (nondeterministic polynomial-time hard) combinatorial optimization problem. Constructing efficient many-to-many routing paths requires timely acquisition of global network state information so that routing can adapt to highly dynamic network conditions. Working in the context of software-defined wireless networks (SDWN), this paper addresses several shortcomings of existing data-driven multi-agent deep reinforcement learning methods: high computational and deployment costs, difficulty in adapting to the non-Euclidean structure of network topologies, an excess of invalid actions during training that increases storage and time overheads, and slow convergence. The paper designs a new framework for collaborative sensing and intelligent decision-making between the SDN control plane and data plane, and proposes a two-stage multi-agent routing method (Multi-Agent Graph deep reinforcement learning method based on intelligent node Deployment Strategy, MAGDS-M2M) for the many-to-many communication routing problem. In the first stage, to avoid the computational and deployment costs of placing an agent on every node, a Q-learning-based intelligent node deployment algorithm determines the network nodes on which agents need to be deployed. In the second stage, once the agents are deployed, a many-to-many routing decision method based on multi-agent graph reinforcement learning is developed within the actor-critic (AC) framework. This method redesigns the actor and critic networks using graph convolutional networks (GCN) and graph neural networks (GNN), addressing the weak adaptability of convolutional neural networks (CNN) to topology-structured data in existing multi-agent reinforcement learning approaches. In addition, because the fixed-length action space of the actor network generates a large number of invalid actions during training, a new local observation method for the action space is proposed. Experimental results show that the proposed method reduces task completion delay by 29.33% compared with the benchmark experiments, and that adjusting parameters allows a balance to be struck between task completion delay and the standard deviation of cumulative energy consumption across nodes. The source code developed in this work is available at https://github.com/GuetYe/MAGDS-M2M.
Keywords
many-to-many communication / intelligent node deployment / multi-agent graph reinforcement learning / action space local observation method / software-defined wireless networks
Classification
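Information Technology and Security Science
Cite this article
Wen Peng, Ye Miao, Wang Yong, He Qian, Qiu Hongbing. A Multi-Agent Graph Reinforcement Learning Method for Many-to-Many Communication Routing in SDWN[J]. Acta Electronica Sinica, 2025, 53(6): 1885-1905 (21 pages).
Funding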
National Natural Science Foundation of China (No. 62161006, No. 62372353)
The Project of Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory (No. AD25069102)
Innovation Project of Guangxi Graduate Education (No. YCBZ2023134)
Director's Fund of the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (Guilin University of Electronic Technology) (No. CRKL220103)