
Multi-Agent Reinforcement Learning Value Function Factorization Approach Based on Graph Neural Network


Computer Engineering, 2024, Vol. 50, Issue (5): 62-70, 9. DOI: 10.19678/j.issn.1000-3428.0067919


孙文洁¹, 李宗民², 孙浩淼¹

Author Information

  • 1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China
  • 2. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China; School of Big Data and Basic Science, Shandong Institute of Petroleum and Chemical Technology, Dongying 257061, Shandong, China

Abstract

Cooperative behavior among agents under partial observability is a central problem in Multi-Agent Reinforcement Learning (MARL). Value function factorization approaches solve the credit assignment problem and effectively achieve cooperation among agents. However, existing value function factorization approaches rely only on individual value functions computed from local information and allow no explicit information exchange between agents, making them unsuitable for complex scenarios. To address this problem, this study introduces communication into the value function factorization approach to provide agents with effective nonlocal information that helps them understand complex environments. Furthermore, unlike existing communication approaches, the proposed approach uses a multi-layer message-passing architecture based on a Graph Neural Network (GNN), which extracts the useful information that neighboring agents need to exchange. The model also realizes the transition from no communication to full communication and achieves global cooperation under a limited communication range, making it suitable for real-world applications in which the communication range is constrained. Experiments in the StarCraft II Multi-Agent Challenge (SMAC) and Predator-Prey (PP) environments demonstrate that, in four different SMAC scenarios, the average winning rate of the proposed approach improves by 2-40 percentage points over baseline algorithms such as QMIX and VBC. Furthermore, the proposed approach effectively solves the PP problem in non-monotonic environments.
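The abstract combines two mechanisms: GNN message passing over a limited communication graph (stacking k layers spreads information k hops, bridging "no communication" and "full communication"), and QMIX-style monotonic mixing of individual values into a joint value. The following is a minimal numpy sketch of those two ideas only, not the paper's implementation: the function names (`gnn_layer`, `qmix_style_mix`), the chain-shaped communication graph, and the random matrices standing in for learned network weights are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, feat = 4, 8

# Limited communication range: agents form a chain and only hear 1-hop neighbours.
adj = np.zeros((n_agents, n_agents))
for i in range(n_agents - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

def gnn_layer(h, adj, W_self, W_nbr):
    """One message-passing layer: combine each agent's features with the
    mean of its neighbours' features, then apply a ReLU nonlinearity."""
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    nbr_mean = (adj @ h) / deg
    return np.maximum(h @ W_self + nbr_mean @ W_nbr, 0.0)

def qmix_style_mix(q_individual, w, b):
    """Monotonic mixing: non-negative weights (|w|) guarantee
    dQ_tot/dQ_i >= 0, the QMIX-style credit-assignment constraint."""
    return float(np.abs(w) @ q_individual + b)

# Random features and weights stand in for observations and learned parameters.
h = rng.normal(size=(n_agents, feat))
layers = [(0.1 * rng.normal(size=(feat, feat)),
           0.1 * rng.normal(size=(feat, feat))) for _ in range(3)]

# Stacking 3 layers lets information travel 3 hops, so even with a 1-hop
# communication range every agent on this 4-agent chain hears every other.
for W_self, W_nbr in layers:
    h = gnn_layer(h, adj, W_self, W_nbr)

w_out = rng.normal(size=feat)
q_i = h @ w_out                     # individual utilities after communication
w_mix = rng.normal(size=n_agents)
q_tot = qmix_style_mix(q_i, w_mix, b=0.0)
```

The monotonicity constraint is what makes decentralized execution consistent with centralized training: raising any one agent's individual value can never lower the joint value.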


Key words

deep reinforcement learning; multi-agent environment; agent cooperation; agent communication; Graph Neural Network (GNN)

Classification

Information Technology and Security Science

Cite This Article

孙文洁, 李宗民, 孙浩淼. Multi-Agent Reinforcement Learning Value Function Factorization Approach Based on Graph Neural Network[J]. Computer Engineering, 2024, 50(5): 62-70, 9.

Funding

National Key Research and Development Program of China (2019YFF0301800)

National Natural Science Foundation of China (61379106)

Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011)

Computer Engineering

OA | Peking University Core Journal | CSTPCD

ISSN 1000-3428
