
改进MADDPG多智能体的方法


黑龙江科技大学学报 2025, Vol. 35, Issue (1): 160-165, 172. DOI: 10.3969/j.issn.2095-7262.2025.01.024


Methodology for improving MADDPG algorithm

宁姗 1, 赵秋多 2, 丁毓龙 2, 郭嘉承 2

Author information

  • 1. School of Innovation and Entrepreneurship, Heilongjiang University of Science and Technology, Harbin 150022, China
  • 2. School of Electronic and Information Engineering, Heilongjiang University of Science and Technology, Harbin 150022, China

Abstract

This paper addresses the environmental instability faced by multi-agent systems by sharing observation information and historical experiences within the multi-agent deep deterministic policy gradient (MADDPG) framework, and proposes the IMMADDPG algorithm as an improvement of MADDPG. The study analyzes and improves the network structure to effectively reduce the influence of environmental instability and value-function overestimation on the trained policy. Tests were conducted in two different environments. In the Cooperative Navigation environment, the probability of agents reaching their targets with IMMADDPG is 3.7% higher than with MADDPG. In the Predator-Prey environment, which involves both cooperation and competition, the strategy trained with IMMADDPG yields an average of 5.79 prey captures by predator agents and an average of 2.23 arrivals of prey agents at their target landmarks, compared with averages of 4.82 captures and 1.76 arrivals for MADDPG. IMMADDPG thus demonstrates better performance than MADDPG in multi-agent deep reinforcement learning environments.
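The value-function overestimation mentioned in the abstract can be illustrated with a minimal numerical sketch. This is not the authors' IMMADDPG code; it only demonstrates one widely used countermeasure, clipped double-Q learning, in which the minimum of two independently noisy critic estimates is deliberately pessimistic and therefore offsets the optimistic bias that maximization over noisy values introduces:

```python
import numpy as np

# Hypothetical setup: two critics estimate the same true action value with
# independent Gaussian noise. A single critic is unbiased on average, but
# taking the min of two critics yields a pessimistic (lower) estimate,
# which is the mechanism clipped double-Q methods use to curb overestimation.
rng = np.random.default_rng(0)

true_q = 1.0      # true action value (synthetic)
noise = 0.5       # critic estimation noise std (synthetic)
trials = 10_000

q1 = true_q + rng.normal(0.0, noise, size=trials)
q2 = true_q + rng.normal(0.0, noise, size=trials)

single_estimate = q1.mean()                    # unbiased single critic
clipped_estimate = np.minimum(q1, q2).mean()   # pessimistic min of two critics

print(f"single critic mean estimate: {single_estimate:.3f}")
print(f"clipped double-Q estimate:   {clipped_estimate:.3f}")
```

The min-of-two estimate sits below the single-critic mean by roughly the noise scale, which is why such pessimism is effective against the value inflation described in the abstract.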

Key words

deep reinforcement learning / multi-agent collaboration / multi-agent competition / centralized training / decentralized execution

Category

Information Technology and Security Science

Cite this article

宁姗, 赵秋多, 丁毓龙, 郭嘉承. 改进MADDPG多智能体的方法[J]. 黑龙江科技大学学报, 2025, 35(1): 160-165, 172.

黑龙江科技大学学报 (ISSN 2095-7262)
