Journal of Heilongjiang University of Science and Technology, 2025, Vol. 35, Issue 1: 160-165, 172. DOI: 10.3969/j.issn.2095-7262.2025.01.024
Methodology for improving MADDPG algorithm
Ning Shan 1, Zhao Qiuduo 2, Ding Yulong 2, Guo Jiacheng 2
Author information
- 1. School of Innovation and Entrepreneurship, Heilongjiang University of Science and Technology, Harbin 150022, China
- 2. School of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin 150022, China
Abstract
This paper addresses the instability of multi-agent environments by sharing observation information and historical experience within the multi-agent deep deterministic policy gradient (MADDPG) framework, and proposes an improved algorithm, IMMADDPG. The network structure is analyzed and modified to reduce the influence of environment non-stationarity and value-function overestimation on the trained policy. Experiments are conducted in two environments. In the Cooperative Navigation environment, agents trained with IMMADDPG reach their targets with a probability 3.7% higher than agents trained with MADDPG. In the Predator-Prey environment, which involves both cooperation and competition, predator agents trained with IMMADDPG capture prey an average of 5.79 times and prey agents reach their target landmarks an average of 2.23 times, versus 4.82 and 1.76 times respectively under MADDPG. IMMADDPG thus outperforms MADDPG in multi-agent deep reinforcement learning environments.
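The centralized-training, decentralized-execution structure that MADDPG builds on can be sketched as below. All names, network shapes, and sizes here are illustrative assumptions, not the paper's actual architecture; the twin-critic minimum is one common way to curb Q-value overestimation (the clipped double-Q idea from TD3) and merely stands in for whatever structural change IMMADDPG makes.

```python
import numpy as np

rng = np.random.default_rng(0)

class Actor:
    """Decentralized actor: acts from its own local observation only."""
    def __init__(self, obs_dim, act_dim):
        self.W = rng.normal(size=(act_dim, obs_dim)) * 0.1

    def act(self, obs):
        return np.tanh(self.W @ obs)  # deterministic, bounded action

class CentralCritic:
    """Centralized critic: during training it sees the joint observations and
    actions of ALL agents, which counters the non-stationarity each agent
    would otherwise face."""
    def __init__(self, joint_dim):
        self.w1 = rng.normal(size=joint_dim) * 0.1
        self.w2 = rng.normal(size=joint_dim) * 0.1  # twin head (illustrative)

    def q_value(self, joint_obs, joint_act):
        x = np.concatenate([joint_obs, joint_act])
        # Min over two estimates (TD3-style clipped double-Q) reduces
        # overestimation; a hypothetical stand-in for the paper's mechanism.
        return float(min(self.w1 @ x, self.w2 @ x))

# Hypothetical sizes: 3 agents, 4-dim observations, 2-dim actions.
n_agents, obs_dim, act_dim = 3, 4, 2
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
critic = CentralCritic(n_agents * (obs_dim + act_dim))

obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]
acts = [a.act(o) for a, o in zip(actors, obs)]                  # execution: local obs only
q = critic.q_value(np.concatenate(obs), np.concatenate(acts))   # training: global view
```

At execution time each agent needs only its own observation; the critic (and its global view) is used only during training, which is what "centralized training, decentralized execution" refers to in the keywords.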
Key words
deep reinforcement learning / multi-agent collaboration / multi-agent competition / centralized training / decentralized execution
Classification
Information technology and security science
Cite this article
Ning S, Zhao Q D, Ding Y L, Guo J C. Methodology for improving MADDPG algorithm[J]. Journal of Heilongjiang University of Science and Technology, 2025, 35(1): 160-165, 172.