
Multi-Agent Single-Goal Collaborative Exploration in Unknown Environments with an Improved MADDPG Algorithm

韩慧妍¹, 石树熙¹, 况立群¹, 韩燮¹, 熊风光¹

Computer Engineering and Applications, 2025, Vol. 61, Issue 22: 320-328. DOI: 10.3778/j.issn.1002-8331.2408-0131




Author information

  • 1. School of Computer Science and Technology, North University of China, Taiyuan 030051, China; Shanxi Province Key Laboratory of Machine Vision and Virtual Reality, Taiyuan 030051, China; Shanxi Province Engineering Research Center of Visual Information Processing and Intelligent Robot, Taiyuan 030051, China

Abstract

To address the inefficiency of the multi-agent deep deterministic policy gradient (MADDPG) algorithm in unknown environments, a new multi-agent deep reinforcement learning algorithm called RE-MADDPG-C is proposed. The algorithm uses residual networks (ResNet) to alleviate gradient vanishing and explosion, improving convergence speed. To tackle the convergence difficulty caused by sparse rewards in single-goal exploration of unknown environments, a multi-agent intrinsic curiosity module (ICM) is introduced: the curiosity reward serves as intrinsic motivation, giving agents an additional exploration incentive. By designing a suitable exploration reward function, agents can accomplish single-goal tasks in unknown environments. Simulation results show that the proposed algorithm achieves faster reward growth during training and completes exploration tasks quickly. Compared with MADDPG and other baseline algorithms, it reduces training time and achieves a higher global average reward.
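
The abstract's first contribution is architectural: residual connections in the policy/value networks to ease gradient flow. The paper's code is not reproduced here, so the block below is only a minimal PyTorch sketch, assuming hypothetical layer widths and the illustrative names ResidualBlock and ResActor; it shows how an identity shortcut around two fully connected layers could form the hidden trunk of an MADDPG actor.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Two fully connected layers with an identity shortcut; the skip path
        lets gradients bypass the nonlinear transform, mitigating vanishing
        gradients in deeper actor/critic networks."""
        def __init__(self, dim: int):
            super().__init__()
            self.fc1 = nn.Linear(dim, dim)
            self.fc2 = nn.Linear(dim, dim)
            self.act = nn.ReLU()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            out = self.act(self.fc1(x))
            out = self.fc2(out)
            return self.act(out + x)  # identity shortcut

    class ResActor(nn.Module):
        """MADDPG-style deterministic actor whose hidden trunk is built from
        residual blocks (hypothetical sizes, not the authors' architecture)."""
        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64, n_blocks: int = 2):
            super().__init__()
            self.inp = nn.Linear(obs_dim, hidden)
            self.blocks = nn.Sequential(*[ResidualBlock(hidden) for _ in range(n_blocks)])
            self.out = nn.Linear(hidden, act_dim)

        def forward(self, obs: torch.Tensor) -> torch.Tensor:
            h = torch.relu(self.inp(obs))
            return torch.tanh(self.out(self.blocks(h)))  # continuous action in [-1, 1]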
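For the sparse-reward problem, the abstract introduces a multi-agent intrinsic curiosity module. The sketch below follows the standard ICM formulation (Pathak et al., 2017), where the forward model's prediction error in a learned feature space becomes the curiosity reward; the feature dimension, the scaling factor eta, and the network widths are assumptions, and a multi-agent variant would instantiate or share this module across agents.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ICM(nn.Module):
        """Intrinsic curiosity module: the error of a learned forward model,
        measured in feature space, is used as an intrinsic reward."""
        def __init__(self, obs_dim: int, act_dim: int, feat_dim: int = 32, eta: float = 0.1):
            super().__init__()
            self.eta = eta  # assumed scaling of the curiosity reward
            self.encoder = nn.Sequential(
                nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
            # forward model: (phi(s), a) -> predicted phi(s')
            self.forward_model = nn.Sequential(
                nn.Linear(feat_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
            # inverse model: (phi(s), phi(s')) -> a, keeps features action-relevant
            self.inverse_model = nn.Sequential(
                nn.Linear(2 * feat_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))

        def forward(self, obs, act, next_obs):
            phi, phi_next = self.encoder(obs), self.encoder(next_obs)
            phi_next_pred = self.forward_model(torch.cat([phi, act], dim=-1))
            act_pred = self.inverse_model(torch.cat([phi, phi_next], dim=-1))
            # per-sample curiosity reward = scaled forward prediction error
            r_int = self.eta * 0.5 * (phi_next_pred - phi_next.detach()).pow(2).sum(-1)
            fwd_loss = F.mse_loss(phi_next_pred, phi_next.detach())
            inv_loss = F.mse_loss(act_pred, act)  # regression for continuous actions
            return r_int.detach(), fwd_loss, inv_loss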
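The exact form of the "suitable exploration reward function" is not given in this abstract; the hypothetical sketch below illustrates one common way to combine a sparse goal bonus, dense distance-progress shaping, and the ICM curiosity term into a single per-step reward. Every constant (reached_eps, r_goal, k_shape, step_cost) is a placeholder, not a value from the paper.

    import numpy as np

    def exploration_reward(agent_pos, goal_pos, prev_dist, r_intrinsic,
                           reached_eps=0.1, r_goal=10.0, k_shape=1.0, step_cost=0.01):
        """Hypothetical per-step reward for single-goal exploration:
        sparse goal bonus + dense distance-progress shaping + curiosity term."""
        dist = float(np.linalg.norm(np.asarray(agent_pos) - np.asarray(goal_pos)))
        r = -step_cost                     # small time penalty encourages speed
        r += k_shape * (prev_dist - dist)  # positive when moving toward the goal
        if dist < reached_eps:
            r += r_goal                    # sparse extrinsic reward at the goal
        return r + r_intrinsic, dist       # dist becomes the next step's prev_dist

In training, prev_dist would be carried over from the previous step and r_intrinsic taken from the curiosity module sketched above.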

Keywords

deep reinforcement learning; RE-MADDPG-C; residual network; intrinsic curiosity module (ICM); sparse rewards

Classification

Computer and Automation

Cite this article

韩慧妍, 石树熙, 况立群, 韩燮, 熊风光. Multi-Agent Single-Goal Collaborative Exploration in Unknown Environments with an Improved MADDPG Algorithm[J]. Computer Engineering and Applications, 2025, 61(22): 320-328.

Funding

National Natural Science Foundation of China (62272426)

Natural Science Foundation of Shanxi Province (202303021211153)

Shanxi Province Science and Technology Major Special Plan "揭榜挂帅" Project (202201150401021)

Computer Engineering and Applications · Open Access · PKU Core Journal · ISSN 1002-8331
