
Local observation reconstruction for Ad-Hoc cooperation (面向Ad-Hoc协作的局部观测重建方法)

Chen Hao (陈皓) 1, Yang Likun (杨立昆) 1, Yin Qiyue (尹奇跃) 1, Huang Kaiqi (黄凯奇) 2

Journal of University of Chinese Academy of Sciences (中国科学院大学学报), 2024, Vol. 41, Issue (1): 117-126, 10. DOI: 10.7523/j.ucas.2022.028

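Author information

  • 1. Center for Research on Intelligent System and Engineering, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049
  • 2. Center for Research on Intelligent System and Engineering, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai 200031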

Abstract

In recent years, multi-agent reinforcement learning has attracted considerable research attention. A key problem in this field is ad-hoc cooperation, i.e., how to adapt to teammates whose types and number change. Existing methods either rely on strong prior-knowledge assumptions or use hard-coded cooperation protocols, so they lack generality and cannot be extended to broader ad-hoc cooperation scenarios. To address this problem, this paper proposes a local observation reconstruction algorithm for ad-hoc cooperation. The algorithm uses attention mechanisms and sampling networks to reconstruct local observations, which enables it to recognize and make full use of high-dimensional state representations in different situations and to achieve zero-shot generalization in ad-hoc cooperation scenarios. The algorithm is compared with representative algorithms on the StarCraft micromanagement environment and on ad-hoc cooperation scenarios, and the analysis verifies its effectiveness.
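The abstract does not give the algorithm's details, so the following is only a minimal sketch of one general idea it mentions: an attention mechanism can turn observations of a varying number of teammates into a fixed-size representation. It is not the paper's method. The function names `softmax` and `attention_pool`, the feature size of 4, and the random inputs are all assumptions made for illustration, and the sampling network is not modeled.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(query, entity_features):
    """Scaled dot-product attention of one query over a variable-length list.

    The output has the same length as each entity feature vector,
    no matter how many entities are observed (hypothetical helper).
    """
    dim = len(query)
    scores = [
        sum(q * f for q, f in zip(query, feats)) / math.sqrt(dim)
        for feats in entity_features
    ]
    weights = softmax(scores)
    return [
        sum(w * feats[i] for w, feats in zip(weights, entity_features))
        for i in range(dim)
    ]


# Illustrative inputs only: random features for 2 and for 5 observed teammates.
rng = random.Random(0)
query = [rng.uniform(-1, 1) for _ in range(4)]
two_teammates = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
five_teammates = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]

pooled_small = attention_pool(query, two_teammates)
pooled_large = attention_pool(query, five_teammates)
print(len(pooled_small), len(pooled_large))
```

Both pooled vectors have length 4 even though the team sizes differ, which is the property that lets a downstream policy network with a fixed input size be applied when the number of teammates changes.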

Key words

multi-agent / deep reinforcement learning / credit assignment / Ad-Hoc cooperation

Classification

Information technology and security science

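Cite this article

Chen Hao, Yang Likun, Yin Qiyue, Huang Kaiqi. Local observation reconstruction for Ad-Hoc cooperation[J]. Journal of University of Chinese Academy of Sciences, 2024, 41(1): 117-126, 10.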

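Funding

Supported by the National Natural Science Foundation of China (61876181), the Beijing Science and Technology Innovation Program (Z19110000119043), and the Youth Innovation Promotion Association, the Chinese Academy of Sciences, and Chinese Academy of Sciences projects (QYZDB-SSWJSC006)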

Journal of University of Chinese Academy of Sciences (中国科学院大学学报)

OA / Peking University Core (北大核心) / CSTPCD

ISSN 2095-6134
