Journal on Communications (通信学报), 2025, Vol. 46, Issue (3): 94-108, 15. DOI: 10.11959/j.issn.1000-436x.2025049
近端策略优化的城市环境多智能体协作对抗方法
Multi-agent cooperative confrontation with proximal policy optimization in urban environments
Abstract
To address the inefficiency and inflexibility that the geospatial complexity and dynamic changes of urban environments often impose on command systems, a multi-agent cooperative confrontation method with proximal policy optimization for urban environments was proposed. First, after a comprehensive urban confrontation environment was established, an actor-critic (AC) network with proximal policy optimization was used to solve the problem. Then, for the multi-to-one critic network, an embedding method was adopted to evaluate the decisions of heterogeneous agents with different spatial dimensions. Furthermore, adaptive sampling was added to assist the proximal policy optimization updates. Finally, the weights of the actor network were inherited to help agents quickly take over the corresponding tasks. Experimental results show that the proposed method improves reward by 22.67% and convergence rate by 8.14% compared with other methods. It not only supports decision-making for multi-agent cooperative confrontation in urban environments but is also compatible with cooperative confrontation among multiple heterogeneous agents.
Keywords
deep reinforcement learning / multi-agent / cooperative confrontation / proximal policy optimization / urban environment
Classification
Information technology and security science
Cite this article
Mi Guangming, Zhang Hui, Zhang Jing, Zhuo Li. Multi-agent cooperative confrontation with proximal policy optimization in urban environments [J]. Journal on Communications, 2025, 46(3): 94-108, 15.
Funding
The Beijing Natural Science Foundation (No. L247025)
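Illustrative note
The abstract names proximal policy optimization as the update rule for the actor network but gives no formulas. As a rough orientation only, the sketch below computes the standard clipped surrogate objective from the general PPO literature. It is not the authors' implementation: the function name `ppo_clip_objective`, the clipping parameter `eps=0.2`, and the sample values are assumptions chosen for illustration, and the paper's embedding, adaptive sampling, and weight inheritance steps are not modeled here.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates, e.g. derived from a critic.
    eps: clipping range; 0.2 is a common default, assumed here.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # probability ratio r_t
        clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    value = ppo_clip_objective(
        new_logp=[-0.9, -1.2, -0.3],
        old_logp=[-1.0, -1.0, -0.5],
        advantages=[1.0, -0.5, 2.0],
    )
    print(f"clipped surrogate objective: {value:.4f}")
```

Clipping caps how much a single update can gain from moving the probability ratio away from 1, which is what keeps each policy step "proximal" to the policy that collected the data.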