Architecture design and key technology analysis of wargaming AI for sea-air cross-domain coordination
Breakthroughs in intelligent gaming technology built around deep reinforcement learning in the games domain provide a reference for research on sea-air wargaming AI. Agent architecture design is a key problem to be solved: a well-designed architecture reduces the complexity and difficulty of algorithm training and accelerates policy convergence. A stochastic-game model of sea-air cross-domain cooperative decision-making is proposed, and the corresponding equilibrium solution concepts are analyzed. Building on an analysis of typical agent frameworks, and targeting the decision-making process of sea-air wargaming, a bi-level agent architecture based on multi-agent hierarchical reinforcement learning is proposed, which effectively addresses inter-agent coordination and the curse of dimensionality. Key technologies are analyzed from four aspects: force coordination, agent network design, opponent modeling, and training mechanisms. The aim is to provide architectural guidance for the design and implementation of sea-air wargaming AI.
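The abstract refers to a stochastic-game model of cross-domain cooperative decision-making and its equilibrium solution concepts. For reference, a standard textbook formulation of a stochastic (Markov) game and its Nash equilibrium condition is sketched below; the notation is generic and is not taken from the paper itself.

\[
\mathcal{G} = \langle \mathcal{N}, \mathcal{S}, \{\mathcal{A}_i\}_{i\in\mathcal{N}}, P, \{r_i\}_{i\in\mathcal{N}}, \gamma \rangle
\]

where $\mathcal{N}$ indexes the agents (e.g. sea and air forces on both sides), $\mathcal{S}$ is the state space, $\mathcal{A}_i$ is the action set of agent $i$, $P(s' \mid s, a_1,\dots,a_n)$ is the transition kernel, $r_i$ is the reward of agent $i$, and $\gamma \in [0,1)$ is the discount factor. A joint policy $\pi^* = (\pi_1^*,\dots,\pi_n^*)$ is a Nash equilibrium if, for every agent $i$, every state $s$, and every alternative policy $\pi_i$,

\[
V_i^{\pi_i^*,\,\pi_{-i}^*}(s) \;\ge\; V_i^{\pi_i,\,\pi_{-i}^*}(s),
\]

i.e. no agent can improve its expected discounted return by unilaterally deviating.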
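The bi-level architecture described in the abstract (a high-level policy issuing tasks to low-level force-control policies at different time scales) can be pictured with a minimal sketch. All class, task, and method names below (CommanderPolicy, WorkerPolicy, TASKS, ...) are illustrative assumptions, not the paper's implementation.

import random
from dataclasses import dataclass
from typing import Dict, List

# Coarse sub-goals the high-level policy can assign (illustrative).
TASKS = ["strike", "escort", "patrol", "withdraw"]

@dataclass
class Observation:
    global_state: List[float]             # fused cross-domain picture for the commander
    local_states: Dict[str, List[float]]  # each force group's own local observation

class CommanderPolicy:
    """High-level policy: maps the global picture to a task per force group."""
    def assign_tasks(self, obs: Observation) -> Dict[str, str]:
        # Placeholder decision rule; a trained policy network would go here.
        return {group: random.choice(TASKS) for group in obs.local_states}

class WorkerPolicy:
    """Low-level policy: maps (local state, assigned task) to a primitive action."""
    def act(self, local_state: List[float], task: str) -> str:
        # Placeholder; a task-conditioned policy network would go here.
        return f"execute:{task}"

if __name__ == "__main__":
    obs = Observation(global_state=[0.0],
                      local_states={"air_group": [0.1], "surface_group": [0.2]})
    commander = CommanderPolicy()
    workers = {g: WorkerPolicy() for g in obs.local_states}
    tasks = commander.assign_tasks(obs)        # coarse-scale decision
    for t in range(4):                         # fine-scale control loop
        if t % 2 == 0:                         # commander re-plans every 2 steps
            tasks = commander.assign_tasks(obs)
        actions = {g: workers[g].act(obs.local_states[g], tasks[g])
                   for g in obs.local_states}
        print(t, actions)

The point of the two time scales is that the commander's task assignments change slowly while the workers act every step, which is one common way hierarchical reinforcement learning reduces the joint action space.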
SU Jiongming; LUO Junren; CHEN Shaofei; XIANG Fengtao
College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China
sea-air wargame; cross-domain cooperation; wargaming; multi-agent; intelligent gaming; model architecture; hierarchical reinforcement learning
Command Control & Simulation, 2024(2)
Pages 35-43 (9 pages)
National Natural Science Foundation of China (61806212, 62376280)