Journal of Tongji University (Natural Science), 2025, Vol. 53, Issue 6: 985-992, 8. DOI: 10.11908/j.issn.0253-374x.23085
Reinforcement Learning-based Multi-period Game Theoretic Model for Resource Allocation
Abstract
For the attacker-defender game with limited resources, a multi-period game-theoretic model for resource allocation based on reinforcement learning is proposed. Over multiple periods, the defender allocates resources between deploying false targets and strengthening the genuine target. Multiple attackers, in turn, distribute their resources between identifying false targets and attacking the genuine target among the multiple targets. In each period, each player bases its decision on the expected utility of the genuine target, which serves as the reward. Q-learning, a reinforcement learning algorithm, is adopted in the game-theoretic model to explore the best resource allocation strategy over the entire planning horizon. An illustrative example demonstrates the effectiveness of the proposed model and algorithm, which can support decision making in resource allocation problems.
Key words
resource allocation/attacker-defender game/false targets/reinforcement learning/Q-learning
Classification
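Mathematics
Cite this article
ZHANG Xiaoxiong, DING Song, PENG Rui, WU Guohua, LIU Zhong. Reinforcement Learning-based Multi-period Game Theoretic Model for Resource Allocation[J]. Journal of Tongji University (Natural Science), 2025, 53(6): 985-992, 8.
Funding
National Natural Science Foundation of China (72471236)
Beijing Nova Program (Z191100001119100)
Young Elite Scientists Sponsorship Program (Special Fields) of the China Association for Science and Technology (2021-JCJQ-QT-050)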