Journal of Tongji University (Natural Science), 2025, Vol. 53, Issue 6: 985-992, 8. DOI: 10.11908/j.issn.0253-374x.23085

Reinforcement Learning-based Multi-period Game Theoretic Model for Resource Allocation (基于强化学习的多阶段资源分配对策模型)

Zhang Xiaoxiong (张骁雄)¹, Ding Song (丁松)², Peng Rui (彭锐)³, Wu Guohua (伍国华)⁴, Liu Zhong (刘忠)⁵

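Author information

1. The Sixty-third Research Institute, National University of Defense Technology, Nanjing 210007, Jiangsu, China; Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, Hunan, China
2. School of Economics, Zhejiang University of Finance and Economics, Hangzhou 310018, Zhejiang, China
3. School of Economics and Management, Beijing University of Technology, Beijing 100124, China
4. School of Traffic and Transportation Engineering, Central South University, Changsha 410075, Hunan, China
5. Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, Hunan, China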

Abstract

To address the attacker-defender game under limited resources, a multi-period game-theoretic model for resource allocation based on reinforcement learning is proposed. Over multiple periods, the defender allocates resources between deploying false targets and strengthening the genuine target. Multiple attackers, in turn, allocate resources between identifying false targets and attacking the genuine target among the multiple targets. In each period, each player bases its decision on the expected utility of the genuine target, which serves as the reward. Q-learning, a reinforcement learning algorithm, is incorporated into the game-theoretic model to search for the best resource allocation strategy over the entire planning horizon. An illustrative example demonstrates the effectiveness of the proposed model and algorithm, which can support decision making in resource allocation problems.
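As a rough illustration of how Q-learning can search for a per-period allocation, the following minimal sketch trains a tabular Q-function for a toy defender. It is not the model from the paper: the reward function, the single attacker that picks a target uniformly at random, the ratio-form contest success function, and all numeric parameters (`budget`, `cost_false`, `max_false`, the attack efforts) are assumptions made only for this example. The state is the period index and the action is the number of false targets deployed in that period; whatever budget remains strengthens the genuine target.

```python
import random


def survival_probability(n_false, budget, cost_false, attack_effort):
    """Hypothetical reward: chance that the genuine target survives one period."""
    defense = budget - n_false * cost_false      # resources left to strengthen the genuine target
    p_hit = 1.0 / (n_false + 1)                  # assumed: attacker picks one target uniformly at random
    p_destroy = attack_effort / (attack_effort + defense)  # assumed ratio-form contest success function
    return 1.0 - p_hit * p_destroy


def q_learning(attack_efforts, budget=10.0, cost_false=3.0, max_false=3,
               episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over periods; returns the Q-table and the greedy policy."""
    rng = random.Random(seed)
    periods = len(attack_efforts)
    actions = list(range(max_false + 1))
    # One row per period plus a terminal row that is never updated (stays at zero).
    q = [[0.0] * len(actions) for _ in range(periods + 1)]
    for _ in range(episodes):
        for t in range(periods):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda k: q[t][k])
            r = survival_probability(a, budget, cost_false, attack_efforts[t])
            # Standard Q-learning update toward the bootstrapped target.
            target = r + gamma * max(q[t + 1])
            q[t][a] += alpha * (target - q[t][a])
    policy = [max(actions, key=lambda k: q[t][k]) for t in range(periods)]
    return q, policy


# Three periods with increasing (assumed) attack effort.
q_table, policy = q_learning([1.0, 5.0, 20.0])
print(policy)
```

Because the periods in this toy setting are linked only through the discounted value of later periods, the learned policy simply picks the reward-maximizing number of false targets in each period. A model closer to the abstract would also need the attackers' identification effort and a state that carries information across periods.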

Key words

resource allocation / attacker-defender game / false targets / reinforcement learning / Q-learning

Classification

Mathematical and physical sciences

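Cite this article

Zhang Xiaoxiong, Ding Song, Peng Rui, Wu Guohua, Liu Zhong. Reinforcement Learning-based Multi-period Game Theoretic Model for Resource Allocation [J]. Journal of Tongji University (Natural Science), 2025, 53(6): 985-992, 8.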

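Funding

National Natural Science Foundation of China (72471236)

Beijing Nova Program (Z191100001119100)

Young Elite Scientists Sponsorship Program (Special Field) of the China Association for Science and Technology (2021-JCJQ-QT-050)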

Journal of Tongji University (Natural Science)

Open access; Peking University Core Journal (北大核心)

ISSN: 0253-374X
