Acta Aeronautica et Astronautica Sinica (航空学报), 2026, Vol. 47, Issue 8: 248-263, 16. DOI: 10.7527/S1000-6893.2025.32786
Optimizing air and missile defense strategies with explainable hierarchical reinforcement learning
Abstract
Air and Missile Defense (AMD) systems are core elements of a nation's aerospace security shield, and their target-interception capability is key to determining combat effectiveness. With the evolution of warfare, the AMD interception problem is increasingly characterized by large target scales, pronounced value heterogeneity, and stringent real-time requirements. Existing techniques typically face an interception policy space that grows exponentially with target count, poor sample efficiency under delayed rewards, and opaque decision processes, making them insufficient for operational needs. To address these challenges, this paper proposes an interception strategy framework based on Explainable Hierarchical Dueling DQN (EHD-DQN). The framework suppresses exponential policy-space growth and shortens the decision chain through a hierarchical decoupling of "upper-level ranking → lower-level interception". A temporally decayed multi-experience buffer is introduced to improve sample efficiency and convergence stability under delayed rewards. Moreover, an explainability module that combines Gradient-weighted Class Activation Mapping (Grad-CAM) and Local Interpretable Model-agnostic Explanations (LIME) is embedded to inject explanation signals into the training loop and provide traceable decision rationales. Compared with Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and three traditional optimization algorithms, namely Rolling-Horizon Mixed-Integer Linear Programming (RH-MILP), Non-dominated Sorting Genetic Algorithm II (NSGA-II), and Adaptive Large Neighborhood Search (ALNS), EHD-DQN achieves superior performance in interception count, ammunition utilization, and engagement timing for high-value targets, while furnishing transparent, staff-oriented justifications for command decision-making. The results indicate that EHD-DQN offers an efficient and explainable decision-making paradigm for AMD command-and-control systems.
Keywords
hierarchical reinforcement learning / explainable artificial intelligence / air defense and anti-missile decision-making / dueling DQN / collaborative optimization
Classification
Aerospace
Cite this article
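LIU Yuheng (刘宇衡), YANG Li (杨力), HUANG Qilong (黄琦龙). Optimizing air and missile defense strategies with explainable hierarchical reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 248-263, 16.
Funding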
National Natural Science Foundation of China (U21B2003)
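
The abstract and keywords name a dueling DQN as the base learner. As a rough illustration only, the sketch below shows the mean-subtracted aggregation commonly used in dueling networks, Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a')). Whether EHD-DQN uses exactly this form is an assumption, and the function name `dueling_q_values` and all numbers are hypothetical rather than taken from the paper.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine a state value V(s) and per-action advantages A(s, a) into Q(s, a).

    Uses the mean-subtracted aggregation Q = V + (A - mean(A)).
    Assumption: the paper's EHD-DQN may aggregate the two streams differently.
    """
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# Hypothetical inputs: one state value and advantages for three candidate actions.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])

# Greedy action selection: index of the largest Q-value.
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable: the average of the resulting Q-values equals the state value, so the advantages only encode how actions differ from one another.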