Journal of Data Acquisition and Processing (数据采集与处理), 2024, Vol. 39, Issue 6: 1355-1369 (15 pages). DOI: 10.16337/j.1004-9037.2024.06.006
A Multi-functional Radar Jamming Decision Method Based on Proximal Policy Optimization Algorithm and Mask-TIT Network
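Lou Yuxuan (娄雨璇) 1, Sun Minhong (孙闽红) 1, Yin Shuai (尹帅) 1
Author information
- 1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China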
Abstract
To cope with the challenges that increasingly intelligent multifunctional radars pose to the opposing side, this paper proposes a jamming decision-making method based on the proximal policy optimization (PPO) algorithm and the Mask-Transformer in Transformer (Mask-TIT) network. First, starting from a realistic scenario, the confrontation between the jammer and the radar is modeled as a partially observable Markov decision process (POMDP). A new state transition function and reward function are designed based on the working principles of the radar, and the observation space is designed according to the hierarchy of the multifunctional radar model. Second, a Mask-TIT network structure is designed by exploiting the Transformer's representation capacity for sequence data and the characteristics of radar jamming patterns, and is used to build a more powerful Actor-Critic network architecture. Finally, the PPO algorithm is used for optimization learning. Experimental results show that, compared with existing methods, the proposed algorithm reduces the average amount of interaction data required for convergence by 25.6% and significantly reduces the variance after convergence.
Keywords
radar jamming decision / partially observable Markov decision process (POMDP) / reinforcement learning / Transformer / proximal policy optimization (PPO)
Classification
Information technology and security science
Cite this article
Lou Yuxuan, Sun Minhong, Yin Shuai. A multi-functional radar jamming decision method based on proximal policy optimization algorithm and Mask-TIT network [J]. Journal of Data Acquisition and Processing, 2024, 39(6): 1355-1369, 15.