基于近端策略优化算法和Mask-TIT网络的多功能雷达干扰决策方法

娄雨璇¹, 孙闽红¹, 尹帅¹

Journal of Data Acquisition and Processing (数据采集与处理), 2024, Vol. 39, Issue 6: 1355-1369 (15 pages). DOI: 10.16337/j.1004-9037.2024.06.006

A Multi-functional Radar Jamming Decision Method Based on Proximal Policy Optimization Algorithm and Mask-TIT Network

Author information

1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China

Abstract

To address the challenges that increasingly intelligent multifunctional radars pose to the opposing side, this paper proposes a jamming decision-making method based on the proximal policy optimization (PPO) algorithm and the Mask-Transformer in Transformer (Mask-TIT) network. First, starting from a realistic scenario, the confrontation between the jammer and the radar is modeled as a partially observable Markov decision process (POMDP). A new state transition function and reward function are designed based on the working principles of the radar, and the observation space is designed according to the hierarchy of the multifunctional radar model. Second, a Mask-TIT network structure is designed by exploiting the Transformer's representation capacity for sequence data together with the characteristics of radar jamming patterns, and it is used to build a more powerful Actor-Critic network architecture. Finally, the PPO algorithm is used for policy optimization and learning. Experimental results show that, compared with existing methods, the proposed algorithm reduces the average amount of interaction data required for convergence by 25.6% and significantly reduces the variance after convergence.
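
For readers unfamiliar with PPO, the clipped surrogate objective at the core of the standard algorithm can be sketched as below. This is a minimal illustration of the generic PPO formulation, not the authors' implementation: the function name `ppo_clipped_objective`, the plain-list inputs, and the clipping value `clip_eps = 0.2` are all assumptions made for the example, and the abstract does not state the hyperparameters or the network details used in the paper.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed common default, not taken from the paper
) -> float:
    """Mean clipped surrogate objective of the standard PPO formulation.

    ratios[i] is pi_new(a_i | o_i) / pi_old(a_i | o_i) for one sampled
    action, and advantages[i] is an advantage estimate for that sample.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Keep the smaller (more pessimistic) of the two surrogate terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

With `ratios = [1.5, 0.5]` and `advantages = [1.0, -1.0]`, both samples fall outside the clipping range, so the per-sample terms are 1.2 and -0.8 and the mean is approximately 0.2. In training, this quantity would be maximized with respect to the policy parameters; how it is combined with the Mask-TIT Actor-Critic network is described in the paper itself, not here.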

Keywords

radar jamming decision; partially observable Markov decision process (POMDP); reinforcement learning; Transformer; proximal policy optimization (PPO)

Classification

Information technology and security science

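Cite this article

娄雨璇, 孙闽红, 尹帅. A Multi-functional Radar Jamming Decision Method Based on Proximal Policy Optimization Algorithm and Mask-TIT Network [J]. Journal of Data Acquisition and Processing, 2024, 39(6): 1355-1369.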

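Journal of Data Acquisition and Processing (数据采集与处理)

Open access (OA); listed in 北大核心 (Peking University Core Journals) and CSTPCD

ISSN: 1004-9037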
