Research on an Air Combat Pointing Control Strategy Based on a Composite Reward Function

Modern Electronics Technique, 2026, Vol. 49, Issue 2: 73-79 (7 pages). DOI: 10.16652/j.issn.1004-373x.2026.02.012

Research on air combat directional control strategy based on composite reward function

Xu Jun (徐俊) 1, Deng Xiangyang (邓向阳) 2, Fu Yupeng (付宇鹏) 2, Yue Shengzhi (岳圣智) 1, Song Jinghan (宋婧菡) 1, Lin Yuanshan (林远山) 1

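Author information

  • 1. School of Information Engineering, Dalian Ocean University, Dalian 116023, Liaoning, China
  • 2. College of Aviation Operations and Service, Naval Aviation University, Yantai 264001, Shandong, China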

Abstract

To address the difficulty of achieving rapid pointing control of unmanned combat air vehicles (UCAVs) from an arbitrary attitude in close-range air combat, an air combat pointing control strategy based on a composite reward function is proposed. To keep the UCAV from ending up in a dangerous low-energy state after completing the pointing task through inefficient large-amplitude maneuvers, a composite reward function with multi-dimensional constraints, including quantified energy, time, and angle of attack, is designed and used to train the UCAV in pointing and aiming from different initial attitudes. To address the poor policy convergence caused by the high-dimensional observation and action spaces of air combat missions, hierarchical L2-norm gradient clipping is applied to the parameter updates of the dual Actor-Critic network structure during SAC training, which substantially improves the convergence efficiency of the algorithm. Simulation results show that the proposed algorithm guides the aircraft to issue maneuvering decision commands quickly and to complete the pointing and aiming task while preserving energy and maneuverability, and that it converges more efficiently than traditional deep reinforcement learning algorithms such as TD3, PPO, and DDPG.
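The two ideas named in the abstract can be illustrated with a rough sketch. The abstract does not give the reward formula, the weights, or the clipping thresholds, so everything below is an assumption made for illustration: the function names (`composite_reward`, `clip_group_l2`, `clip_hierarchical`), the linear shape of each term, the weight values, and the reading of "hierarchical" as one threshold per network group. It should not be taken as the authors' implementation.

```python
import math


def composite_reward(pointing_error_rad, specific_energy, energy_ref,
                     elapsed_s, time_limit_s, alpha_rad, alpha_max_rad,
                     weights=(1.0, 0.5, 0.2, 0.5)):
    """Hypothetical composite reward: a pointing term plus energy, time, and AoA penalties."""
    w_point, w_energy, w_time, w_alpha = weights  # assumed weights, not from the paper
    # Pointing term: 1 when the nose is on target, 0 when pointing directly away.
    r_point = 1.0 - abs(pointing_error_rad) / math.pi
    # Energy term: penalize only energy lost relative to a reference level.
    r_energy = -max(0.0, (energy_ref - specific_energy) / energy_ref)
    # Time term: penalty that grows linearly with elapsed time, capped at 1.
    r_time = -min(1.0, elapsed_s / time_limit_s)
    # Angle-of-attack term: penalize only the excess above the allowed limit.
    r_alpha = -max(0.0, (abs(alpha_rad) - alpha_max_rad) / alpha_max_rad)
    return (w_point * r_point + w_energy * r_energy
            + w_time * r_time + w_alpha * r_alpha)


def clip_group_l2(grads, max_norm):
    """Rescale one group's gradients so their joint L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def clip_hierarchical(grad_groups, max_norms):
    """Clip each named group (for example actor and critic) against its own threshold."""
    return {name: clip_group_l2(grads, max_norms[name])
            for name, grads in grad_groups.items()}


# Same pointing error, but the second state has lost energy and exceeds the AoA limit.
r_efficient = composite_reward(0.1, 9000.0, 10000.0, 5.0, 60.0, 0.1, 0.35)
r_wasteful = composite_reward(0.1, 4000.0, 10000.0, 5.0, 60.0, 0.6, 0.35)

# The actor gradients (norm 5) are rescaled; the critic gradients (norm 0.5) are untouched.
clipped = clip_hierarchical({"actor": [3.0, 4.0], "critic": [0.3, 0.4]},
                            {"actor": 1.0, "critic": 1.0})
```

Under these assumed terms, a maneuver that reaches the same pointing error while bleeding energy or exceeding the angle-of-attack limit scores lower, which is the behavior the abstract attributes to the composite reward. Clipping each group against its own threshold keeps one network's large gradients from dictating the scale applied to the other.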

Key words

fixed-wing aircraft/deep reinforcement learning/reward function shaping/aerial combat strategy/maneuvering decision-making/continuous space/strategy constraint

Classification

Information technology and safety science

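Cite this article

Xu Jun, Deng Xiangyang, Fu Yupeng, Yue Shengzhi, Song Jinghan, Lin Yuanshan. Research on air combat directional control strategy based on composite reward function[J]. Modern Electronics Technique, 2026, 49(2): 73-79.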

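Funding

Fundamental Research Funds for Liaoning Provincial Undergraduate Universities (2024JBZDZ004)

2023 Central Government Fisheries Subsidy Project for Liaoning

Liaoning Provincial Key Research and Development Program (2023JH26)

Liaoning Provincial Key Research and Development Program (10200015)

Natural Science Foundation of Liaoning Province (2020-KF-12-09)

Basic Scientific Research Project of the Education Department of Liaoning Province (LJKZ0730)

Basic Scientific Research Project of the Education Department of Liaoning Province (QL202016)

Open Project of the Key Laboratory of Facility Fisheries, Ministry of Education (202219)

Guangxi Key Research and Development Program (桂科AB23075150)

Liaoning Provincial Applied Basic Research Program (2022JH2)

Liaoning Provincial Applied Basic Research Program (101300187)

Yantai Science and Technology Bureau (ZR2024QF094)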

Modern Electronics Technique

ISSN 1004-373X
