Design of reward functions for helicopter attitude control in reinforcement learning

ZHANG Tao, LI Pan, WANG Zixu, ZHU Zhenhua

航空学报 (Acta Aeronautica et Astronautica Sinica), 2025, Vol. 46, Issue (z1): 57-72, 16. DOI: 10.7527/S1000-6893.2025.32184

Design of reward functions for helicopter attitude control in reinforcement learning

ZHANG Tao 1, LI Pan 1, WANG Zixu 1, ZHU Zhenhua 1

Author information

  • 1. National Key Laboratory of Helicopter Dynamics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

Abstract

Reward function design is one of the core technologies in reinforcement-learning-based helicopter attitude control, as it directly determines how the controller trains and how well it performs. Designing a comprehensive and efficient reward function has therefore become a key research topic in the field. To this end, a phased reward function framework is proposed that divides the full-time-domain control process into two control stages. Reward function sub-items are designed for each stage, and adjustable parameters are introduced that allow macroscopic adjustment of control performance. Based on the Actor-Critic method, a simple neural network attitude controller structure is designed and trained with the Proximal Policy Optimization (PPO) algorithm. The effectiveness of the proposed method is validated through robustness tests with injected sensor error and through comparative experiments against a baseline reward function. In 100 step-response simulation trials, compared with the baseline method, the number of cases with steady-state error below 10% increases by 16%, the number of cases with overshoot below 10% of the command amplitude increases by 9%, and the number of cases with settling time below 4 s increases by 7%. Additionally, under significant sensor error, the controller can still complete the attitude control task.
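The three evaluation criteria quoted above (steady-state error, overshoot, settling time) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names `step_metrics` and `second_order_step`, the 5% settling band, the averaging window for the final value, and the second-order stand-in response are all assumptions made for illustration; only the thresholds of 10%, 10% and 4 s come from the abstract.

```python
import math


def step_metrics(t, y, command, band=0.05, tail_fraction=0.1):
    """Summarize a step response y(t) toward a nonzero `command`, starting from 0.

    Returns (steady_state_error, overshoot, settling_time). The first two are
    fractions of |command|. `band` and `tail_fraction` are assumed conventions.
    """
    amp = abs(command)
    # Steady-state value: mean of the last `tail_fraction` of the samples.
    n_tail = max(1, int(len(y) * tail_fraction))
    y_final = sum(y[-n_tail:]) / n_tail
    steady_state_error = abs(command - y_final) / amp

    # Overshoot: how far the response travels past the command.
    sign = 1.0 if command > 0 else -1.0
    peak = max(sign * v for v in y)
    overshoot = max(0.0, peak - amp) / amp

    # Settling time: first sample time after the last excursion outside the band.
    last_outside = -1
    for i, v in enumerate(y):
        if abs(v - command) > band * amp:
            last_outside = i
    if last_outside == len(y) - 1:
        settling_time = math.inf  # never settles within the recorded window
    else:
        settling_time = t[last_outside + 1]
    return steady_state_error, overshoot, settling_time


def second_order_step(command, zeta=0.7, omega=2.0, dt=0.01, t_end=10.0):
    """Analytic underdamped second-order step response (hypothetical stand-in data)."""
    wd = omega * math.sqrt(1.0 - zeta**2)
    n = int(round(t_end / dt)) + 1
    t = [i * dt for i in range(n)]
    y = [
        command * (1.0 - math.exp(-zeta * omega * ti)
                   * (math.cos(wd * ti) + zeta * omega / wd * math.sin(wd * ti)))
        for ti in t
    ]
    return t, y


t, y = second_order_step(command=10.0)  # e.g. a 10-degree attitude step command
sse, overshoot, t_settle = step_metrics(t, y, command=10.0)
# Thresholds taken from the abstract: 10% error, 10% overshoot, 4 s settling time.
passes = (sse < 0.10, overshoot < 0.10, t_settle < 4.0)
print(sse, overshoot, t_settle, passes)
```

Counting how many of the 100 trials pass each threshold, once per reward function, would give case counts of the kind the abstract compares. The settling band and the normalization of the steady-state error would need to follow the paper's own definitions, which the abstract does not state.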

Key words

aircraft intelligent control / helicopter attitude control / reinforcement learning / reward function / neural networks

Classification

Aerospace

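Cite this article

ZHANG Tao, LI Pan, WANG Zixu, ZHU Zhenhua. Design of reward functions for helicopter attitude control in reinforcement learning [J]. 航空学报 (Acta Aeronautica et Astronautica Sinica), 2025, 46(z1): 57-72, 16.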

Funding

National Level Project

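航空学报 (Acta Aeronautica et Astronautica Sinica)

Open access (OA); 北大核心 (Peking University core journal list)

ISSN 1000-6893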
