Chinese Journal of Ship Research, 2025, Vol. 20, Issue 1: 350-359 (10 pages). DOI: 10.19693/j.issn.1673-3185.04031
Design of AUV controller based on improved PPO algorithm (基于改进PPO算法的AUV控制器设计)
Abstract
[Objective] To improve the robustness of autonomous underwater vehicle (AUV) controllers to environment modeling errors, this paper proposes a reinforcement learning control strategy that introduces contextual information and a curriculum learning training mechanism. [Method] First, contextual information is embedded into the policy network by feeding interaction history data to it as part of its input. Second, a curriculum learning training mechanism is designed that gradually increases the disturbance strength during training, avoiding the training instability and early stopping caused by excessive disturbance. Fixed-depth control experiments are conducted in a simulation environment, and the effectiveness of the algorithm is further verified using a real AUV in a water tank. [Results] The experimental results show that the proposed algorithm improves the convergence speed by 25.00% and the steady-state reward value by 10.81%, effectively improving the training process. In the simulation environment, the proposed algorithm achieves tracking with no steady-state error. In the tank experiment, compared with the domain randomization algorithm and the baseline algorithm, the proposed method reduces the average depth tracking error by 45.81% and 63.00%, respectively, and the standard deviation by 36.17% and 52.76%, respectively, effectively improving tracking accuracy and stability. [Conclusion] The results can provide a useful reference for applying deep reinforcement learning methods to AUV control.
Keywords
autonomous underwater vehicles / controllers / reinforcement learning / curriculum learning / context variables
Classification
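Transportation engineering
Cite this article
Xu Desheng (徐德胜), Xu Chunhui (徐春晖). Design of AUV controller based on improved PPO algorithm[J]. Chinese Journal of Ship Research, 2025, 20(1): 350-359.
Funding
National Key Research and Development Program of China (2022YFC2806000)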
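
The abstract names two mechanisms: feeding interaction history to the policy network as context, and a curriculum that raises the disturbance strength gradually during training. The Python sketch below illustrates one plausible reading of each. It is not the authors' implementation: the abstract does not give the history length, the encoding of the history, or the shape of the schedule, so the class `HistoryContext`, the function `disturbance_scale`, and all their parameters are hypothetical names chosen for illustration, and the linear ramp is an assumption.

```python
from collections import deque

import numpy as np


class HistoryContext:
    """Keeps the last `horizon` (observation, action) pairs as context.

    Hypothetical helper: the flat concatenation used here is an assumed
    encoding, not one stated in the abstract.
    """

    def __init__(self, obs_dim, act_dim, horizon):
        self.step_dim = obs_dim + act_dim
        self.horizon = horizon
        self.buffer = deque(maxlen=horizon)
        self.reset()

    def reset(self):
        # Zero-pad so the policy input has a fixed size from the first step.
        self.buffer.clear()
        for _ in range(self.horizon):
            self.buffer.append(np.zeros(self.step_dim))

    def push(self, obs, action):
        # Store one interaction step; the oldest step is dropped automatically.
        step = np.concatenate([np.asarray(obs, dtype=float),
                               np.asarray(action, dtype=float)])
        self.buffer.append(step)

    def policy_input(self, obs):
        # Current observation followed by the history, oldest step first.
        return np.concatenate([np.asarray(obs, dtype=float), *self.buffer])


def disturbance_scale(iteration, ramp_iters, max_scale):
    """Assumed linear curriculum: 0 at iteration 0, max_scale from ramp_iters on."""
    frac = min(max(iteration / ramp_iters, 0.0), 1.0)
    return max_scale * frac
```

In a training loop, `policy_input` would replace the raw observation passed to the PPO policy, and `disturbance_scale` would set the amplitude of the simulated disturbance for each training iteration.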