Journal of Southwest Jiaotong University, 2026, Vol. 61, Issue (1): 167-177, 11. DOI: 10.3969/j.issn.0258-2724.20240085
Online Motion Planning for Inspection Manipulator Based on Adaptive Proximal Policy Optimization Algorithm
Abstract
To meet the needs of human-robot collaboration, in which an inspection manipulator actively cooperates with a person under a railroad car, and to improve the convergence speed of the proximal policy optimization (PPO) algorithm, an adaptive PPO (a-PPO) algorithm was proposed and applied to the online motion planning of the inspection manipulator. Firstly, the system model was designed to output policy actions immediately from the current environmental state. Secondly, geometric reinforcement learning was introduced to construct the reward function, using the agent's exploration to continuously optimize the distribution of rewards. Thirdly, the clipping value was determined adaptively from the similarity between the policies before and after each update, yielding the a-PPO algorithm. Finally, the improvements of the a-PPO algorithm were compared on two-dimensional maps, and the feasibility and effectiveness of its application were verified experimentally in both simulated and real train scenarios. The results indicate that, in the two-dimensional plane simulation, the a-PPO algorithm has an advantage in convergence speed over other PPO algorithms. Path stability is also improved: the average path-length standard deviation is 16.786% lower than that of the PPO algorithm and 66.179% lower than that of the Informed-RRT* algorithm. In the application experiments in both simulated and real train scenarios, the manipulator can dynamically adjust target points and actively avoid dynamic obstacles during motion, which reflects its adaptability to dynamic environments.
Key words
reinforcement learning/deep learning/motion planning/manipulator/railroad car
Classification
Information Technology and Safety Science
Cite this article
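赵舵, 谢冠豪, 王叶文, 赵文杰, 黄晨, 袁昭辉. Online Motion Planning for Inspection Manipulator Based on Adaptive Proximal Policy Optimization Algorithm [J]. Journal of Southwest Jiaotong University, 2026, 61(1): 167-177, 11.
Funding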
National Natural Science Foundation of China (62173279, U1934221)
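Illustrative sketch
The abstract states that a-PPO sets the clipping value adaptively from the similarity between the policies before and after an update, but it does not give the rule itself. The Python sketch below shows one way such a scheme could look and is not the authors' formula. The KL-divergence similarity measure, the exponential mapping, and the names `categorical_kl`, `adaptive_clip`, `clip_min`, `clip_max`, and `kl_scale` are all assumptions introduced here for illustration. Only `clipped_surrogate` follows the standard PPO clipped objective.

```python
import numpy as np


def categorical_kl(p_old, p_new, eps=1e-12):
    """Mean KL(p_old || p_new) over a batch of discrete action distributions."""
    p_old = np.clip(p_old, eps, 1.0)
    p_new = np.clip(p_new, eps, 1.0)
    return float(np.mean(np.sum(p_old * (np.log(p_old) - np.log(p_new)), axis=-1)))


def adaptive_clip(kl, clip_min=0.05, clip_max=0.3, kl_scale=0.01):
    """Hypothetical rule (an assumption, not taken from the paper):
    map policy similarity to a clipping value in [clip_min, clip_max].
    Similar policies (small KL) get a wide range, dissimilar ones a narrow range.
    """
    similarity = np.exp(-kl / kl_scale)  # 1.0 when the policies are identical
    return float(clip_min + (clip_max - clip_min) * similarity)


def clipped_surrogate(ratio, advantage, clip_eps):
    """Standard PPO clipped surrogate objective (to be maximized)."""
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return float(np.mean(np.minimum(ratio * advantage, clipped * advantage)))


if __name__ == "__main__":
    # Toy batch: action probabilities of the old and new policies in two states.
    p_old = np.array([[0.5, 0.5], [0.7, 0.3]])
    p_new = np.array([[0.6, 0.4], [0.65, 0.35]])
    actions = np.array([0, 1])          # actions actually taken
    advantage = np.array([1.0, -0.5])   # advantage estimates (made-up values)

    # Probability ratio pi_new(a|s) / pi_old(a|s) for the taken actions.
    idx = np.arange(len(actions))
    ratio = p_new[idx, actions] / p_old[idx, actions]

    eps_t = adaptive_clip(categorical_kl(p_old, p_new))
    print("clip value:", eps_t)
    print("objective:", clipped_surrogate(ratio, advantage, eps_t))
```

In this sketch the clipping value shrinks as the updated policy drifts from the old one, which tightens the trust region when updates are large. Whether the paper widens or narrows the range with similarity is not stated in the abstract, so the direction chosen here is also an assumption.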