机电工程技术, 2026, Vol. 55, Issue 2: 25-30, 6. DOI: 10.3969/j.issn.1009-9492.2026.02.004
基于值分布深度强化学习的桥式吊车轨迹规划与跟踪控制
Trajectory Planning and Tracking Control of Bridge Crane Based on Value Distribution Deep Reinforcement Learning
Abstract
To address the problem that traditional trajectory planning methods for bridge cranes require complex mathematical analysis and adapt poorly to dynamic environments when the model is complex, a deep reinforcement learning algorithm based on value function distribution, distributional soft actor-critic with three refinements (DSACT), is proposed for real-time trajectory planning of a bridge crane. Through its distributional value function, the DSACT algorithm captures richer state-action value distribution information, which provides a more comprehensive representation for decision making. Three optimization strategies (expected value substituting, twin value distribution learning, and variance-based critic gradient adjusting) are combined in the DSACT algorithm, which significantly improves the convergence speed, stability, and robustness of the model. A sliding mode tracking controller is designed to track the planned trajectory accurately and to improve the anti-interference ability and dynamic response of the system. Simulation results show that the DSACT algorithm outperforms the classical SAC and DDPG algorithms in planning accuracy, convergence speed, and control stability, and that the sliding mode tracking controller has significant advantages over the traditional PID controller in tracking accuracy, anti-interference ability, and dynamic response.
Keywords
bridge crane / trajectory planning / deep reinforcement learning / value function distribution
Classification
Mechanical manufacturing
Cite this article
高永锹, 徐萌, 辛增淼, 王天雷, 肖康利. 基于值分布深度强化学习的桥式吊车轨迹规划与跟踪控制[J]. 机电工程技术, 2026, 55(2): 25-30, 6.
Funding
Key Scientific Research Platforms and Projects of Guangdong Provincial Universities (2024ZDZX1009)
Jiangmen Science and Technology Commissioner Research Cooperation Project (2023760300180008278)
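
Illustrative note on the sliding mode tracking controller mentioned in the abstract. The abstract does not give the control law or the crane model, so the following is only a minimal sketch under stated assumptions, not the authors' design: the trolley is reduced to a hypothetical unit-mass double integrator (payload swing is ignored), the reference trajectory is a sine wave standing in for a planned trajectory, and the gains `lam`, `k`, `phi` and the disturbance are made-up values. It shows the general idea of driving a sliding variable s = ė + λe into a thin boundary layer so that the tracking error stays small despite a bounded disturbance.

```python
import math

def sat(z):
    """Saturation function: a smooth stand-in for sign() that reduces chattering."""
    return max(-1.0, min(1.0, z))

def simulate_smc(lam=5.0, k=5.0, phi=0.05, dt=1e-3, t_end=5.0):
    """Track x_r(t) = sin(t) with a boundary-layer sliding mode law.

    Assumed plant (hypothetical, not from the paper): x'' = u + d(t),
    with |d(t)| <= 1 < k so the switching gain dominates the disturbance.
    Returns the tracking error at t_end.
    """
    x, v = 0.5, 0.0          # initial position and velocity (off the reference)
    t = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Reference position, velocity, and acceleration
        x_r, v_r, a_r = math.sin(t), math.cos(t), -math.sin(t)
        e, e_dot = x - x_r, v - v_r
        s = e_dot + lam * e                       # sliding variable
        # Equivalent control plus saturated switching term
        u = a_r - lam * e_dot - k * sat(s / phi)
        d = math.sin(3.0 * t)                     # bounded disturbance (assumed)
        # Semi-implicit Euler integration of the double integrator
        v += (u + d) * dt
        x += v * dt
        t += dt
    return x - math.sin(t)

final_error = simulate_smc()
print(f"final tracking error: {final_error:.4f}")
```

With these assumed values the sliding variable obeys ṡ = -k·sat(s/φ) + d, so it enters the boundary layer |s| ≤ φ in finite time, after which the error decays with time constant 1/λ. A real crane controller would also have to account for payload swing and actuator limits, which this sketch omits.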