Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), 2024, Vol. 55, Issue 4: 113-123 (11 pages). DOI: 10.6041/j.issn.1000-1298.2024.04.011
基于HER-TD3算法的青皮核桃采摘机械臂路径规划
Path Planning of Green Walnut Picking Robotic Arm Based on HER-TD3 Algorithm
Abstract
To address the complex environments, heavy training workloads, and poor stability caused by the disordered growth of green walnuts and tree branches, a harvesting device based on a synchronous belt module and a manipulator was designed, and path planning for the harvesting manipulator was carried out with the twin delayed deep deterministic policy gradient algorithm combined with hindsight experience replay (HER-TD3). HER was used to improve the agent's exploration ability and alleviate the sparse reward problem, while TD3 was used to improve the agent's stability and reduce oscillation during training. To demonstrate the feasibility and generalization ability of the HER-TD3 algorithm, the TD3 and HER-DDPG algorithms were introduced for comparison, and the three deep reinforcement learning agents were trained with a dimensionality reduction training method. The results showed that the HER-TD3 model completed the path planning task with a success rate of 98%, which was 4 percentage points higher than that of HER-DDPG and 19 percentage points higher than that of TD3. A 3D simulation environment was built in CoppeliaSim, in which the initial attitude and collision detection were designed. YOLO v4 was used to recognize green walnuts, and the HER-TD3 model guided the virtual harvesting robotic arm around tree branches and obstacles to the target position, completing collision-free path planning. The success rates of path planning were 91% without obstacles and 86% with obstacles. In picking experiments with a physical prototype, the path planning task was still completed well: the success rate without obstacles was 86.7%, with an average motion time of 12.8 s, and the success rate with obstacles was 80.0%, with an average motion time of 13.6 s. These results verified that the HER-TD3 algorithm had good adaptability and stability in complex environments.
Keywords
green walnut/picking robot/robotic arm/HER-TD3 algorithm/path planning
Classification
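Agricultural science and technology
Cite this article
杨淑华, 谢晓波, 邴振凯, 郝建军, 张秀花, 袁大超. Path Planning of Green Walnut Picking Robotic Arm Based on HER-TD3 Algorithm [J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(4): 113-123.
Funding
Key Research and Development Program of Hebei Province (21327211D) and Hebei Province Doctoral Student Innovation Ability Training Project (CXZZBS2022050)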
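
Illustrative note on the abstract: the two ideas it combines can be sketched in a few lines. HER relabels a failed episode with a goal the arm actually reached, so that a sparse reward still yields useful training signal, and TD3 stabilizes training by taking the smaller of two critic estimates and by smoothing the target action with clipped noise (TD3 also delays policy updates, which is not shown). The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the reward values (0 and -1), the tolerance, the "final" relabeling strategy, and every function and field name (`sparse_reward`, `her_relabel`, `td3_target`, `smoothed_action`, `achieved`, `goal`) are hypothetical and do not come from the paper.

```python
import numpy as np

def sparse_reward(achieved, goal, tol=0.05):
    """Assumed sparse reward: 0 within tol of the goal, otherwise -1."""
    dist = np.linalg.norm(np.asarray(achieved) - np.asarray(goal))
    return 0.0 if dist <= tol else -1.0

def her_relabel(episode, tol=0.05):
    """Hindsight relabeling ('final' strategy): treat the position reached
    at the end of the episode as if it had been the goal all along."""
    new_goal = episode[-1]["achieved"]
    return [
        {**t, "goal": new_goal,
         "reward": sparse_reward(t["achieved"], new_goal, tol)}
        for t in episode
    ]

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller critic estimate."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)

def smoothed_action(mu_next, rng, sigma=0.2, clip=0.5, low=-1.0, high=1.0):
    """Target policy smoothing: add clipped Gaussian noise to the target
    action, then clip the result to the valid action range."""
    noise = np.clip(rng.normal(0.0, sigma, size=np.shape(mu_next)), -clip, clip)
    return np.clip(mu_next + noise, low, high)

# A failed three-step episode: the arm never gets near the original goal.
goal = np.array([1.0, 1.0, 1.0])
episode = [
    {"achieved": np.array([0.1, 0.0, 0.0]), "goal": goal, "reward": -1.0},
    {"achieved": np.array([0.2, 0.0, 0.0]), "goal": goal, "reward": -1.0},
    {"achieved": np.array([0.3, 0.0, 0.0]), "goal": goal, "reward": -1.0},
]
relabeled = her_relabel(episode)
```

After relabeling, the last transition counts as a success for the substituted goal, so both the original and the relabeled transitions can be stored in the replay buffer. A full agent would feed `smoothed_action` into the two target critics and pass their outputs to `td3_target`.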