Path Planning of Wall-Climbing Robots Based on Image Recognition (open access)
Reinforcement learning algorithms are now widely used for path planning of wall-climbing robots, but in real-time, dynamic, and complex industrial environments they suffer from lagging data transfer, slow convergence of state-action pairs, and an inability to estimate state-action pairs. This paper builds on and improves the one-step Q-learning algorithm to propose the tracking Q(λ) algorithm: classical Q-learning based on a discrete Markov decision process, combined with the multi-step returns of temporal-difference learning and with trace extraction. Practice shows that the algorithm addresses the shortcomings of reinforcement learning in path planning, such as poor local optimization ability and weak real-time performance.
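The abstract does not give the update rules, so the following is only a minimal sketch of the general idea of combining one-step Q-learning with multi-step returns through traces. It uses Watkins's Q(λ), a textbook variant, and is not the authors' implementation. The 4x4 grid standing in for the wall surface, the reward values, and every name and parameter (`q_lambda_path`, `alpha`, `gamma`, `lam`, `epsilon`) are assumptions made for illustration.

```python
import random


def q_lambda_path(n_rows=4, n_cols=4, start=(0, 0), goal=(3, 3),
                  episodes=500, alpha=0.1, gamma=0.9, lam=0.8,
                  epsilon=0.1, max_steps=200, seed=0):
    """Tabular Watkins's Q(lambda) on a hypothetical grid (illustrative only)."""
    rng = random.Random(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    states = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    Q = {(s, a): 0.0 for s in states for a in range(4)}

    def step(s, a):
        # Deterministic move, clamped at the grid border (assumed dynamics).
        r = min(max(s[0] + moves[a][0], 0), n_rows - 1)
        c = min(max(s[1] + moves[a][1], 0), n_cols - 1)
        nxt = (r, c)
        reward = 1.0 if nxt == goal else -0.01  # assumed reward shaping
        return nxt, reward, nxt == goal

    def greedy(s):
        return max(range(4), key=lambda a: Q[(s, a)])

    def choose(s):
        # Epsilon-greedy behaviour policy.
        return rng.randrange(4) if rng.random() < epsilon else greedy(s)

    for _ in range(episodes):
        traces = {}  # eligibility trace per visited (state, action) pair
        s, a = start, choose(start)
        for _ in range(max_steps):
            s2, reward, done = step(s, a)
            a2 = choose(s2)
            best = max(Q[(s2, b)] for b in range(4))
            next_is_greedy = Q[(s2, a2)] == best
            # One-step Q-learning error for the current transition.
            delta = reward + (0.0 if done else gamma * best) - Q[(s, a)]
            traces[(s, a)] = traces.get((s, a), 0.0) + 1.0
            # Propagate the error to all recently visited pairs.
            for key, e in traces.items():
                Q[key] += alpha * delta * e
            # Decay traces after a greedy choice, cut them after exploration.
            if next_is_greedy:
                traces = {k: gamma * lam * e for k, e in traces.items()}
            else:
                traces = {}
            if done:
                break
            s, a = s2, a2

    # Read out the greedy path from the learned table.
    path, s = [start], start
    while s != goal and len(path) < n_rows * n_cols:
        s, _, _ = step(s, greedy(s))
        path.append(s)
    return Q, path


if __name__ == "__main__":
    _, path = q_lambda_path()
    print(path)
```

Setting `lam=0` reduces the update to one-step Q-learning, which makes it easy to compare how quickly reward information reaches early states with and without traces.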
He Hong (何宏); Li Yu (李宇); Zhang Zhihong (张志宏)
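School of Automation, Tianjin University of Technology; Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, Tianjin 300384
School of Automation, Tianjin University of Technology; Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, Tianjin 300384
Transmission and Broadcasting Department, Tianjin Radio and Television Station, Tianjin 300072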
Information technology and safety science
Wall-climbing robots; path planning; tracking Q(λ) algorithm
Industrial Technology Innovation (《工业技术创新》), 2015 (3)
pp. 267-271 (5 pages)
Tianjin Innovation Team Training Program for Institutions of Higher Education (Project No. TD12-5015)