Computer Technology and Development, 2024, Vol. 34, Issue 5: 133-140, 8. DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0051
基于相似度加权的无模型元强化学习方法
Model-agnostic Meta Reinforcement Learning Based on Similarity Weighting
Abstract
Reinforcement learning has achieved excellent performance in games and robot control. To further improve training efficiency, meta-learning has been extended to reinforcement learning, and the resulting meta-reinforcement learning has become a research hotspot. The quality of the meta-knowledge is the key factor determining the effectiveness of meta-reinforcement learning; gradient-based meta-reinforcement learning takes the initial parameters of the model as meta-knowledge to guide subsequent learning. To improve the quality of this meta-knowledge, we propose a general meta-reinforcement learning method that uses weighting to make each subtask's contribution to the training outcome explicit during training. The proposed method takes the similarity between the gradient update vector obtained from each subtask and the gradient update vector obtained from the overall task set as the update weight. This modifies the gradient update process, improves the quality of the meta-knowledge encoded in the initial parameters, and gives the trained model a good starting point for solving new tasks. The method can be applied to gradient-based reinforcement learning to solve new tasks quickly with a small number of samples. In experiments on 2D navigation and locomotion tasks, the proposed method outperforms the other baseline algorithms, which supports the rationale for the weighting mechanism.
Keywords
meta-learning / reinforcement learning / meta-reinforcement learning / gradient descent / model-agnostic
Classification
Information Technology and Security Science
Cite this article
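Zhao Chunyu (赵春宇), Lai Jun (赖俊), Chen Xiliang (陈希亮), Zhang Renwen (张人文). Model-agnostic Meta Reinforcement Learning Based on Similarity Weighting [J]. Computer Technology and Development, 2024, 34(5): 133-140, 8.
Funding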
National Natural Science Foundation of China (61806221)
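
Illustrative sketch
The weighting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract only says that the similarity between each subtask's gradient update vector and the overall task set's update vector is used as the update weight. The choice of cosine similarity, the clipping of negative similarities to zero, the normalisation of weights to sum to 1, the use of the plain mean as the "overall" update, and all function names (cosine_similarity, similarity_weights, weighted_meta_step) are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_weights(task_updates):
    """Weight each task by how well its update agrees with the overall update.

    Assumptions (not taken from the paper): the overall update is the plain
    mean of the per-task updates, negative similarities are clipped to zero,
    and the weights are normalised to sum to 1.
    """
    n = len(task_updates)
    dim = len(task_updates[0])
    mean_update = [sum(u[k] for u in task_updates) / n for k in range(dim)]
    sims = [max(0.0, cosine_similarity(u, mean_update)) for u in task_updates]
    total = sum(sims)
    if total == 0.0:
        return [1.0 / n] * n  # no task agrees with the mean: uniform weights
    return [s / total for s in sims]


def weighted_meta_step(theta, task_updates, meta_lr=0.1):
    """One gradient-descent step on the initial parameters theta.

    The per-task gradient vectors are combined with similarity weights
    instead of the uniform average used in an unweighted meta-update.
    """
    weights = similarity_weights(task_updates)
    combined = [
        sum(w * u[k] for w, u in zip(weights, task_updates))
        for k in range(len(theta))
    ]
    return [t - meta_lr * c for t, c in zip(theta, combined)]


# Toy example: two tasks agree on the update direction, one opposes it.
theta = [0.0, 0.0]
task_updates = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
weights = similarity_weights(task_updates)
new_theta = weighted_meta_step(theta, task_updates)
```

In this toy example the two agreeing tasks share the weight equally and the opposing task receives weight zero, so the step follows the majority direction at full size rather than the smaller uniform average. In a real meta-reinforcement learning setup the vectors would be flattened policy-gradient estimates from each sampled task.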