Model-agnostic Meta Reinforcement Learning Based on Similarity Weighting

Zhao Chunyu (赵春宇), Lai Jun (赖俊), Chen Xiliang (陈希亮), Zhang Renwen (张人文)

Computer Technology and Development, 2024, Vol. 34, Issue 5: 133-140 (8 pages). DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0051

Model-agnostic Meta Reinforcement Learning Based on Similarity Weighting

Zhao Chunyu¹, Lai Jun¹, Chen Xiliang¹, Zhang Renwen¹

Author information

  • 1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing, Jiangsu 210007

Abstract

Reinforcement learning has achieved excellent performance in games and robot control. To further improve training efficiency, meta-learning has been extended to reinforcement learning, and the resulting meta-reinforcement learning has become a research hotspot. The quality of the meta-knowledge is the key factor determining the effectiveness of meta-reinforcement learning; gradient-based meta-reinforcement learning takes the initial parameters of the model as the meta-knowledge that guides subsequent learning. To improve the quality of this meta-knowledge, we propose a general meta-reinforcement learning method that uses weighting to make each subtask's contribution to the training effect explicit during training. The proposed method uses the similarity between the gradient update vector obtained from each subtask and the gradient update vector obtained from the overall task set as the update weight. This improves the gradient update process and the quality of the meta-knowledge carried by the initial parameters, so that the trained model starts from a good point when solving a new task. The method can be applied to gradient-based reinforcement learning to solve new tasks quickly with a small number of samples. In experiments on 2D navigation and locomotion tasks, the proposed method outperforms the other benchmark algorithms, which demonstrates the rationality of the weighting mechanism.
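
The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several details here are assumptions made only for illustration: the overall update of the task set is taken as the plain mean of the per-task update vectors, similarity is measured with cosine similarity, and the similarities are turned into weights with a softmax. The function names (`cosine_similarity`, `similarity_weighted_update`, `apply_meta_update`) and the `meta_lr` parameter are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_weighted_update(task_updates):
    """Combine per-task update vectors into a single weighted meta-update.

    task_updates: list of equal-length lists, one update vector per subtask.
    Returns (weights, meta_update).
    """
    n_tasks = len(task_updates)
    dim = len(task_updates[0])
    # Assumption: the overall update of the task set is the unweighted mean.
    mean_update = [sum(u[i] for u in task_updates) / n_tasks for i in range(dim)]
    # Similarity between each subtask's update and the overall update.
    sims = [cosine_similarity(u, mean_update) for u in task_updates]
    # Assumption: softmax normalization, so weights are positive and sum to 1.
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the per-task updates.
    meta_update = [
        sum(w * u[i] for w, u in zip(weights, task_updates)) for i in range(dim)
    ]
    return weights, meta_update


def apply_meta_update(theta, meta_update, meta_lr=0.1):
    """One outer-loop step on the initial parameters (the meta-knowledge).

    Update vectors are treated as improvement directions, so they are added.
    """
    return [t + meta_lr * d for t, d in zip(theta, meta_update)]


# Toy example: two subtasks roughly agree, one points the opposite way.
updates = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
weights, meta_update = similarity_weighted_update(updates)
theta = apply_meta_update([0.0, 0.0], meta_update)
```

In this toy example the third subtask disagrees with the overall direction, so it receives the smallest weight and the combined update leans toward the two subtasks that agree. Whether the paper normalizes the similarities this way is not stated in the abstract.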

Key words

meta-learning/reinforcement learning/meta-reinforcement learning/gradient descent/model-agnostic

Classification

Information Technology and Security Science

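Cite this article

Zhao Chunyu, Lai Jun, Chen Xiliang, Zhang Renwen. Model-agnostic Meta Reinforcement Learning Based on Similarity Weighting[J]. Computer Technology and Development, 2024, 34(5): 133-140.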

Funding

National Natural Science Foundation of China (61806221)

Computer Technology and Development, ISSN 1673-629X
