Meta Reinforcement Learning Based on Skill Discovery

Wu Jiawei (伍家威) 1, Hao Zhifeng (郝志峰) 2

Journal of Guangdong University of Technology, 2025, Vol. 42, Issue (3): 52-61, 10. DOI: 10.12052/gdutxb.240041

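Author information

  • 1. School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong 510006
  • 2. School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong 510006; College of Science, Shantou University, Shantou, Guangdong 515063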

Abstract

In complex robot control environments, meta reinforcement learning (Meta-RL) has become a key technique: it leverages prior experience to tackle unseen, long-horizon tasks with sparse rewards. Skill-based Meta-RL methods aim to extract useful skills from task contexts, helping agents adapt quickly to new environments during meta-testing. However, the skills learned by existing methods lack generality and adaptability, which limits their performance across meta-testing task sets. To address this, this paper proposes a Skill Discovery Meta-RL (SDMRL) approach that learns more useful skills in the absence of reward functions. The SDMRL objective is formalized as maximizing an information-theoretic objective with maximum entropy policies, enabling agents to learn valuable skills and skill priors from unstructured data in an unsupervised manner. Experimental results on continuous control tasks such as Maze Navigation show that SDMRL is more effective than previous meta reinforcement learning methods, and that the learned skills handle long-horizon, complex, sparse-reward tasks well.

Key words

meta-reinforcement learning/reinforcement learning/skill discovery

Classification

Computer Science and Automation

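Cite this article

Wu Jiawei, Hao Zhifeng. Meta Reinforcement Learning Based on Skill Discovery[J]. Journal of Guangdong University of Technology, 2025, 42(3): 52-61, 10.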

Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0111501)

National Science Fund for Excellent Young Scholars (62122022)

National Natural Science Foundation of China (61876043, 61976052, 62206064)

Journal of Guangdong University of Technology

1007-7162
