A Review of Offline Reinforcement Learning Based on Representation Learning

Wang Xuesong 1, Wang Rongrong 1, Cheng Yuhu 1

Acta Automatica Sinica, 2024, Vol. 50, Issue 6: 1104-1128 (25 pages). DOI: 10.16383/j.aas.c230546

Author information

  • 1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116

Abstract

Reinforcement learning (RL), which learns an optimal policy through online interaction between an agent and its environment, has recently become an important tool for solving perceptual decision-making problems in complex environments. However, online data collection can raise safety, time, or cost concerns, which greatly limits the practical application of reinforcement learning. Meanwhile, handling high-dimensional input data has also become a significant challenge for reinforcement learning because raw data are intricate and multifaceted. Fortunately, offline reinforcement learning based on representation learning can learn a policy solely from historical experience data, without interacting with the environment. It uses representation learning techniques to map the features of the offline dataset into low-dimensional vectors, which are then used to train the offline reinforcement learning model. This data-driven paradigm offers a new opportunity to realize artificial general intelligence. To this end, this paper comprehensively reviews recent research on offline reinforcement learning based on representation learning. First, the problem setup of offline reinforcement learning is given. Then, existing techniques are summarized from three aspects: methodologies, benchmarks, and offline policy evaluation and hyperparameter selection. Moreover, research trends of offline reinforcement learning in industry, recommendation systems, intelligent driving, and other fields are introduced. Finally, conclusions are drawn, and the key challenges and future development trends of offline reinforcement learning based on representation learning are discussed, so as to provide a valuable reference for subsequent studies.

Key words

Reinforcement learning (RL) / offline reinforcement learning / representation learning / historical experience data / distribution shift

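Cite this article

Wang Xuesong, Wang Rongrong, Cheng Yuhu. A Review of Offline Reinforcement Learning Based on Representation Learning [J]. Acta Automatica Sinica, 2024, 50(6): 1104-1128.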

Funding

Supported by the National Natural Science Foundation of China (62373364, 62176259) and the Key Research and Development Program of Jiangsu Province (BE2022095)

Acta Automatica Sinica

OA, Peking University Core Journal, CSTPCD

ISSN 0254-4156
