Survey of Research on Offline Reinforcement Learning

Chen Siqi, Geng Jie, Wang Yunfei, Yu Weichi, Zhao Jianing, Wang Shichao

Radio Communications Technology (无线电通信技术), 2024, Vol. 50, Issue 5: 831-842 (12 pages). DOI: 10.3969/j.issn.1003-3114.2024.05.001

Survey of Research on Offline Reinforcement Learning

Chen Siqi (1), Geng Jie (2), Wang Yunfei (1), Yu Weichi (1), Zhao Jianing (3), Wang Shichao (1)

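Author information

  • 1. School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400074, China
  • 2. Chest Hospital, Tianjin University, Tianjin 300072, China
  • 3. College of Intelligence and Computing, Tianjin University, Tianjin 300072, China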

Abstract

Offline reinforcement learning is an emerging paradigm that learns from large amounts of previously collected data without active interaction with the environment. It shows high potential and value, especially in high-risk fields such as healthcare and autonomous driving. This review covers, in turn, the basic concepts of offline reinforcement learning, its core issues, and its main methods, with a focus on strategies for mitigating distributional shift. These include constraining the target policy to align with the behavior policy, value function constraints, quantification of model uncertainty, and model-based offline reinforcement learning methods. Finally, the article discusses current simulation environments for offline reinforcement learning and significant application scenarios.

Keywords

reinforcement learning / offline reinforcement learning / automated decision-making / extrapolation error

Classification

Electronic and Information Engineering

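Cite this article

Chen Siqi, Geng Jie, Wang Yunfei, Yu Weichi, Zhao Jianing, Wang Shichao. Survey of Research on Offline Reinforcement Learning[J]. Radio Communications Technology, 2024, 50(5): 831-842.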

Funding

National Natural Science Foundation of China (61602391)

Tianjin Science and Technology Plan Project (22JCZDJC00580)

Journal: Radio Communications Technology (无线电通信技术)

Open access; Peking University Core Journal

ISSN 1003-3114
