
Robust Offline Actor-Critic With On-policy Regularized Policy Evaluation

Shuo Cao, Xuesong Wang, Yuhu Cheng

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, Issue 12: 2497-2511, 15. DOI: 10.1109/JAS.2024.124494


Author information

  • 1. Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, and the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China

Keywords

Offline reinforcement learning; off-policy QL-style; on-policy SARSA-style; policy evaluation (PE); Q-value estimation


Cite this article

Shuo Cao, Xuesong Wang, Yuhu Cheng. Robust Offline Actor-Critic With On-policy Regularized Policy Evaluation[J]. IEEE/CAA Journal of Automatica Sinica, 2024, 11(12): 2497-2511, 15.

Funding

This work was supported in part by the National Natural Science Foundation of China (62176259, 62373364) and the Key Research and Development Program of Jiangsu Province (BE2022095).

IEEE/CAA Journal of Automatica Sinica (OA, CSTPCD, EI)

ISSN: 2329-9266
