
Robust Offline Actor-Critic With On-policy Regularized Policy Evaluation

Shuo Cao, Xuesong Wang, Yuhu Cheng

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, Issue 12: 2497-2511 (15 pages). DOI: 10.1109/JAS.2024.124494


Author information

1. Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, and the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China

Keywords

Offline reinforcement learning; off-policy QL-style; on-policy SARSA-style; policy evaluation (PE); Q-value estimation


Cite this article

Shuo Cao, Xuesong Wang, Yuhu Cheng. Robust Offline Actor-Critic With On-policy Regularized Policy Evaluation[J]. IEEE/CAA Journal of Automatica Sinica, 2024, 11(12): 2497-2511.

Funding

This work was supported in part by the National Natural Science Foundation of China (62176259, 62373364) and the Key Research and Development Program of Jiangsu Province (BE2022095).

IEEE/CAA Journal of Automatica Sinica

OA / CSTPCD / EI

ISSN: 2329-9266
