Optimal Control of Unknown Stochastic Systems Based on Model-Free Policy Gradient Reinforcement Learning

Du Chenglong, Han Jie, Li Fanbiao, Gui Weihua

Acta Automatica Sinica, 2025, Vol. 51, Issue 10: 2245-2255, 11 pp. DOI: 10.16383/j.aas.c250156

Model-free Policy Gradient-based Reinforcement Learning Algorithms for Optimal Control of Unknown Stochastic Systems

Author information

  • 1. School of Automation, Central South University, Changsha 410083

Abstract

This paper investigates the optimal control problem for a class of Markov stochastic jump systems (MSJSs) with unknown dynamics and proposes two novel model-free policy gradient (PG)-based reinforcement learning (RL) algorithms. First, for MSJSs with partially unknown model information, an analytical form of the model-free PG is derived from the sampled data of the MSJS and the solutions to the coupled Lyapunov equations, and a partially model-free PG-based RL optimal control algorithm is proposed in which the predefined performance index is minimized directly. Because the data needed to solve the coupled Lyapunov equations and to calculate the PG can be extracted from the same trajectory of sampled system data, no additional sampling data need to be collected, and the sample complexity of the algorithm is significantly reduced. Furthermore, to completely eliminate the dependence on the model information of MSJSs, the PG is estimated by randomly perturbing the feedback gain, and a completely model-free PG-based RL algorithm is proposed to achieve optimal control of MSJSs with completely unknown dynamics. Finally, simulation results are presented to demonstrate the efficiency and superiority of the two proposed model-free PG-based RL optimal control algorithms.
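The idea of estimating the PG by randomly perturbing the feedback gain can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scalar two-mode system, its parameters (`A`, `B`, `P`, `Q`, `R`), the finite-horizon sampled cost, and the function names are hypothetical, and the estimator is a generic two-point zeroth-order scheme that may differ from the authors' algorithm. The learner only queries `rollout_cost`; it never reads the system parameters.

```python
import math
import random

# Hypothetical two-mode scalar Markov jump linear system (illustrative values only).
A = [1.2, 0.8]            # dynamics a_m for each mode m (hidden from the learner)
B = [1.0, 0.5]            # input gains b_m (hidden from the learner)
P = [[0.9, 0.1],          # mode transition probabilities P[m][m']
     [0.3, 0.7]]
Q, R = 1.0, 0.1           # quadratic stage-cost weights


def rollout_cost(gains, seed, horizon=30, episodes=10):
    """Average sampled cost under u = -gains[mode] * x (finite horizon, an assumption)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        x, mode = rng.gauss(0.0, 1.0), 0
        for _ in range(horizon):
            u = -gains[mode] * x
            total += Q * x * x + R * u * u
            x = A[mode] * x + B[mode] * u + 0.05 * rng.gauss(0.0, 1.0)
            mode = 0 if rng.random() < P[mode][0] else 1
    return total / episodes


def estimate_gradient(gains, rng, radius=0.05, samples=8):
    """Two-point zeroth-order gradient estimate from randomly perturbed gains."""
    dim = len(gains)
    grad = [0.0] * dim
    for _ in range(samples):
        # Draw a random direction on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        direction = [d / norm for d in direction]
        # Reuse one seed for both perturbed rollouts (common random numbers).
        seed = rng.randrange(10**9)
        plus = rollout_cost([k + radius * d for k, d in zip(gains, direction)], seed)
        minus = rollout_cost([k - radius * d for k, d in zip(gains, direction)], seed)
        scale = dim * (plus - minus) / (2.0 * radius * samples)
        grad = [g + scale * d for g, d in zip(grad, direction)]
    return grad


def train(initial_gains, iterations=40, step=0.02, seed=0):
    """Plain gradient descent on the gains using only sampled costs."""
    rng = random.Random(seed)
    gains = list(initial_gains)
    for _ in range(iterations):
        grad = estimate_gradient(gains, rng)
        gains = [k - step * g for k, g in zip(gains, grad)]
    return gains


start = [0.5, 0.5]
learned = train(start)
initial_cost = rollout_cost(start, seed=12345, episodes=200)
final_cost = rollout_cost(learned, seed=12345, episodes=200)
print("initial cost:", initial_cost, "final cost:", final_cost, "gains:", learned)
```

Sharing one seed between the two perturbed rollouts is a variance-reduction choice in this sketch; a single-point estimator that uses only `plus` would also be model-free but noisier.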

Key words

Stochastic systems/optimal control/unknown dynamics/policy gradient/reinforcement learning

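Cite this article

Du Chenglong, Han Jie, Li Fanbiao, Gui Weihua. Model-free Policy Gradient-based Reinforcement Learning Algorithms for Optimal Control of Unknown Stochastic Systems[J]. Acta Automatica Sinica, 2025, 51(10): 2245-2255.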

Funding

Supported by National Natural Science Foundation of China (62303492, 62533005, 62222317, 62473383), Natural Science Foundation of Hunan Province (2025JJ40056, 2023JJ40765), and Guangdong Basic and Applied Basic Research Foundation (2024A1515240069)

Acta Automatica Sinica

OA, Peking University Core Journal (北大核心)

ISSN 0254-4156
