
Research Advances on Data-driven Adaptive Critic Control

Wang Ding, Zhao Mingming, Liu Derong, Qiao Junfei, Song Shijie

Acta Automatica Sinica, 2025, Vol. 51, Issue 6: 1170-1190 (21 pages). DOI: 10.16383/j.aas.c240706

Research Advances on Data-driven Adaptive Critic Control

Wang Ding ¹, Zhao Mingming ¹, Liu Derong ², Qiao Junfei ¹, Song Shijie ³

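Author information

1. School of Information Science and Technology, Beijing University of Technology, Beijing 100124; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124; Beijing Laboratory of Smart Environmental Protection, Beijing 100124; Beijing Institute of Artificial Intelligence, Beijing 100124
2. School of Automation and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen 518055
3. School of Smart City and Transportation, Southwest Jiaotong University, Chengdu 611756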

Abstract

The fusion and development of optimal control and artificial intelligence have yielded adaptive dynamic programming (ADP) methods, which are primarily constructed based on the actor-critic design. By integrating dynamic programming theory, reinforcement learning mechanisms, neural network technologies, and function optimization algorithms, ADP has achieved significant progress in solving decision-making and control problems for large-scale complex nonlinear systems. However, the unknown parameters and uncertain disturbances of actual systems often make it difficult to establish accurate mathematical models, posing challenges to the design of optimal controllers. In recent years, data-driven ADP methods with strong self-learning and adaptive capabilities have received widespread attention. These methods can design stable, safe, and reliable optimal controllers for complex nonlinear systems using only the input-output data of the system, without relying on dynamical models, which aligns with the trend of intelligent automation. This paper comprehensively reviews the algorithm implementation, theoretical characteristics, and related applications of data-driven ADP methods, emphasizing the latest research progress, including online Q-learning, value-iteration-based Q-learning, policy-iteration-based Q-learning, accelerated Q-learning, transfer Q-learning, tracking Q-learning, safe Q-learning, and game Q-learning. This paper also covers the analysis of data learning paradigms, stability, convergence, and optimality. Furthermore, in order to enhance learning efficiency and control performance, this paper designs some improved critic schemes and utility functions. Finally, taking wastewater treatment processes as the background, this paper summarizes the application effects and existing issues of data-driven ADP approaches in practical industrial systems, and outlines several future research directions.

Key words

Adaptive critic control / adaptive dynamic programming / data-driven design / online Q-learning / iterative Q-learning

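Cite this article

Wang Ding, Zhao Mingming, Liu Derong, Qiao Junfei, Song Shijie. Research Advances on Data-driven Adaptive Critic Control [J]. Acta Automatica Sinica, 2025, 51(6): 1170-1190.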

Funding

Supported by National Natural Science Foundation of China (62222301, 62473012, 62021003) and National Science and Technology Major Project (2021ZD0112302)

Acta Automatica Sinica

Open access; Peking University Core Journal

ISSN: 0254-4156
