Acta Automatica Sinica, 2025, Vol. 51, Issue 6: 1170-1190, 21. DOI: 10.16383/j.aas.c240706
Research Advances on Data-driven Adaptive Critic Control
Abstract
The fusion of optimal control and artificial intelligence has yielded adaptive dynamic programming (ADP) methods, which are primarily built on the actor-critic design. By integrating dynamic programming theory, reinforcement learning mechanisms, neural network technologies, and function optimization algorithms, ADP has achieved significant progress in solving decision-making and control problems for large-scale complex nonlinear systems. However, the unknown parameters and uncertain disturbances of actual systems often make it difficult to establish accurate mathematical models, which poses challenges to the design of optimal controllers. In recent years, data-driven ADP methods with strong self-learning and adaptive capabilities have received widespread attention. These methods can design stable, safe, and reliable optimal controllers for complex nonlinear systems using only the input-output data of the system, without relying on dynamical models, which aligns with the trend of intelligent automation. This paper comprehensively reviews the algorithm implementation, theoretical characteristics, and related applications of data-driven ADP methods, emphasizing the latest research progress, including online Q-learning, value-iteration-based Q-learning, policy-iteration-based Q-learning, accelerated Q-learning, transfer Q-learning, tracking Q-learning, safe Q-learning, and game Q-learning. This paper also covers the analysis of data learning paradigms, stability, convergence, and optimality. Furthermore, to enhance learning efficiency and control performance, this paper designs some improved critic schemes and utility functions. Finally, against the background of wastewater treatment processes, this paper summarizes the application effects and existing issues of data-driven ADP approaches in practical industrial systems, and outlines several future research directions.
Key words
Adaptive critic control / adaptive dynamic programming / data-driven design / online Q-learning / iterative Q-learning
Cite this article
Wang Ding, Zhao Mingming, Liu Derong, Qiao Junfei, Song Shijie. Research advances on data-driven adaptive critic control [J]. Acta Automatica Sinica, 2025, 51(6): 1170-1190, 21.
Funding
Supported by the National Natural Science Foundation of China (62222301, 62473012, 62021003) and the National Science and Technology Major Project (2021ZD0112302)