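Linear Quadratic Tracking Control Algorithm for Time-delay Systems Based on Off-policy Q-learning (基于Off-policy Q-学习的时延系统线性二次型跟踪控制算法)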

Liu Wen, Yu Baoguo, Hao Jing, Wang Qing (刘文, 蔚保国, 郝菁, 王卿)

Radio Engineering (无线电工程), 2026, Vol. 56, Issue 1: 166-176 (11 pages). DOI: 10.3969/j.issn.1003-3106.2026.01.018

Linear Quadratic Tracking Control Algorithm for Time-delay Systems Based on Off-policy Q-learning

Liu Wen¹, Yu Baoguo¹, Hao Jing¹, Wang Qing¹

Author information

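  • 1. State Key Laboratory of Satellite Navigation System and Equipment Technology, Shijiazhuang, Hebei 050081, China; The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang, Hebei 050081, China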

Abstract

A data-driven algorithm is proposed for the linear quadratic tracking (LQT) control problem of linear discrete-time systems with unknown model parameters; it also handles the control input time delays commonly encountered in industrial processes. The control problem for time-delay systems is first characterized and a model-driven reinforcement learning framework is constructed. On this basis, a Smith predictor is introduced so that neither the mathematical model parameters nor state data are needed, and an LQT control algorithm for time-delay systems based on On-policy Q-learning is proposed. Because exploration noise affects the learning results of the On-policy Q-learning algorithm, an Off-policy algorithm is then adopted to solve the same problem. The Bellman equation used in the Q-learning algorithm is improved accordingly, yielding a data-driven Off-policy Q-learning algorithm that is unaffected by exploration noise and provides unbiased solutions. Theoretical analysis and simulation experiments show that tracking control of time-delay systems is achieved without relying on the system's mathematical model parameters or state data.
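As a rough illustration of why an Off-policy Q-function update can be insensitive to exploration noise, the sketch below runs a least-squares Q-function policy iteration on a plain regulation problem. It is not the algorithm of the paper: it assumes no input delay, no reference tracking, no Smith predictor, and directly measured states. The plant matrices `A` and `B`, the weights `Qx` and `R`, and all function names are hypothetical choices made only for this example. The plant is used solely to generate data; the learner sees only the recorded samples.

```python
import numpy as np

def quad_features(z):
    """Features such that z^T H z == quad_features(z) @ theta,
    where theta holds the upper-triangular entries of symmetric H."""
    i, j = np.triu_indices(z.size)
    return np.where(i == j, 1.0, 2.0) * z[i] * z[j]

def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    H[j, i] = theta
    return H

def off_policy_q_iteration(data, Qx, R, K0, n_iter=15):
    """Policy iteration on the Q-function using one fixed batch of data.

    Policy evaluation solves, in the least-squares sense,
        z_k^T H z_k - z'_{k+1}^T H z'_{k+1} = x_k^T Qx x_k + u_k^T R u_k,
    with z_k = [x_k; u_k] (recorded behavior action, noise included) and
    z'_{k+1} = [x_{k+1}; -K x_{k+1}] (action of the target policy).
    """
    n, m = Qx.shape[0], R.shape[0]
    K = K0
    for _ in range(n_iter):
        Phi, y = [], []
        for x, u, x_next in data:
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, -K @ x_next])
            Phi.append(quad_features(z) - quad_features(z_next))
            y.append(x @ Qx @ x + u @ R @ u)
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, n + m)
        # Policy improvement: u = -K x with K = H_uu^{-1} H_ux.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K

# Hypothetical stable plant (assumption); it only generates the data batch.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qx = np.eye(2)
R = np.array([[1.0]])

rng = np.random.default_rng(0)
x = rng.normal(size=2)
data = []
for _ in range(200):
    u = rng.normal(size=1)  # behavior policy: pure exploration noise
    x_next = A @ x + B @ u
    data.append((x, u, x_next))
    x = x_next

# K0 = 0 is admissible here because the assumed plant is open-loop stable.
K_learned = off_policy_q_iteration(data, Qx, R, K0=np.zeros((1, 2)))

# Model-based reference gain from Riccati value iteration, for comparison only.
P = Qx.copy()
for _ in range(2000):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Qx + A.T @ P @ (A - B @ G)
K_star = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

print("learned gain:  ", K_learned)
print("reference gain:", K_star)
```

In this sketch the exploration noise enters only through the recorded action `u`, which appears explicitly in `z`, while the next-step action always comes from the target policy. For a deterministic plant the evaluation equation therefore holds exactly for the true Q-function matrix, and the same batch can be reused for every target gain.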

Key words

time-delay systems/reinforcement learning/Off-policy/data-driven/output feedback

Classification

Information Technology and Security Science

Cite this article

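Liu Wen, Yu Baoguo, Hao Jing, Wang Qing. Linear Quadratic Tracking Control Algorithm for Time-delay Systems Based on Off-policy Q-learning[J]. Radio Engineering, 2026, 56(1): 166-176, 11.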

Funding

Innovation Capability Promotion Plan of Hebei Province (24460801D)

Radio Engineering (无线电工程)

ISSN 1003-3106
