Batch Process Control Based on Twin-actor Deep Deterministic Policy Gradient Algorithm

Ma Junwei (马军伟), Xu Chen (徐琛), Tao Hongfeng (陶洪峰), Yang Huizhong (杨慧中)

Information and Control (信息与控制), 2023, Vol. 52, Issue (6): 773-783, 810, 12. DOI: 10.13976/j.cnki.xk.2023.2488

Batch Process Control Based on Twin-actor Deep Deterministic Policy Gradient Algorithm

Ma Junwei¹, Xu Chen¹, Tao Hongfeng¹, Yang Huizhong¹

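Author information

1. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, Jiangsu, China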

Abstract

We propose a model-free batch process control scheme based on reinforcement learning (RL). It addresses a weakness of conventional model-based control methods: because batch processes have complex nonlinear dynamics, the process models are inaccurate, which degrades control performance. First, the method addresses the overestimation of the value function in deep RL algorithms by training twin actors in parallel, which improves the learning efficiency of the algorithm. Second, an independent experience pool is established for each actor to maintain the independence of the twin actors. Furthermore, a novel reward function is established for the RL controller to guide the process back to the predetermined trajectory, and a delayed policy update method is introduced to mitigate the accumulation of temporal difference (TD) error during parameter updating. Finally, the effectiveness of the controller based on the twin-actor deep deterministic policy gradient algorithm for batch process control is demonstrated by simulating the penicillin fermentation process.
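Three of the ingredients named in the abstract (one experience pool per actor, a trajectory-tracking reward, and delayed policy updates) can be illustrated with a small scheduling sketch. This is not the authors' implementation: the names `ReplayBuffer`, `tracking_reward`, and `run_schedule`, the absolute-error reward, the delay of 2, and the random placeholder measurements are all assumptions made for illustration, and the actual network updates are replaced by counters.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience pool owned by a single actor (hypothetical helper)."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng):
        # Draw a minibatch without replacement, capped at the pool size.
        items = list(self.storage)
        return rng.sample(items, min(batch_size, len(items)))


def tracking_reward(output, setpoint):
    """Assumed reward: negative absolute deviation from the reference trajectory."""
    return -abs(output - setpoint)


def run_schedule(num_steps, policy_delay=2, capacity=100, seed=0):
    """Count how often the critic and each actor would be updated."""
    rng = random.Random(seed)
    buffers = [ReplayBuffer(capacity), ReplayBuffer(capacity)]  # one pool per actor
    critic_updates = 0
    actor_updates = [0, 0]
    for step in range(1, num_steps + 1):
        for buf in buffers:
            output = rng.uniform(0.0, 2.0)  # placeholder for a process measurement
            buf.add((step, output, tracking_reward(output, setpoint=1.0)))
        critic_updates += 1  # value estimate refreshed at every step
        if step % policy_delay == 0:  # delayed policy update
            for i, buf in enumerate(buffers):
                buf.sample(4, rng)  # each actor samples only from its own pool
                actor_updates[i] += 1
    return critic_updates, actor_updates, [len(b.storage) for b in buffers]
```

With `policy_delay=2`, each actor is updated half as often as the critic, so the policy is adjusted against a value estimate that has had more steps to settle.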

Key words

batch process/model unknown/reinforcement learning/actor-critic framework/policy gradient algorithm

Classification

Information Technology and Security Science

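Cite this article

Ma Junwei, Xu Chen, Tao Hongfeng, Yang Huizhong. Batch Process Control Based on Twin-actor Deep Deterministic Policy Gradient Algorithm [J]. Information and Control, 2023, 52(6): 773-783, 810, 12.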

Funding

National Natural Science Foundation of China (62103167)

Natural Science Foundation of Jiangsu Province (BK20210451)

Information and Control (信息与控制)

OA / Peking University Core Journals (北大核心) / CSCD / CSTPCD

ISSN: 1002-0411
