Information and Control (信息与控制), 2023, Vol. 52, Issue 6: 773-783, 810 (12 pages). DOI: 10.13976/j.cnki.xk.2023.2488
Batch Process Control Based on Twin-Actor Deep Deterministic Policy Gradient Algorithm
Abstract
Conventional model-based control methods often rely on inaccurate models when applied to batch processes, because the complex nonlinear dynamics of these processes are difficult to model, and this degrades control performance. To address this problem, we propose a model-free batch process control scheme based on reinforcement learning (RL). First, the method trains twin actors in parallel to mitigate the overestimation of the value function in deep RL algorithms and thereby improve learning efficiency. Second, an independent experience pool is established for each actor to maintain the independence of the twin actors. Furthermore, a novel reward function is designed for the RL controller to guide the process back to the predetermined trajectory, and a delayed policy update method is introduced to mitigate the accumulation of temporal difference (TD) error during parameter updates. Finally, the effectiveness of the controller based on the twin-actor deep deterministic policy gradient algorithm for batch process control is demonstrated by simulating the penicillin fermentation process.
Keywords
batch process / model unknown / reinforcement learning / actor-critic framework / policy gradient algorithm
Classification
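Computer and Automation
Cite this article
MA Junwei, XU Chen, TAO Hongfeng, YANG Huizhong (马军伟, 徐琛, 陶洪峰, 杨慧中). Batch process control based on twin-actor deep deterministic policy gradient algorithm [J]. Information and Control, 2023, 52(6): 773-783, 810.
Funding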
National Natural Science Foundation of China (62103167)
Natural Science Foundation of Jiangsu Province (BK20210451)