电工技术学报, 2024, Vol. 39, Issue 5: 1436-1453 (18 pages). DOI: 10.19595/j.cnki.1000-6753.tces.230015
基于分层约束强化学习的综合能源多微网系统优化调度
Optimal Scheduling of Integrated Energy Multi-Microgrid System Based on Hierarchical Constraint Reinforcement Learning
Abstract
Optimizing an integrated energy multi-microgrid system is a complex task: it involves numerous variables, data privacy requirements, and uncertainties in power generation and load, all of which make traditional mathematical optimization methods difficult to apply efficiently. Recently, many scholars have turned to deep reinforcement learning (DRL) methods, which are data-driven and adapt well to uncertainties in power generation and load. Nevertheless, convergence becomes harder as the system scale grows, and traditional DRL methods that handle constraints by adding penalty terms to the reward function may blur the boundary between objectives and constraints. This makes it difficult to ensure that constraints are fully satisfied and can lead to excessively conservative strategies or suboptimal solutions. To address these issues, this paper proposes a hierarchical constraint reinforcement learning optimization method.
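The contrast between the penalty-based treatment and an explicitly constrained formulation can be written out as follows. This is only an illustrative formulation: the symbols $r_t$ (reward), $c_t$ (constraint cost), $\beta$ (penalty coefficient), $\gamma$ (discount factor), $d$ (constraint budget), and $\pi$ (policy) are assumptions introduced here, not the paper's notation.

```latex
% Penalty-based reward shaping with a hand-tuned coefficient \beta > 0:
\tilde{r}_t = r_t - \beta\, c_t,
\qquad
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, \tilde{r}_t\right]

% Constrained formulation that keeps the objective and the constraint separate:
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, r_t\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, c_t\right] \le d
```

With a fixed $\beta$, a small value tolerates constraint violations while a large value pushes the policy toward overly conservative behavior, which is the trade-off described above.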
First, this paper proposes a hierarchical DRL optimization framework for multi-microgrid systems. The framework divides the optimization problem into an upper layer and a lower layer. The upper layer does not need the full operating status of each microgrid. Instead, it uses net load forecasts and energy storage state information to produce energy storage optimization strategies and power interaction strategies. In the lower layer, each microgrid autonomously optimizes the output of its internal devices through mathematical programming, based on its own status information and with the upper-layer strategy as a constraint. Cooperation between the two layers achieves overall optimization of the multi-microgrid system, combining the data-driven advantages of DRL with the solution accuracy of mathematical programming. Based on this hierarchical framework, a constraint DRL method is proposed that combines DRL with the Lagrange multiplier method. It transforms the constrained optimization problem into an unconstrained one, enabling the agent to find the optimal strategy while strictly satisfying the constraints. Compared with traditional centralized optimization methods, the proposed method responds dynamically to fluctuations in power generation and load, which meets online optimization requirements, and it protects microgrid data privacy because it does not require aggregating the status information of all microgrids. Compared with general DRL methods, the approach effectively solves the problem of constraint violation and significantly improves both convergence speed and accuracy.
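The Lagrange multiplier idea mentioned above can be illustrated on a one-dimensional toy problem. The sketch below is not the paper's algorithm: the function name `primal_dual_toy`, the quadratic reward, the limit of 2.0, and the step sizes are all hypothetical choices made for illustration. A single scalar "action" stands in for the policy, and the multiplier is adjusted automatically by dual ascent instead of being set by hand as a penalty coefficient.

```python
def primal_dual_toy(limit=2.0, lr_action=0.05, lr_lambda=0.05, steps=2000):
    """Toy primal-dual (Lagrangian) iteration; illustrative only.

    Problem: maximize reward(a) = -(a - 3)^2  subject to  a <= limit.
    Lagrangian: L(a, lam) = reward(a) - lam * (a - limit), with lam >= 0.
    """
    action = 0.0  # primal variable (stands in for the policy parameters)
    lam = 0.0     # Lagrange multiplier, kept non-negative

    for _ in range(steps):
        # Gradient of the Lagrangian with respect to the action.
        grad_action = -2.0 * (action - 3.0) - lam
        action += lr_action * grad_action  # primal ascent step

        # Positive violation means the constraint is currently broken.
        violation = action - limit
        # Projected dual ascent: raise lam when violated, lower it otherwise.
        lam = max(0.0, lam + lr_lambda * violation)

    return action, lam


if __name__ == "__main__":
    action, lam = primal_dual_toy()
    print(f"action={action:.3f}, lambda={lam:.3f}")
```

For this toy problem the KKT point is at action 2 with multiplier 2 (from $-2(a-3) - \lambda = 0$ at $a = 2$), and the iteration settles there. In a DRL setting the primal step would be a policy-gradient update and the violation would be estimated from sampled trajectories, which this sketch does not attempt to reproduce.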
The following conclusions can be drawn from the case studies: (1) A hierarchical design approach is proposed to simplify the optimization of multi-microgrid systems. The approach does not require information exchange between microgrids and only requires uploading net load and energy storage state information. Microgrids can solve their optimization problems independently and in parallel based on their own status information. When local status information is available, the approach provides real-time scheduling results consistent with the optimal solution. (2) The proposed approach combines data-driven principles with traditional methods, which simplifies the action space and the reward design. It effectively balances the fast solving ability of DRL with the solution accuracy of mathematical programming. Compared with traditional DRL methods, it significantly improves both convergence speed and accuracy. (3) The approach combines DRL with the Lagrange multiplier method to transform the constrained optimization problem into an unconstrained one. This ensures that the agent can find the optimal strategy while strictly satisfying the constraints, and it avoids the convergence difficulties and constraint violations caused by manually setting the penalty coefficient in traditional DRL methods. (4) The model is robust and adapts effectively to fluctuations in power generation and load, making rapid decisions on the power interactions of each microgrid.
Keywords
Multi-microgrid system / hierarchical constraint reinforcement learning / uncertainty / data privacy protection
Classification
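Information technology and security science
Cite this article
董雷, 杨子民, 乔骥, 陈盛, 王新迎, 蒲天骄. Optimal Scheduling of Integrated Energy Multi-Microgrid System Based on Hierarchical Constraint Reinforcement Learning[J]. 电工技术学报, 2024, 39(5): 1436-1453, 18.
Funding
Supported by the National Key Research and Development Program of China (2020YFB0905900) and the National Natural Science Foundation of China (52277098).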