Energy Management Strategy for Multi-virtual Power Plants With Self-optimization Based on Second-order Stochastic Dynamics

OA; 北大核心 (Peking University Core Journals); CSTPCD

Stochastic elements in distributed energy resources (DERs) cause the strategies of individual virtual power plants (VPPs) within a multi-virtual power plant (MVPP) system to change frequently. For any given entity, a pressing difficulty is how to perceive the trend in its own revenue when other entities' strategies change suddenly, and how to adjust its own strategy quickly in response. This paper proposes a self-optimizing energy management strategy for MVPPs based on second-order stochastic dynamics, aiming to improve each VPP's autonomy in responding to changes in other entities' strategies. First, to account for the heterogeneous operating characteristics of DERs, a VPP aggregate operation model is built around the adjustable space of the resources. Second, the stochastic nature of VPP strategy changes is described using random graphs. Third, a second-order stochastic dynamic equation (SDE) is used to extract information on the spontaneous evolution of the VPP revenue structure, and this information is used to correct a VPP's comprehensive revenue when other entities' strategies change. Fourth, the corrected revenue serves as the reward for an integrated soft actor-critic (ISAC) deep reinforcement learning algorithm, on which a multi-agent distributed solution framework is built. Finally, comparison experiments across multiple algorithms verify the self-optimization performance of the proposed strategy.

陈嘉琛;陈中;李冰融;刘汶瑜;潘俊迪

School of Electrical Engineering, Southeast University, Nanjing 210018, Jiangsu Province, China

Power and Electrical Engineering

multi-virtual power plant; self-optimization; aggregate operation model; second-order stochastic dynamics; multi-agent deep reinforcement learning

Proceedings of the CSEE (中国电机工程学报), 2024 (16)

Pages 6294-6306 (13 pages)

Project supported by the National Natural Science Foundation of China (52077035).

DOI: 10.13334/j.0258-8013.pcsee.232507
