Coordinated Optimization of Active and Reactive Power of Active Distribution Network Based on Safety Reinforcement Learning
Open access (OA); indexed in the Peking University Core Journals list (北大核心) and CSTPCD
A safe reinforcement learning method based on an offline policy is proposed. By training offline on a large volume of historical distribution network operating data, it removes the dependence of traditional optimization methods on complete and accurate models. First, using distribution network parameter information, an active and reactive power optimization model is formulated as a constrained Markov decision process (CMDP). Second, a new safe reinforcement learning method is designed based on primal-dual optimization; it maximizes the future discounted reward while minimizing the cost function. Finally, simulations are carried out on a distribution system. The results show that the proposed method can generate, online and from real-time observations of the distribution network, dispatch strategies that satisfy complex constraints and are economically beneficial.
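The primal-dual idea named in the abstract can be sketched with a few lines of bookkeeping. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the cost limit `cost_limit`, and the step size `lr` are hypothetical, and the policy update itself (the primal step) is left out.

```python
from typing import Sequence


def discounted_sum(values: Sequence[float], gamma: float) -> float:
    """Return sum_t gamma**t * values[t] for one trajectory."""
    total = 0.0
    for v in reversed(values):
        total = v + gamma * total
    return total


def lagrangian(reward_return: float, cost_return: float,
               lam: float, cost_limit: float) -> float:
    """Objective a primal (policy) step would try to maximize.

    The constraint violation (cost_return - cost_limit) is penalized
    with weight lam, the Lagrange multiplier.
    """
    return reward_return - lam * (cost_return - cost_limit)


def dual_step(lam: float, cost_return: float,
              cost_limit: float, lr: float) -> float:
    """Projected gradient ascent on the multiplier; lam stays non-negative."""
    return max(0.0, lam + lr * (cost_return - cost_limit))
```

When the discounted cost exceeds the limit, `dual_step` raises the multiplier, so the cost weighs more heavily in the next policy update; when the constraint holds, the multiplier decays toward zero and the reward term dominates.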
JIAO Hao; YIN Yanyan; WU Chen; LIU Jian; XU Chunlei; XU Xian; SUN Guoqiang
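Electric Power Research Institute, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 211103, Jiangsu, China; School of Electrical and Power Engineering, Hohai University, Nanjing 211100, Jiangsu, China; State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, Jiangsu, China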
Keywords: active distribution network; active and reactive power coordination optimization; safety reinforcement learning
Electric Power (《中国电力》), 2024, No. 3
Pages 43-50 (8 pages)
This work is supported by the National Natural Science Foundation of China (No. U1966205) and the Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (No. J2023121).