Journal of Shenyang University of Technology, 2025, Vol. 47, Issue (1): 29-36, 8. DOI: 10.7688/j.issn.1000-1646.2025.01.04
Time series data classification algorithm for distribution networks with missing data
Abstract
[Objective] As smart grids develop rapidly, the distribution network is a key link in power transmission and distribution, and effective management and analysis of its data is crucial for keeping the grid stable and improving the quality of power supply. Distribution network data, however, are diverse and complex, covering multiple dimensions such as users' electricity consumption behavior, weather conditions, basic equipment information, and marketing data. While these different types of data are collected and transmitted, values go missing because of interference such as magnetic field signals, noise signals, and redundant data. This makes it harder to monitor the operation of distribution networks and poses serious challenges for fault analysis, state assessment, and optimization decision-making.

[Methods] To improve the accuracy and efficiency of data processing, this paper proposes a time series data classification algorithm for distribution networks with missing data. Based on the distribution of the time series data in the distribution network, a smoothing algorithm is used to remove noise, which improves the accuracy and reliability of the data and mitigates the problems caused by redundant-data interference. Missing data are then filled incrementally: missing values are inferred from the inherent regularities of the time series and the correlation between adjacent data points, which preserves the integrity of the data while keeping the time series continuous and consistent. The missing data of the different time series are computed, and the high-dimensional and low-dimensional data state spaces are combined with univariate and multivariate time series. Dimension mapping is then used to obtain the dimensional factors of the data, achieving intra-cluster classification.

[Results] The experimental results show that the designed method fills in values close to the original data without redundancy, and that the classification time points are evenly distributed and follow a linear trend, which demonstrates efficient and stable data processing. After the time series data of the distribution networks are classified with the designed method, distribution network data of the same type are aggregated and do not interfere with each other. Noise data are significantly reduced, and the relative difference value (RDV) remains below 0.05. Specificity remains above 95.0% for data missing rates from 5% to 35%, significantly higher than the 91.5% and 92.0% achieved by the comparison methods.

[Conclusion] The designed method effectively addresses the challenges posed by missing data and improves the accuracy and efficiency of data processing through smoothing-based denoising, incremental filling, and dimension mapping. The experiments also verify that the method maintains high classification accuracy and converges quickly, which shows that it can handle missing data effectively and significantly improves the classification performance and operational stability of distribution network data. This research enriches the theory of distribution network data analysis and provides practical technical support for the operation and maintenance management of smart grids, and therefore has both theoretical value and practical significance.
Keywords
missing data / distribution network / dimension mapping / smoothing algorithm / multivariate time series / data classification / noise interference / dimensional factor
Classification
Information Technology and Security Science
Cite this article
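Xiao Zhanhui, Zhang Shiliang, Deng Lijuan, Xu Han. Time series data classification algorithm for distribution networks with missing data [J]. Journal of Shenyang University of Technology, 2025, 47(1): 29-36, 8.
Funding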
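National Natural Science Foundation of China (61308394)
Research project on the data center management system of the Platform Security Branch, China Southern Power Grid Digital Grid Research Institute (0002200000091292)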
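
Illustrative sketch (not from the paper). The abstract names smoothing-based denoising, filling of missing values from adjacent data points, and specificity as an evaluation metric, but it does not give the actual formulas. The Python sketch below shows one simple way these three steps could look, under loudly stated assumptions: a NaN-aware centered moving average stands in for the smoothing algorithm, linear interpolation between observed neighbors stands in for the incremental filling, and specificity is computed with its standard definition TN / (TN + FP). The function names `smooth_nan_aware`, `fill_missing`, and `specificity` are hypothetical, and the dimension-mapping and clustering stages are not covered.

```python
import numpy as np


def smooth_nan_aware(x, window=3):
    """Centered moving average that skips NaN values (assumed stand-in smoother).

    Positions that are missing (NaN) in the input stay NaN in the output,
    so that the filling step can handle them afterwards.
    """
    x = np.asarray(x, dtype=float)
    half = window // 2
    out = np.full_like(x, np.nan)
    for i in range(len(x)):
        if np.isnan(x[i]):
            continue  # leave gaps for the filling step
        segment = x[max(0, i - half): i + half + 1]
        out[i] = np.nanmean(segment)  # segment holds at least x[i], so never all-NaN
    return out


def fill_missing(x):
    """Fill NaN gaps by linear interpolation between observed neighbors
    (assumed stand-in for the paper's incremental filling)."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(len(x))
    known = ~np.isnan(x)
    if not known.any():
        raise ValueError("series has no observed values")
    return np.interp(idx, idx[known], x[known])


def specificity(y_true, y_pred):
    """Standard specificity TN / (TN + FP) for binary labels (0 = negative)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tn / (tn + fp)


# Hypothetical load readings with one missing sample.
series = [1.0, 2.0, np.nan, 4.0, 5.0]
cleaned = fill_missing(smooth_nan_aware(series, window=3))
```

The order here (smooth first, then fill) follows the order in which the abstract lists the steps; whether the paper applies them in exactly this sequence is an assumption.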