电工技术学报 (Transactions of China Electrotechnical Society), 2025, Vol. 40, Issue 11: 3667-3679, 13. DOI: 10.19595/j.cnki.1000-6753.tces.240795
Diagnosis and Interpretability Study of Wind Turbine Blade Icing under Consideration of Sample Imbalance Conditions
Abstract
Low-temperature, high-humidity winter conditions readily cause wind turbine blades to ice up, which seriously affects the power output and safe operation of the turbines. To avoid the increased fatigue loads and component vibration that icing causes, wind farms need to shut turbines down promptly according to the icing state of the blades. Accurate identification of blade icing status has therefore become a key part of keeping wind turbines operating safely in winter. However, current icing diagnosis methods rely on large amounts of time-series data for modeling and prediction. In practice, equipment and operating conditions make it difficult to collect enough monitoring data from iced blades, so data imbalance is widespread and continues to limit improvements in diagnostic accuracy. To address this problem, this paper proposes a fusion diagnostic model based on the conditional tabular generative adversarial network (CTGAN) and the light gradient boosting machine (LightGBM), with the aim of achieving high-performance blade icing diagnosis from a small number of training samples.

First, new mixed features are constructed from the original features using a sliding window algorithm. Second, the CTGAN model learns the data distribution of the real samples; its generator and discriminator are trained adversarially toward a Nash equilibrium, producing new samples that resemble the real ones. Then, the synthesized samples are fed into LightGBM to extract effective features and diagnose icing, and the LightGBM model is modified with a focal loss function to improve its ability to distinguish easily confused samples. Finally, attribution theory based on Shapley additive explanations (SHAP) is used to analyze the factors that influence icing.
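The sliding-window feature construction mentioned above can be illustrated with a minimal sketch. The abstract does not state which window statistics or window length the authors use, so the function name `sliding_window_features`, the chosen statistics (mean, standard deviation, range, first-to-last change), and the sample readings are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def sliding_window_features(series, window):
    """Build one feature row per time step once a full window is available.

    Hypothetical helper: the statistics below are illustrative choices,
    not the feature set used in the paper.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    rows = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        rows.append({
            "raw": chunk[-1],                   # original feature at this step
            "mean": mean(chunk),                # window average
            "std": pstdev(chunk),               # population std over the window
            "range": max(chunk) - min(chunk),   # spread within the window
            "delta": chunk[-1] - chunk[0],      # change across the window
        })
    return rows


# Made-up readings of a single monitored signal (not data from the paper).
power = [1.0, 2.0, 4.0, 7.0]
features = sliding_window_features(power, window=3)
```

Each row combines the raw value with statistics of its recent history, which is one plausible reading of "mixed features"; in a real pipeline the same transformation would be applied to every monitored channel before sample generation and classification.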
Simulation results on data from an actual wind farm show that the diagnostic accuracy of every algorithm improves after the mixed features are introduced, with the average diagnostic accuracy across models reaching 0.979. With the sample expansion algorithm, the accuracy of each model improves to varying degrees relative to training on the unexpanded, sample-deficient data. When the sample imbalance rate is 30%, the accuracy of the traditional logistic regression classifier improves by 11.02%, while the accuracy of LightGBM (Focal Loss) is 0.982, close to the accuracy obtained when samples are sufficient. As the sample imbalance rate decreases and the number of real iced samples shrinks further, the advantages of the sample expansion algorithm become increasingly apparent. When the sample imbalance rate is 10%, the accuracy of the logistic regression model improves by 13.55% compared with the unexpanded samples. When the sample imbalance rate is 5% and there are only 15 real iced samples, the accuracy of the logistic regression, KNN, XGBoost, and LightGBM models improves by 35.85%, 4.52%, 9.32%, and 9.18%, respectively, compared with the unexpanded samples. This indicates that CTGAN has good sample generation ability and can effectively learn the distribution of real samples even when little sample data is available.
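For reference, the focal loss behind the LightGBM (Focal Loss) variant can be written in its standard binary form. This is the general definition of focal loss rather than a formula quoted from the paper; the symbols $p$, $y$, $\alpha_t$, and $\gamma$ are notation introduced here, and the abstract does not report the values the authors chose.

```latex
% p: predicted probability of the iced class, y in {0, 1}: true label
p_t =
\begin{cases}
  p,     & y = 1 \\
  1 - p, & y = 0
\end{cases}
\qquad
\mathrm{FL}(p_t) = -\alpha_t \, (1 - p_t)^{\gamma} \, \log p_t
```

With $\gamma = 0$ and $\alpha_t = 1$ this reduces to ordinary cross-entropy. As an illustrative case with $\gamma = 2$, a confidently classified sample with $p_t = 0.9$ has its loss scaled by $(1 - 0.9)^2 = 0.01$, whereas a harder sample with $p_t = 0.6$ is scaled by $(1 - 0.6)^2 = 0.16$, so training concentrates on the samples that are easily confused.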
The simulation analysis supports the following conclusions: (1) The mixed features constructed with the sliding window algorithm significantly improve the classification ability of every model, and the LightGBM model combined with the mixed feature information shows clear advantages over the other models. (2) The sample generation model CTGAN effectively learns the distribution of real samples and, compared with other data augmentation methods, generates new samples that are more similar to the real ones. (3) Modifying the LightGBM model with the focal loss function improves its ability to distinguish easily confused samples. In addition, SHAP attribution theory is used to analyze the importance of each icing factor and to quantify the impact of key features on the diagnostic results, which improves the credibility of the model's diagnoses.
Keywords
Wind turbine blade / sample imbalance / generative adversarial networks / icing diagnosis / interpretability
Classification
Information technology and safety science
Cite this article
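Lü Yunlong (吕云龙), Hu Qin (胡琴), Hu Ziyuan (胡紫园), Wu Yufan (武雨凡), Lin Huiyao (林晖尧). Diagnosis and Interpretability Study of Wind Turbine Blade Icing under Consideration of Sample Imbalance Conditions [J]. 电工技术学报 (Transactions of China Electrotechnical Society), 2025, 40(11): 3667-3679, 13.
Funding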
Supported by the Fundamental Research Funds for the Central Universities (2023CDJYXTD-005) and the National Funded Postdoctoral Researcher Program (GZC20242120).