Input feature selection method for wind turbine fault diagnosis based on LightGBM-VIF-MIC-SFS
MA Liangyu, CHENG Dongyan, LIANG Shuyuan, GENG Yanzhu, DUAN Xinhui
Abstract:
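Data from wind turbine supervisory control and data acquisition (SCADA) systems are high-dimensional, redundant, and strongly correlated across features, which leads to large errors and low classification accuracy in wind turbine fault diagnosis. To address this, a three-stage feature selection method based on LightGBM-VIF-MIC-SFS is proposed. First, LightGBM is used to compute the importance of every feature and determine a preliminary feature space. Second, a correlation discrimination matrix is built from the variance inflation factor (VIF) and the maximal information coefficient (MIC); it is used to assess features that received similar importance in the first screening, and highly similar input features are discarded. Finally, sequential forward search (SFS) is applied as the third processing step: the features retained by the first two stages are fed in one at a time, and only those that improve system performance are kept, which yields the final feature set. After the model was built, its performance was evaluated on real SCADA data from a wind farm. The proposed method was compared with two baseline feature selection algorithms on six datasets, and the results show that LightGBM-VIF-MIC-SFS has a significant advantage over both. Ablation experiments on the three internal modules verify the effectiveness of each module, as well as the rationality and accuracy of the optimal feature space obtained with the proposed method.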
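The second and third stages described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the importance scores are taken as already computed (for example by a trained LightGBM model), a pairwise VIF derived from the Pearson correlation stands in for the paper's combined VIF and MIC discrimination matrix, the threshold `vif_limit` is arbitrary, `toy_score` is a placeholder for the diagnostic accuracy of a real classifier, and all feature names and values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def pairwise_vif(x, y):
    """VIF of one feature regressed on a single other feature: 1 / (1 - r^2)."""
    r2 = pearson(x, y) ** 2
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


def drop_redundant(data, importance, vif_limit=10.0):
    """Stage 2 stand-in: visit features from most to least important and
    skip any feature whose pairwise VIF with an already kept one is too high."""
    kept = []
    for name in sorted(importance, key=importance.get, reverse=True):
        if all(pairwise_vif(data[name], data[k]) < vif_limit for k in kept):
            kept.append(name)
    return kept


def forward_select(candidates, score):
    """Stage 3: add candidates one at a time; keep one only if the score rises."""
    selected, best = [], float("-inf")
    for name in candidates:
        trial = score(selected + [name])
        if trial > best:
            selected.append(name)
            best = trial
    return selected


# Hypothetical SCADA-like columns; bearing_temp is almost a copy of gearbox_temp.
data = {
    "gearbox_temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bearing_temp": [1.1, 2.0, 3.1, 4.0, 5.1],
    "wind_speed": [2.0, 1.0, 4.0, 3.0, 5.0],
    "nacelle_temp": [3.0, 1.0, 2.0, 5.0, 4.0],
}
# Assumed stage 1 output (importance scores from a tree model).
importance = {"gearbox_temp": 0.50, "bearing_temp": 0.30,
              "wind_speed": 0.15, "nacelle_temp": 0.05}

# Placeholder for classifier accuracy: each feature adds a fixed gain.
toy_gain = {"gearbox_temp": 0.80, "wind_speed": 0.10, "nacelle_temp": -0.02}


def toy_score(features):
    return sum(toy_gain[f] for f in features)


kept = drop_redundant(data, importance)
final = forward_select(kept, toy_score)
print(kept)
print(final)
```

In this toy run the redundancy filter drops `bearing_temp` because it is nearly collinear with the more important `gearbox_temp`, and the forward search then rejects `nacelle_temp` because adding it lowers the placeholder score. In practice the score function would retrain and validate a classifier on each candidate subset.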
Keywords: wind turbine; feature selection; LightGBM; variance inflation factor; maximal information coefficient; sequential forward search
Foundation: Central Guidance on Local Science and Technology Development Fund of Hebei Province (226Z2103G)
DOI: 10.19666/j.rlfd.202306123