TY - JOUR
T1 - Assessing the Impact of Features on Probabilistic Modeling of Photovoltaic Power Generation
AU - Yamamoto, Hiroki
AU - Kondoh, Junji
AU - Kodaira, Daisuke
N1 - Publisher Copyright: © 2022 by the authors.
PY - 2022/8
Y1 - 2022/8
N2 - Photovoltaic power generation is highly variable and uncertain because it depends on uncertain factors such as weather conditions. Probabilistic forecasting is therefore useful for optimal operation and risk hedging in power systems with large amounts of photovoltaic generation. However, deterministic forecasting remains the mainstay of photovoltaic generation forecasting; few studies address probabilistic forecasting or the selection of weather and time-oriented features for it. In this study, prediction intervals were generated by lower upper bound estimation (LUBE), using neural networks with two outputs, to build a probabilistic model. The objective was to improve the prediction interval coverage probability (PICP), mean prediction interval width (MPIW), continuous ranked probability score (CRPS), and loss, which combines PICP and MPIW, by removing unnecessary features through feature selection. When features with high gain were selected by random forest (RF) in modeling 14.7 kW PV systems, loss improved by 1.57 kW, CRPS by 0.03 kW, PICP by 0.057, and MPIW by 0.12 kW on average over two weeks, compared with using all features without feature selection. These results indicate that the features to which RF assigns low gain act as noise and reduce modeling accuracy.
AB - Photovoltaic power generation is highly variable and uncertain because it depends on uncertain factors such as weather conditions. Probabilistic forecasting is therefore useful for optimal operation and risk hedging in power systems with large amounts of photovoltaic generation. However, deterministic forecasting remains the mainstay of photovoltaic generation forecasting; few studies address probabilistic forecasting or the selection of weather and time-oriented features for it. In this study, prediction intervals were generated by lower upper bound estimation (LUBE), using neural networks with two outputs, to build a probabilistic model. The objective was to improve the prediction interval coverage probability (PICP), mean prediction interval width (MPIW), continuous ranked probability score (CRPS), and loss, which combines PICP and MPIW, by removing unnecessary features through feature selection. When features with high gain were selected by random forest (RF) in modeling 14.7 kW PV systems, loss improved by 1.57 kW, CRPS by 0.03 kW, PICP by 0.057, and MPIW by 0.12 kW on average over two weeks, compared with using all features without feature selection. These results indicate that the features to which RF assigns low gain act as noise and reduce modeling accuracy.
KW - feature selection
KW - lower upper bound estimation
KW - photovoltaic generation forecasting
KW - probabilistic forecasting
KW - random forest
UR - https://www.scopus.com/pages/publications/85137002568
U2 - 10.3390/en15155337
DO - 10.3390/en15155337
M3 - Article
AN - SCOPUS:85137002568
SN - 1996-1073
VL - 15
JO - Energies
JF - Energies
IS - 15
M1 - 5337
ER -