Wind power has recently started gaining an important role in the energy mix. Several utility
companies have switched to wind power to meet most of their demand. However, there is one
big problem associated with wind power: intermittency. As a result, utility companies have to
maintain backup power, which is generated using coal. This backup makes the power supply both
more expensive and more polluting. To maintain an optimal level of backup power, utility
companies need accurate wind power forecasts. This is the problem that we address.
Description of Data
The dataset was obtained from kaggle.com. It contains hourly wind power forecasts (normalized)
from July 1, 2009 to July 31, 2012, and has more than 100,000 rows. In
addition to wind power, we are also given a forecast for wind speed and wind direction by the
meteorological department. We expected seasonality in the data, but when we plotted the series
by month and by hour of day, we found no seasonality (Appendix: Fig 1). Next, we plotted the
data for different months on an hour-of-day basis to check for daily seasonality, and again
found none (Appendix: Fig 2). The last time series component that we checked for was trend. We
found that the trend was not a typical linear or exponential one, but a combination of several
sinusoids of different frequencies (Appendix: Fig 3).
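The seasonality check described above can also be done numerically. The sketch below is only an illustration: it uses a synthetic series in place of the Kaggle data, and the helper names (group_means, seasonal_strength) are hypothetical, not part of our analysis. It averages the series within each hour of day and each month, then compares the spread of those averages to the overall spread. A ratio near 0 suggests no seasonal pattern, and a ratio near 1 suggests a strong one.

```python
import math
import random
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def group_means(timestamps, values, key):
    """Average the values within each group defined by key(timestamp)."""
    groups = defaultdict(list)
    for ts, v in zip(timestamps, values):
        groups[key(ts)].append(v)
    return {k: mean(vs) for k, vs in sorted(groups.items())}


def seasonal_strength(timestamps, values, key):
    """Spread of the group means relative to the overall spread."""
    means = list(group_means(timestamps, values, key).values())
    return pstdev(means) / pstdev(values)


# Synthetic stand-in for the normalized hourly series (NOT the Kaggle data).
random.seed(0)
start = datetime(2009, 7, 1)
timestamps = [start + timedelta(hours=h) for h in range(24 * 365)]
wind_power = [random.random() for _ in timestamps]

by_hour = seasonal_strength(timestamps, wind_power, key=lambda ts: ts.hour)
by_month = seasonal_strength(timestamps, wind_power, key=lambda ts: ts.month)

# For contrast: a series with a pure daily cycle, which should score close to 1.
daily_cycle = [math.sin(2 * math.pi * ts.hour / 24) for ts in timestamps]
cycle_by_hour = seasonal_strength(timestamps, daily_cycle, key=lambda ts: ts.hour)

print(f"random series: hour-of-day {by_hour:.3f}, month {by_month:.3f}")
print(f"daily-cycle series: hour-of-day {cycle_by_hour:.3f}")
```

A numeric ratio like this complements the plots in the Appendix but does not replace them, since a pattern that only shows up in some months can be averaged away.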
Conclusion & Recommendations
We found the following model to be optimal: Multiple Linear Regression (MLR) with average wind
speed and direction as predictors, followed by a second MLR fitted to the residuals with Lag 24
and Lag 48 as predictors. We used RMSE to compare different models (reason provided later).
Based on the results, we found that our model captures the trend of the data well. If we
provide prediction intervals (say 95%), our clients can choose from a range of values, and with
experience they can get better results using the model.
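A minimal sketch of the two-stage structure described above, under stated assumptions: the data are synthetic (not the Kaggle series), "Lag 24" and "Lag 48" are taken to mean the first-stage residuals 24 and 48 hours earlier, and the 95% interval uses a simple normal approximation (plus or minus 1.96 times RMSE). The helper names fit_ols, predict and rmse are hypothetical. A real analysis would more likely use a regression library than hand-written normal equations.

```python
import math
import random


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def predict(coefs, rows):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]


def rmse(actual, fitted):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))


# Synthetic data (assumption): power depends linearly on speed and direction,
# plus an error term that is correlated with itself 24 and 48 hours earlier.
random.seed(1)
n = 24 * 60
speed = [random.uniform(0, 15) for _ in range(n)]
direction = [random.uniform(0, 360) for _ in range(n)]
err = [0.0] * n
for t in range(n):
    shock = random.gauss(0, 0.05)
    err[t] = shock if t < 48 else 0.5 * err[t - 24] + 0.3 * err[t - 48] + shock
power = [0.1 + 0.04 * s + 0.0002 * d + e for s, d, e in zip(speed, direction, err)]

# Stage 1: MLR of power on wind speed and direction.
X1 = list(zip(speed, direction))
stage1 = fit_ols(X1, power)
fit1 = predict(stage1, X1)
resid = [p - f for p, f in zip(power, fit1)]

# Stage 2: MLR of the stage-1 residuals on their own Lag 24 and Lag 48 values.
idx = range(48, n)
X2 = [(resid[t - 24], resid[t - 48]) for t in idx]
stage2 = fit_ols(X2, [resid[t] for t in idx])
fit2 = [fit1[t] + r for t, r in zip(idx, predict(stage2, X2))]

# Compare both stages with RMSE on the same rows.
actual = [power[t] for t in idx]
rmse_stage1 = rmse(actual, [fit1[t] for t in idx])
rmse_two_stage = rmse(actual, fit2)

# Approximate 95% interval, assuming roughly normal, constant-variance errors.
half_width = 1.96 * rmse_two_stage
coverage = sum(abs(a - f) <= half_width for a, f in zip(actual, fit2)) / len(actual)

print(f"RMSE stage 1: {rmse_stage1:.4f}, two-stage: {rmse_two_stage:.4f}")
print(f"share of points inside the approximate 95% interval: {coverage:.3f}")
```

The RMSE values here are computed in-sample on synthetic data, so they only illustrate the mechanics. Comparing models fairly would require a held-out period.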