A novel encoding technique to improve non-weather-based models for solar photovoltaic forecasting
(1) American International School of Dhaka, Dhaka, Bangladesh, (2) Duke University, Durham, North Carolina, (3) Oxford University, Oxford, England, United Kingdom
https://doi.org/10.59720/22-087
Several studies have applied machine learning (ML) techniques to forecasting solar photovoltaic power production. Most of these studies use weather data as inputs to predict power production; however, procuring this data raises numerous practical issues. These include high procurement costs and the lack of a backup technique if communication with weather data services fails. These practical issues are not yet widely considered in the literature. This study proposes models that do not use weather data as inputs, but instead use past power production data as a more practical substitute for weather-based models. Similar studies have shown satisfactory accuracy, but this study proposes a novel data preprocessing technique, cyclical features encoding, that we hypothesized would boost model accuracy significantly. We used ML techniques to predict power production over a 24-hour horizon, using the past 48 hours of power production as input data. The Random Forest model offered the best results, with a Pearson Correlation Coefficient of 0.97 (11% higher than previous studies), Mean Absolute Error of 0.0266 (60% better than previous studies), and Root Mean Squared Error of 0.0773 (38% better than previous studies). These results are comparable to those of state-of-the-art weather-based models in the field. Our proposed models demonstrate a better, cheaper, and more reliable alternative to current weather-based models.
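As an illustration of the general idea behind cyclical features encoding, the sketch below maps a periodic quantity onto the unit circle with a sine and cosine pair. The abstract does not state which features were encoded or how, so the choice of hour of day with a period of 24, and the helper names `encode_cyclical` and `circle_distance`, are assumptions made for this example rather than the study's actual preprocessing.

```python
import math


def encode_cyclical(value, period):
    """Map a periodic value (assumed here: hour of day, 0-23) to (sin, cos)."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b, period):
    """Euclidean distance between two values after cyclical encoding."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# Hour 23 and hour 0 are adjacent in time. As raw integers they differ by 23,
# but after encoding they sit next to each other on the circle.
near = circle_distance(23, 0, 24)
# Hour 12 and hour 0 are half a day apart, the farthest two hours can be.
far = circle_distance(12, 0, 24)
```

With this representation, a model sees late-night and early-morning samples as neighbours, which a single integer hour column cannot express. In a pipeline like the one described, the two encoded columns would typically be appended to the lagged power production inputs before training.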