In this thesis, we sought to improve local temperature prediction with machine learning. We built three forecasting models that take as input the predictions of the global numerical weather prediction model ECMWF together with current atmospheric measurements at a specific location.
Since numerical models are limited by their spatial resolution and only predict at discrete grid points, we took the four grid points of the ECMWF model closest to the location as attributes and combined them with measurements of different weather variables from a home weather station. We predicted hourly temperature up to 72 hours in advance using three machine learning models: ridge regression, random forests and XGBoost. We tested the models on 72 datasets, one per forecast horizon, where each dataset was constructed from the same attributes but a different target variable corresponding to the length of the forecast. Finally, the performance of the algorithms was evaluated using three regression metrics together with the Friedman and Nemenyi statistical tests. We concluded that all three models improved the temperature prediction compared to the ECMWF model, with XGBoost being the best-performing model.
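The construction of the 72 datasets can be illustrated with a minimal sketch. It assumes that the attributes and the measured temperature are stored as hourly, time-aligned sequences and that the target for horizon h is the temperature measured h hours after the attribute row. The function name `build_horizon_datasets` and its parameters are hypothetical and are not taken from the thesis implementation.

```python
def build_horizon_datasets(features, temperature, max_horizon=72):
    """Return {h: (X_h, y_h)} for h = 1..max_horizon.

    Assumption (not from the thesis): `features` and `temperature` are
    hourly sequences of equal length, and the target for horizon h is
    the temperature observed h hours after the attribute row.
    """
    if len(features) != len(temperature):
        raise ValueError("features and temperature must have the same length")

    n = len(temperature)
    datasets = {}
    for h in range(1, max_horizon + 1):
        # Rows whose target would lie beyond the end of the series are dropped.
        usable = max(n - h, 0)
        X_h = features[:usable]   # same attributes for every horizon
        y_h = temperature[h:]     # target shifted h hours ahead
        datasets[h] = (X_h, y_h)
    return datasets


# Toy data: 100 hourly rows with two made-up attributes each.
features = [[float(i), float(i) + 0.5] for i in range(100)]
temperature = [float(i) for i in range(100)]
datasets = build_horizon_datasets(features, temperature)
```

Each of the resulting pairs shares the attribute columns and differs only in the shifted target, so any regression model can be fitted separately per horizon.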