Air temperature is one of the central variables in meteorology, and reliable forecasts of it bring many benefits. As the climate changes, global temperatures are rising and heat waves are becoming more frequent, with major consequences for agricultural production, urban populations, and many other areas. For use cases such as these, a precise air temperature forecast is of key importance.
Comparing different model approaches (2017)
This study compares the accuracy of different model approaches: forecast data from 5 different raw weather models, forecast data calculated by the meteoblue Learning MultiModel (mLM) and by model output statistics (MOS), and the reanalysis model ERA5. These forecasts were verified against hourly measurements from more than 11,000 observation sites worldwide. The analysis covers 2017 and, in part, September to October 2018.
|Model approach|MAE [K]|
|meteoblue Learning MultiModel (mLM)|1.2|
|Model output statistics (MOS)|1.5|
|Reanalysis model ERA5|1.5|
|5 different weather forecast models (RAW)|1.7 - 2.2|
The table above shows the mean absolute error (MAE) in kelvin for the different model approaches. The mLM approach has the lowest MAE (1.2 K), followed by MOS and ERA5 (1.5 K each). All three approaches perform better than the raw models (1.7 to 2.2 K).
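For reference, the MAE is commonly defined as shown below. The symbols are notation assumed here rather than taken from the study: f_i is a forecast value, o_i the matching observation, and N the number of forecast-observation pairs. The study does not state how errors were pooled across stations, so this is the generic definition only.

$$
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| f_i - o_i \right|
$$

Because a temperature difference has the same size in kelvin and in degrees Celsius, an MAE of 1.2 K is also an MAE of 1.2 °C.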
Comparing different global models (2018)
An additional verification study evaluating the performance of different global models was conducted in a separate bachelor thesis (Fessler, 2019). 24-hour forecasts of 2 m air temperature for the year 2018 from the forecast models NEMS, GFS05, MFGLOBAL, GEM and ICON were compared with hourly measurements from more than 8000 stations distributed worldwide, drawn from the WMO (World Meteorological Organization) and from GDAS (Global Data Assimilation System) of NOAA (National Oceanic and Atmospheric Administration). Historical data of the reanalysis model ERA5 were also included. Several statistical error metrics were calculated and compared, and the global patterns of accuracy variability were examined.
Raw model comparison
|Model|MAE [K]|MBE [K]|
A simple error comparison makes it possible to evaluate the performance of the different models against each other. The findings (see table) confirm the results of the previous study. The reanalysis model ERA5, with an MAE of 1.5 K, outperforms all raw models examined in this study, followed by ICON and NEMSGLOBAL. ERA5, NEMSGLOBAL and GFS05 tend to overestimate the temperature, while MFGLOBAL, GEM and ICON tend to underestimate it relative to the hourly measurements.
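The two metrics in this comparison can be sketched in a few lines of Python. This is a minimal illustration, not the thesis code: the function name `mae_and_mbe` and the sample values are hypothetical, and the sign convention (forecast minus observation) is an assumption. Under that convention a positive mean bias error (MBE) means the model predicts temperatures that are too high, and a negative MBE means they are too low.

```python
import math


def mae_and_mbe(forecast, observed):
    """Return (MAE, MBE) for paired values; pairs containing NaN are skipped.

    Assumed sign convention: error = forecast - observed.
    """
    errors = [
        f - o
        for f, o in zip(forecast, observed)
        if not (math.isnan(f) or math.isnan(o))
    ]
    if not errors:
        raise ValueError("no valid forecast/observation pairs")
    mae = sum(abs(e) for e in errors) / len(errors)  # average error magnitude
    mbe = sum(errors) / len(errors)                  # average signed error (bias)
    return mae, mbe


# Hypothetical hourly 2 m temperatures in °C (illustrative, not study data)
observed = [10.0, 12.0, 15.0, 14.0]
forecast = [11.0, 12.5, 14.0, 15.5]
mae, mbe = mae_and_mbe(forecast, observed)
print(f"MAE = {mae:.2f} K, MBE = {mbe:+.2f} K")
```

The MBE can be close to zero even when the MAE is large, because positive and negative errors cancel in the bias but not in the absolute error. This is why the two are reported side by side.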
Across all weather stations, ERA5 performs the best and has the lowest MAE (see figure below).
The spatial distribution of the MAE for the reanalysis model ERA5 and for NEMS is visualised in the following world maps. To prevent plot points from different stations from overlapping, the globe was divided into grid cells with a horizontal resolution of 2°. In other words, the MAE values of all stations within a grid cell were first merged and then plotted at the centre of that cell. The overall spatial distribution of the error is comparable for both models; note, however, that NEMS has a shifted error range. In general, higher errors can be identified in the Rocky Mountains, India, China, and Eastern Russia. Good performance can be observed in northern Europe, North America, Australia, Western Russia, and Africa.
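The gridding step described above could look roughly like the following sketch. Several details are assumptions, since the text does not specify them: the station MAE values are "merged" here by taking their arithmetic mean, the grid is aligned to start at 90° S and 180° W, and the function name `grid_cell_means` and the sample stations are hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE_DEG = 2.0  # horizontal resolution stated in the text


def grid_cell_means(stations, cell_size=CELL_SIZE_DEG):
    """Average station MAE per grid cell.

    stations: iterable of (lat, lon, mae) tuples, lat in [-90, 90], lon in [-180, 180].
    Returns {(centre_lat, centre_lon): mean MAE of the stations in that cell}.
    """
    n_rows = int(180.0 / cell_size)
    n_cols = int(360.0 / cell_size)
    sums = defaultdict(lambda: [0.0, 0])  # (row, col) -> [sum of MAE, station count]
    for lat, lon, mae in stations:
        # Clamp so that lat = 90 or lon = 180 falls into the last cell
        row = min(math.floor((lat + 90.0) / cell_size), n_rows - 1)
        col = min(math.floor((lon + 180.0) / cell_size), n_cols - 1)
        acc = sums[(row, col)]
        acc[0] += mae
        acc[1] += 1
    return {
        (-90.0 + (row + 0.5) * cell_size, -180.0 + (col + 0.5) * cell_size): total / count
        for (row, col), (total, count) in sums.items()
    }


# Hypothetical stations: (latitude, longitude, MAE in K)
stations = [(47.5, 7.6, 1.2), (46.9, 7.4, 1.8), (51.5, -0.1, 1.0)]
cells = grid_cell_means(stations)
for (lat_c, lon_c), value in sorted(cells.items()):
    print(f"cell centre ({lat_c:.1f}, {lon_c:.1f}): MAE = {value:.2f} K")
```

In this example the first two stations fall into the same 2° cell and are plotted as a single point at the cell centre, while the third station gets its own cell.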
The air temperature forecast has the highest accuracy on small oceanic islands and along ice-free coasts. In these regions, the air temperature is strongly influenced by the sea surface temperature. High accuracy and predictability over Europe and North America can be explained by the fact that weather forecast models were developed in these regions. Another fact worth mentioning (and not covered in the studies) is that air temperature is typically simulated worse in the Northern Hemisphere winter than in summer. Furthermore, the results show that accuracy decreases in regions with complex topography, such as the Rocky Mountains, the Himalayas, or the Andes, and with increasing distance from the sea. Continental and high-elevation regions are therefore typically simulated worse than maritime and low-elevation regions.
Provider comparison (2021)
|Provider|MAE [°C]|MBE [°C]|