Verification studies

Precipitation forecasts from numerical weather prediction models are of growing economic importance for applications such as irrigation, agriculture, hydropower generation and flood warning.

These numerical weather prediction models have been continuously improved over the last decades. In our verification studies, we analyse the precipitation performance of several raw numerical weather prediction models and of the meteoblue MultiModel forecast. Model results were compared with precipitation measurements from up to 10’000 meteorological stations around the world. In addition, we conducted a competition analysis in which we compared our forecast accuracy with that of other weather providers.

meteoblue conducts these extensive long-term verification studies in order to understand the quality of the precipitation data produced by its own models as well as by various third-party operators, compared against several thousand precipitation measurement stations.

Throughout the studies, we focus on the historical annual precipitation at the weather stations, which is of highest priority for customers, e.g. in the agricultural sector. For the forecast, correctly capturing single events (e.g. daily precipitation > 1 mm) is more important than reproducing the exact annual precipitation sums.

In all verification studies, plausibility tests were conducted to ensure the quality of the measurements. Stations with more than 30% of their data missing were excluded from the analysis to ensure robust results. Additionally, quality control was applied to the measurements to exclude erroneous values.
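These screening steps can be illustrated with a short sketch. Only the 30% gap criterion comes from the study description; the function name and the 500 mm/day plausibility threshold are illustrative assumptions, not the actual meteoblue implementation:

```python
import numpy as np
import pandas as pd

def filter_stations(daily_precip: pd.DataFrame,
                    max_gap_fraction: float = 0.3,
                    max_daily_mm: float = 500.0) -> pd.DataFrame:
    """Keep only stations with sufficient coverage and plausible values.

    daily_precip: one column per station, one row per day (NaN = gap).
    The 500 mm/day threshold is an assumed example, not meteoblue's value.
    """
    # Exclude stations with more than 30% missing days
    gap_fraction = daily_precip.isna().mean()
    kept = daily_precip.loc[:, gap_fraction <= max_gap_fraction]
    # Simple plausibility test: mask negative or implausibly large daily sums
    kept = kept.mask((kept < 0) | (kept > max_daily_mm))
    return kept
```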

meteoblue has published several studies focusing on precipitation verification within the last five years. The main findings are summarised below:

  • Results may vary between studies due to different numbers of stations, different analysed time periods or different measurement data providers.
  • A finer horizontal model resolution does not improve the model performance (compare NEMSGLOBAL with its 30 km grid resolution to NEMS12 (12 km) and NEMS4 (4 km)).
  • The error metrics are very similar when comparing models across different years for many stations (compare the verification studies of 2017 and 2019).
  • Error metrics can vary significantly between continents or even smaller regions (see the differences in the HSS between North America and the entire world in the competition analysis).
  • The meteoblue precipitation forecast is as accurate as the reanalysis model ERA5.
  • Post-processing methods used for the meteoblue forecast, such as multimodel mixing or the mLM (meteoblue Learning MultiModel), improve the precipitation model performance.
  • We recommend the meteoblue MultiModel for the operational forecast, because daily precipitation events as well as annual precipitation sums are satisfactorily reproduced.
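As a rough illustration of the multimodel idea, the following sketch combines several model forecasts into a skill-weighted average. This is a minimal assumption-based example; the actual meteoblue multimodel mixing and the mLM are more sophisticated, and all names here are our own:

```python
import numpy as np

def multimodel_mix(forecasts: dict, weights: dict) -> np.ndarray:
    """Combine several model forecasts into one weighted average.

    forecasts: model name -> array of precipitation forecasts [mm]
    weights:   model name -> weight, e.g. derived from past skill
               (illustrative only; not the actual meteoblue method)
    """
    total = sum(weights[name] for name in forecasts)
    mix = sum(weights[name] * np.asarray(forecasts[name], dtype=float)
              for name in forecasts)
    return mix / total
```

Weights could, for example, be set inversely proportional to each model's past MAE, so that historically more accurate models dominate the mix.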

Summary of HSS (Heidke Skill Score), MAE (mean absolute error) and MBE (mean bias error) for four different studies
| Study | Number of stations | Model domain | Time period | Measurements | Model | HSS (daily precipitation > 1 mm) | MAE [mm] | MBE [mm] |
|---|---|---|---|---|---|---|---|---|
| Verification study 2017 | 6505 | worldwide | 2017 | METAR | ERA5 | 0.45 | 175 | -7 |
| | | | | | NEMSGLOBAL | 0.42 | 228 | -100 |
| | | | | | meteoblue forecast | 0.47 | 161 | -8 |
| | | | | | GFS | 0.42 | 235 | +62 |
| | | | | | CHIRPS2 | 0.30 | 120 | +33 |
| Verification study 2019 | 8112 | worldwide | 2019 | METAR | ERA5 | 0.45 | 234 | -6 |
| | | | | | GFS | 0.43 | 270 | +84 |
| | | | | | NEMSGLOBAL | 0.41 | 296 | -144 |
| | | | | | ICON | 0.46 | 253 | -51 |
| Competition analysis | 100 | North America | Jan-Jul 2021 | METAR | meteoblue forecast | 0.61 | 42 | +33 |
| | 500 | worldwide | | GSOD | meteoblue forecast | 0.44 | 135 | +7 |
| NEMS verification | 1605 | Europe | Jan-Jul 2021 | GSOD | NEMS4 | 0.41 | 119 | -64 |
| | | | | | NEMS12 | 0.41 | 85 | -36 |
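The three scores in the table have standard definitions and can be computed as in the following numpy-based sketch (the function names are our own; the HSS uses the usual 2×2 contingency table for the event "daily precipitation > 1 mm"):

```python
import numpy as np

def heidke_skill_score(forecast_mm, observed_mm, threshold=1.0):
    """HSS for the binary event 'daily precipitation > threshold'."""
    f = np.asarray(forecast_mm) > threshold
    o = np.asarray(observed_mm) > threshold
    a = np.sum(f & o)      # hits
    b = np.sum(f & ~o)     # false alarms
    c = np.sum(~f & o)     # misses
    d = np.sum(~f & ~o)    # correct negatives
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    return 2.0 * (a * d - b * c) / denom  # 1 = perfect, 0 = no skill

def mae(forecast, observed):
    """Mean absolute error, here of annual precipitation sums [mm]."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(f - o)))

def mbe(forecast, observed):
    """Mean bias error: positive = model too wet, negative = too dry."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean(f - o))
```

A positive MBE thus indicates that a model systematically overestimates precipitation (e.g. GFS in 2017), while a negative MBE indicates underestimation (e.g. NEMSGLOBAL).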

Details about the individual studies can be found in the following two sections:

Daily Precipitation Events

Find out more about our verification studies based on precipitation events.

Yearly Precipitation Sums

Find out more about our verification studies based on precipitation amount.