AIREN-NWP Revolutionizes Short-Term Weather Forecasting
Is there a high variance in weather forecasts for today? Traditional numerical weather prediction (NWP) models are limited in their ability to integrate new observations. Enter AIREN-NWP, AI-powered NWP post-processing that fuses forecasts with real-time measurements for higher resolution and more accurate weather predictions up to 6 hours ahead. We have observed up to 2.5x improvement in the initial forecasts for precipitation, temperature, and wind gusts. Get ready for forecasts that react fast, stay sharp, and keep you ahead of the weather.
Introduction
Remember the chaos of last summer's flash floods? Or the unpredictable gusts that wrought havoc during the fall festival? These are just a few examples of severe weather events where quick decision-making is crucial to ensuring the safety of people and property. However, the high unpredictability of these types of weather leaves traditional forecasts inadequate. As these forecasts give only the general likelihood of risks in a larger region, issuing actionable warnings based on them is not viable. At Meteopress, we have already developed AIREN-Nowcasting for more granular predictions, which infers real-time, high-resolution, accurate, and specific nowcasts. Still, a longer prediction horizon may be necessary to inform the interested parties sufficiently, and variables other than precipitation can also be important.
Enter AIREN-NWP, a novel solution combining the strengths of rapidly-updated AI nowcasts with meteorological expertise and robustness of NWP. It uses data like synoptic scale meteorological station measurements (from here on referred to as SYNOP) as well as radar and satellite imagery to post-process NWP forecasts, delivering:
- Accuracy - Enhancing accuracy by correcting biases in NWP forecasts and integrating the latest real-time data from diverse sources.
- Resolution - Providing more detailed predictions, like a 1-hour time step improved from the 3-hour step of the input GFS data.
- Updates - Computing new, more accurate predictions as often as new real-world data becomes available.
AIREN-NWP can be tailored to use any NWP model and any combination of relevant input data, empowering the recipients of its predictions to navigate even the most unpredictable weather scenarios with greater confidence and readiness. In the following sections, we will delve deeper into the technical details of AIREN-NWP's architecture and data sources used in this study, shedding light on the mechanisms behind these advancements and providing detailed performance analysis.
Problem Background
While NWP models offer valuable insights into long-term weather patterns, their limitations become particularly evident in highly unpredictable scenarios. High computational power requirements restrict the forecasts' spatial and temporal resolution and limit the frequency of model updates. Moreover, the first hours of the forecast suffer from the "spin-up problem": accuracy is reduced in the initial period until the inconsistency between the meteorological inputs and the numerical model tapers out. Thus, it is challenging for NWP models to utilize all available data effectively, potentially leaving us guessing about the weather development in the hours immediately ahead.
On the opposite end of the spectrum lie AI-powered solutions like AIREN-Nowcasting. They excel at short-term, high-resolution predictions by leveraging real-time radar data. While the highly accurate and granular radar data allows AIREN-Nowcasting to deliver precipitation forecasts for the next 90 minutes with unprecedented accuracy, this approach has drawbacks, too. Relying solely on extrapolation from a single data source leads to a deterioration of prediction performance beyond the near time horizon, roughly 2 hours. These forecasts are great for immediate decision-making (e.g., taking shelter during storm events or choosing the right time to leave work to avoid an unwanted shower). However, the 90-minute prediction horizon can be limiting. Additionally, the focus on radar data restricts the scope of AIREN-Nowcasting to precipitation, excluding other important weather variables like temperature and wind gusts.
Data and Solution
As discussed in the previous section, both NWP models and purely AI-based nowcasting approaches have limitations that hinder their ability to provide comprehensive and reliable weather forecasts. AIREN-NWP addresses these limitations by combining the strengths of both approaches, leveraging a recurrent neural network to enhance the original NWP forecasts. The network architecture is an extension of the architecture developed by DataLab at the Faculty of Information Technology, CTU in Prague, during a joint project. In it, the real-time measurements are used to initialize the hidden state of the recurrent cell, priming it for step-by-step post-processing of input forecasts and metadata. The objective is to post-process the NWP forecasts to match the ground truth weather reanalysis as closely as possible. With the dataset spanning from October 2015 to December 2022, we reserve two full years (2019 and 2022) for validation, ensuring robust and fair performance evaluation.
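The data flow described above can be illustrated with a minimal sketch. It is not the AIREN-NWP architecture: the plain tanh cell, the random weights, the feature sizes, and the names `init_hidden` and `postprocess` are all assumptions chosen for illustration. It only shows the idea of initializing the hidden state from real-time measurements and then post-processing the NWP forecast one step at a time together with metadata.

```python
import numpy as np

def init_hidden(measurements, w_init):
    """Map (already encoded) real-time measurements to the initial hidden state."""
    return np.tanh(measurements @ w_init)

def postprocess(nwp_steps, metadata, h0, w_in, w_h, w_out):
    """Run one recurrent step per output hour and return the corrected forecasts."""
    h = h0
    outputs = []
    for x, m in zip(nwp_steps, metadata):
        step_input = np.concatenate([x, m])       # NWP features + metadata
        h = np.tanh(step_input @ w_in + h @ w_h)  # hypothetical plain tanh cell
        outputs.append(h @ w_out)                 # corrected value for this hour
    return np.stack(outputs)

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
n_meas, n_hidden, n_nwp, n_meta, n_steps = 5, 8, 3, 1, 6
w_init = rng.normal(size=(n_meas, n_hidden))
w_in = rng.normal(size=(n_nwp + n_meta, n_hidden))
w_h = rng.normal(size=(n_hidden, n_hidden))
w_out = rng.normal(size=(n_hidden, 1))

measurements = rng.normal(size=n_meas)            # stands in for SYNOP/radar/satellite features
nwp_steps = rng.normal(size=(n_steps, n_nwp))     # stands in for GFS inputs per step
metadata = np.arange(1, n_steps + 1, dtype=float).reshape(-1, 1) / n_steps

h0 = init_hidden(measurements, w_init)
corrected = postprocess(nwp_steps, metadata, h0, w_in, w_h, w_out)
print(corrected.shape)  # (6, 1)
```

In a trained model, the weights would be fitted so that the outputs match the ERA5 reanalysis as closely as possible.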
As the starting point, AIREN-NWP utilizes Global Forecast System (GFS) forecasts for the Central Europe area. These forecasts are issued four times daily at six-hour intervals, with a 0.25-degree spatial and 3-hour temporal resolution. ERA5 reanalysis serves as the ground truth, matching the spatial resolution but providing hourly data. To achieve this 1-h output resolution, for each recurrent step, we combine the next nearest 3-hour GFS forecast with metadata reflecting how far into the future the current prediction is. We currently focus on three target variables and predict them separately - accumulated precipitation (APCP), temperature 2 meters above the surface (TMP), and velocity of wind gusts (GUST).
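The pairing of each 1-hour recurrent step with a 3-hour GFS forecast can be sketched as follows. The reading of "next nearest" as the first GFS step at or after the target hour, the normalized lead-time metadata, and the function name `recurrent_inputs` are assumptions made for this example, not details confirmed by the text.

```python
import math

GFS_STEP_H = 3  # temporal resolution of the input GFS forecast (from the text)

def recurrent_inputs(horizon_h):
    """Return (lead_hour, gfs_lead_hour, normalized_lead) for each 1-h step."""
    steps = []
    for lead in range(1, horizon_h + 1):
        # Assumed reading of "next nearest": first GFS step at or after the lead.
        gfs_lead = math.ceil(lead / GFS_STEP_H) * GFS_STEP_H
        # Hypothetical metadata: how far into the future this step lies.
        steps.append((lead, gfs_lead, lead / horizon_h))
    return steps

for lead, gfs_lead, meta in recurrent_inputs(6):
    print(lead, gfs_lead, round(meta, 2))
```

Under these assumptions, hours 1 to 3 are paired with the +3 h GFS field and hours 4 to 6 with the +6 h field.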
The first of the three real-time data sources that initialize the recurrent cell is SYNOP station measurements. They are available in 1-hour intervals, and the three most recent ones are used. SYNOP measurements, originally local point data, are interpolated to a more continuous raster-like representation. The measured variables are air pressure, humidity, temperature, wind speed, and precipitation. It should be noted that the SYNOP stations, even within the Central European area alone, report at somewhat arbitrary intervals and with locally specific availability. Yet, AIREN-NWP demonstrates high robustness even in the case of random outages of individual stations.
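The text does not say which interpolation method turns station points into a raster. As one possible illustration, the sketch below uses inverse-distance weighting (IDW), which is an assumption, and skips stations that report no value to mimic an outage. The function name `idw_raster`, the station coordinates, and the values are made up for the example.

```python
import numpy as np

def idw_raster(lats, lons, values, grid_lat, grid_lon, power=2.0, eps=1e-6):
    """Interpolate station values to a lat/lon grid, skipping missing stations."""
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    values = np.asarray(values, dtype=float)
    ok = ~np.isnan(values)                       # drop stations with outages
    if not ok.any():
        return np.full((len(grid_lat), len(grid_lon)), np.nan)
    glat, glon = np.meshgrid(grid_lat, grid_lon, indexing="ij")
    # Squared distance (in degrees) between every grid cell and every station.
    d2 = (glat[..., None] - lats[ok]) ** 2 + (glon[..., None] - lons[ok]) ** 2
    w = 1.0 / (d2 ** (power / 2) + eps)
    return (w * values[ok]).sum(axis=-1) / w.sum(axis=-1)

# Three hypothetical stations; the second one is missing (NaN).
grid_lat = np.arange(48.0, 50.25, 0.25)   # 0.25-degree grid, as for GFS
grid_lon = np.arange(14.0, 17.25, 0.25)
raster = idw_raster([50.0, 49.0, 48.0], [14.0, 16.0, 17.0],
                    [10.0, np.nan, 14.0], grid_lat, grid_lon)
print(raster.shape)  # (9, 13)
```

Because missing stations are simply dropped from the weighted average, the raster stays fully populated as long as at least one station reports.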
Radar reflectivity data spans the last hour in 10-minute intervals. Its spatial resolution is eight times that of GFS, so it is reduced using a convolutional encoder. Unfortunately, this experiment's radar dataset only contains sufficiently reliable data for France and the Czech Republic, making it necessary to inform the model about these local limitations in the metadata. Training loss is weighted toward radar-covered areas to motivate the use of radar for predicting APCP, and evaluation is performed solely for these areas.
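A loss weighted toward radar-covered areas, and an evaluation restricted to them, could look like the sketch below. The weight value of 4.0, the function names, and the toy arrays are assumptions; the text does not give the actual weighting scheme.

```python
import numpy as np

def weighted_mae(pred, target, radar_mask, radar_weight=4.0):
    """MAE in which radar-covered cells count `radar_weight` times more."""
    weights = np.where(radar_mask, radar_weight, 1.0)
    return float((weights * np.abs(pred - target)).sum() / weights.sum())

def masked_mae(pred, target, radar_mask):
    """MAE evaluated only on radar-covered cells."""
    return float(np.abs(pred - target)[radar_mask].mean())

# Toy 2x2 grid; True marks cells with reliable radar coverage.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[0.0, 2.0], [3.0, 2.0]])
radar_mask = np.array([[True, False], [False, True]])

print(weighted_mae(pred, target, radar_mask))  # 1.2
print(masked_mae(pred, target, radar_mask))    # 1.5
```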
Similarly to radar, satellite data covers the last hour in 10-minute intervals. We have selected seven infrared channels based on a statistical quality check. This data has four times the spatial resolution of GFS, requiring another convolutional encoder to reduce it.
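To make the resolution factors concrete, the sketch below reduces a field by a factor of 8 (radar) or 4 (satellite) with fixed block averaging. This is only a stand-in: the encoders described above are learned convolutional networks, and the function name `block_mean` and the toy grids are assumptions.

```python
import numpy as np

def block_mean(field, factor):
    """Reduce a 2-D field by `factor` in each dimension using block averages."""
    h, w = field.shape
    if h % factor or w % factor:
        raise ValueError("field dimensions must be divisible by factor")
    return field.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Toy inputs covering the same 2x2 patch of GFS cells.
radar = np.arange(16 * 16, dtype=float).reshape(16, 16)  # 8x finer than GFS
satellite = np.ones((8, 8))                              # 4x finer than GFS

print(block_mean(radar, 8).shape)      # (2, 2)
print(block_mean(satellite, 4).shape)  # (2, 2)
```

Both sources end up on the same coarse grid, which is what allows them to be combined with the GFS fields.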
We have decided on this data selection based on extensive experimentation (detailed in the following section). Each of the three variables is predicted separately, allowing for different configurations and straightforward inclusion of other predicted outputs in the future. Notably, APCP is forecasted for 3 hours into the future, while TMP and GUST forecasts reach 6 hours. This tailored approach leverages the strengths of each available data source to optimize prediction accuracy for different weather variables.
Results
Following the objective of AIREN-NWP to post-process NWP forecasts, we focus on the relative improvement of MAE (mean absolute error). This metric indicates by what factor the MAE of AIREN-NWP forecasts is lower than that of a baseline. ERA5 reanalysis is used as the reference for MAE computation of both the examined model and the baseline. The relative improvement should increase towards the event, that is, as the time the forecast is issued draws nearer to the time it predicts.
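As a small worked example of the metric, the sketch below computes the factor by which a model's MAE is lower than a baseline's, with both measured against the same reference. The function names and all numbers are made up for illustration and are not results from this study.

```python
def mae(pred, truth):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def relative_improvement(model_pred, baseline_pred, truth):
    """Factor by which the model MAE is lower than the baseline MAE."""
    return mae(baseline_pred, truth) / mae(model_pred, truth)

# Hypothetical temperatures in degrees Celsius.
era5 = [10.0, 12.0, 11.0]       # reference
gfs = [12.0, 15.0, 10.0]        # baseline, MAE = 2.0
model = [11.0, 13.0, 11.0]      # post-processed, MAE = 2/3

print(round(relative_improvement(model, gfs, era5), 2))  # 3.0
```

A value above 1 means the post-processed forecast is closer to the reference than the baseline is.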
The most essential baseline is the input GFS, which has a 3-h prediction step, in contrast to the 1-h step of AIREN-NWP. The most recent AIREN-NWP predictions are aggregated for these comparisons to match the GFS ones. It must be noted that this approach is disadvantageous to the AIREN-NWP, as the evaluated variable differs from the one used in training. This disadvantage may be seen in the following evaluations, where the improvement of APCP prediction stagnates against the GFS but increases towards the event against our GFS post-processing baseline.
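For accumulated precipitation, matching the 3-h GFS step could amount to summing consecutive hourly values, as sketched below. Summation is an assumption that fits an accumulated quantity such as APCP; the text does not state how TMP and GUST are aggregated, so they are not covered here. The function name and the values are hypothetical.

```python
def aggregate_hourly(hourly, step=3):
    """Sum consecutive hourly accumulations into `step`-hour totals."""
    if len(hourly) % step:
        raise ValueError("number of hourly values must be a multiple of step")
    return [sum(hourly[i:i + step]) for i in range(0, len(hourly), step)]

# Hypothetical hourly precipitation amounts in mm for a 6-hour window.
hourly_apcp = [0.5, 1.0, 0.25, 0.0, 2.0, 0.5]
print(aggregate_hourly(hourly_apcp))  # [1.75, 2.5]
```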
We have created three models, one for each target variable, with performance summarized in the following plot. Precipitation (APCP) is predicted 3 hours into the future, and the model is updated every 30 minutes, achieving up to 1.4x lower MAE than that of the GFS. The temperature (TMP) and wind gusts (GUST) are predicted 6 hours into the future, with the model updated every 60 minutes. They are respectively achieving up to 2.1x and 2.5x improvement in MAE, and the improvement is increasing towards the event.
To explore the benefits of using real-time measurements and how AIREN-NWP models perform in their native 1-h output granularity, we compare them to our GFS post-processing baselines. These models are trained with the same training setup and using the same input samples, just a different combination of the real-time measurements (SYNOP, radar, and satellite) on the input.
The plots above show a clear correlation between the number of input data sources and the improvement in prediction performance. Moreover, the relative MAE improvements always increase towards the event, as is expected of AIREN-NWP. For the last prediction before the event, the models with all inputs available achieve respectively 4.7%, 30.8%, and 6.7% improvements over our GFS post-processing baseline.
AIREN-NWP has the potential to rapidly update its predictions due to the availability of real-time data sources like radar and satellite measurements. To explore this capability, we tested different update intervals for precipitation (APCP) predictions, specifically 10, 30, and 60 minutes. The achieved errors are very close, within a 1% margin of the base 60-min updates, with the 10-min updates falling slightly behind. As a result, we consider the 30-minute update interval optimal for the current generation of AIREN-NWP models.
We are considering two hypotheses as to why more frequent 10-minute updates are not beneficial. Since the prediction step is 1 hour, the new information available in every 10-minute measurement might be negligible for improving the lower-resolution 1-hour forecast. Alternatively, the current architecture might need some tweaks to utilize information updates for 1-hour predictions at such a rapid pace more effectively. Further investigation into this aspect will be done in future work to enable forecast updates with every new measurement available.
The choice of 3, 6, and 6-hour prediction horizons for APCP, TMP, and GUST, respectively, was the result of a combination of data analysis and practical considerations. We initially experimented with models using only SYNOP data as real-time input, a 60-minute update step, and a 14-hour prediction horizon. This exploration revealed that, for temperature, the improvement gained with each model update grows as the predicted event approaches. Based on this observation, a 6-hour horizon was chosen to focus on maximizing accuracy within this timeframe, which is also the most interesting from the product point of view. The GUST model exhibits a more linear behavior, but the same 6-hour horizon was adopted for consistency and ease of use.
For precipitation (APCP), however, the analysis indicated smaller improvement as the predicted event draws nearer, and storm prediction remains a well-known challenge. Furthermore, accurate precipitation measurements are among the less readily available data, with synoptic stations often only offering longer timespan aggregates. Consequently, a 3-hour horizon was selected, with a future focus on enhancing the resolution of precipitation forecasts rather than extending this prediction horizon.
Prediction Examples
This section showcases sample AIREN-NWP predictions for each of the three target variables from the forecasting point of view. The ERA5 reanalysis is used as ground truth in this section as well.
APCP
On 23rd December 2023, a waving cold front lay over Central Europe, and precipitation was moving in from the west along this boundary. The 3-h aggregated AIREN-NWP forecasts show that the peak precipitation amounts of the GFS model are correctly adjusted.
Moreover, AIREN-NWP increases output granularity to 1 hour, delivering a more detailed view of weather development in the studied situation.
TMP
While spring weather prevailed in Central Europe on 28th February 2024, a ridge of high pressure began to move into the region from the west, near a dissipating cold front over eastern Central Europe. The warmer air flowing into Hungary and Slovakia contrasted with colder air penetrating the regions of France and Germany.
The AIREN-NWP forecast is similar to the ERA5 reanalysis, with the only major mistake of predicting higher temperatures in southern Poland at 12 UTC. AIREN-NWP correctly lowered the temperatures of the input GFS forecast in the area over Germany and France and increased temperatures in Hungary, closing the gap to the ERA5.
GUST
Typical calm spring weather prevailed over Central Europe on 24th February, one of those situations in which accurate forecasts are very important, for example, in the energy sector. The AIREN-NWP model better estimated the higher wind speed in France and correctly reduced the speed in the Baltic region and north-eastern Poland at 12 UTC. The input GFS model predicted almost no wind in Central Europe, while AIREN-NWP increased the speeds toward the actual moderate winds present in ERA5. While the situation at 15 UTC is very similar, it has to be noted that AIREN-NWP slightly overshoots the speeds in France. At 18 UTC, AIREN-NWP corrects the wind speeds that are too high in France and the Baltic region while missing the gusts in the Alpine region in the same way as GFS does.
Conclusion
This work demonstrates the potential of fusing AI and NWP models to enhance weather forecasts for the immediate future by incorporating real-time data. While AIREN-NWP has been developed to predict precipitation, temperature, and wind gusts based on GFS forecasts, its core approach can be tailored to any combination of an NWP model and real-time weather measurements.
AIREN-NWP achieves:
- Improved Accuracy - It lowers the MAE of the input GFS forecasts by up to 1.4x, 2.1x, and 2.5x for APCP, TMP, and GUST prediction, respectively.
- Fusion - AIREN-NWP successfully combines diverse data sources with varying spatial and temporal resolutions, leveraging their strengths to improve forecast accuracy.
- Temporal Resolution Increase - It offers 1-hour predictions, improving the 3-hour step of GFS.
- Rapid Updates - The model continuously improves its predictions with each recompute, incorporating newly acquired weather data to stay aligned with the latest observations, ensuring the forecasts remain accurate and relevant.
Looking ahead, the development roadmap of AIREN-NWP includes integration of orographic data, model updates with every new measurement, adaptation to a regional NWP input, and architecture optimization to utilize the input data to its maximum.