Google Uses Language Models to Predict Flash Floods
Flash floods are among the deadliest weather events worldwide, claiming over 5,000 lives annually. Despite their destructiveness, they are difficult to predict accurately. Google believes it has found a way to improve those predictions by mining news articles.
Traditional weather data collection methods fall short for flash floods because the events are short-lived and localized. The resulting data gap leaves deep learning models with few recorded examples to learn from, hindering their ability to forecast flash floods effectively.
Google researchers used Gemini, the company’s large language model, to analyze 5 million news articles from around the world. They isolated reports of 2.6 million floods and turned them into a geo-tagged time series named “Groundsource” to help fill the data gap. It is the first time Google has used language models for weather prediction, according to Gila Loike, a Google Research product manager. The research findings and dataset were made public on Thursday.
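To make the idea of a geo-tagged time series concrete, here is a minimal sketch in Python. It is not Google's pipeline: it assumes a language model has already extracted a date and coordinates from each article. The `FloodReport` schema, the 0.25-degree grid, and the rule that several articles about the same place and day count as one event are all illustrative assumptions.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FloodReport:
    """One flood mention extracted from a news article (hypothetical schema)."""
    article_id: str
    event_date: date
    lat: float
    lon: float


def grid_cell(lat, lon, cell_deg=0.25):
    """Snap a coordinate to the integer index of a cell_deg x cell_deg grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_time_series(reports, cell_deg=0.25):
    """Group reports into a per-cell, date-sorted list of flood days.

    Assumption: several articles describing the same cell on the same
    day refer to one flood, so they are collapsed into a single event.
    """
    events = {(grid_cell(r.lat, r.lon, cell_deg), r.event_date) for r in reports}
    series = defaultdict(list)
    for cell, day in sorted(events):
        series[cell].append(day)
    return dict(series)
```

Each key in the result is a grid cell and each value is the list of days on which at least one article reported a flood there, which is the kind of labeled history a forecasting model could be trained on.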
Building on the Groundsource dataset, the researchers developed a model based on a Long Short-Term Memory (LSTM) neural network. This model processes global weather forecasts to generate flash flood probabilities in specific regions.
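For readers unfamiliar with LSTMs, the sketch below shows in plain Python how such a network can turn a sequence of forecast values into a single probability. It illustrates the general technique, not Google's model: the input features, the hidden size, and the sigmoid output layer are assumptions, and the weights are random rather than trained, so the output carries no real meaning.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, params):
    """Run one LSTM time step.

    x: input features for this step; h, c: hidden and cell state.
    params[gate] is a (rows, biases) pair with one weight row per
    hidden unit, applied to the concatenation of x and h.
    """
    z = x + h  # list concatenation: [inputs, previous hidden state]

    def gate(name, activation):
        rows, biases = params[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(rows, biases)]

    i = gate("input", sigmoid)         # how much new information to write
    f = gate("forget", sigmoid)        # how much old cell state to keep
    o = gate("output", sigmoid)        # how much cell state to expose
    g = gate("candidate", math.tanh)   # proposed new cell content
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def flood_probability(forecast_steps, params, w_out, b_out):
    """Feed a forecast sequence through the LSTM and squash the final
    hidden state into a value between 0 and 1."""
    hidden = len(w_out)
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in forecast_steps:
        h, c = lstm_step(x, h, c, params)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)) + b_out)


def random_params(n_in, hidden, seed=0):
    """Untrained placeholder weights, for illustration only."""
    rng = random.Random(seed)

    def layer():
        rows = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + hidden)]
                for _ in range(hidden)]
        return rows, [0.0] * hidden

    params = {name: layer() for name in ("input", "forget", "output", "candidate")}
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return params, w_out, 0.0
```

Calling `flood_probability` with a list of per-step feature lists, for example hypothetical scaled rainfall and soil-moisture values, returns a number strictly between 0 and 1. In a real system the weights would be learned from labeled flood history rather than drawn at random.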
Google’s flash flood forecasting model now identifies risks in urban areas across 150 countries on the Flood Hub platform. The data is also shared with emergency response agencies worldwide to help them respond to floods more quickly. António José Beleza, an emergency response official at the Southern African Development Community, commended the model for enhancing the organization’s flood response capabilities.
The model has limitations. Its resolution is low and, unlike the US National Weather Service’s flood alert system, it does not integrate real-time radar data. However, Google’s model is aimed at regions that lack advanced weather infrastructure or extensive meteorological records.
Juliet Rothenberg, a program manager on Google’s Resilience team, emphasized the significance of the Groundsource dataset in expanding forecasting capabilities to underrepresented regions. The team envisions applying language models to develop datasets for forecasting phenomena like heatwaves and mudslides.
Marshall Moutenot, CEO of Upstream Tech, applauded Google’s contribution to data aggregation for deep learning-based weather forecasting. Moutenot, who co-founded dynamical.org, a platform curating machine learning-ready weather data, highlighted the challenges of data scarcity in geophysics and praised Google’s innovative approach to addressing this issue.

