This is a Preprint and has not been peer reviewed. This is version 4 of this Preprint.

Machine Learning-based Hydrological Models for Flash Floods: A Systematic Literature Review
Abstract
Flash floods are critical events for emergency management, yet modeling them remains highly challenging, even within smart-city approaches. Physically based hydrological models are often unsuitable at small spatiotemporal scales because of their computational complexity and their dependence on detailed local parameters, which are rarely available during flash floods. With the growing availability of hydrological data, machine learning (ML) has emerged as a promising alternative. This work presents a Systematic Literature Review (SLR) of the research landscape on ML applications for flash flood forecasting, a significant subset of flash flood modeling. From more than 1,200 papers published up to January 2024 and indexed in Web of Science, SCOPUS/Elsevier, Springer/Nature, and Wiley, 50 were selected following PRISMA guidelines. The inclusion and exclusion criteria removed reviews, retracted papers, papers focused on post-flood damage assessment rather than forecasting, and studies with time resolutions of 6 hours or more, retaining only studies with fine-scale temporal data (<6 hours). For each paper, we extracted information on forecasting horizon, study area size, input data, ML techniques, and outcomes (regression or classification). Results show a sharp rise in ML-based flash flood research, with China leading (38%). Nearly all studies rely on rainfall, discharge, and water level data, often in combination. Long short-term memory (LSTM) networks dominate (60%). Unfortunately, only 10% of the selected studies provide access to their datasets. This lack of transparency is a major barrier to reproducibility, inhibits fair comparative evaluation of models, and ultimately slows methodological progress in flash flood forecasting. Furthermore, our review shows that no method consistently outperforms the others. This variability in performance is likely influenced by factors such as regional hydrological characteristics (e.g., differences between arid and tropical basins), variations in input data quality, and the length of the forecast horizon (e.g., 1-hour vs. 6-hour prediction). Lastly, we recommend advancing the field through integration with early warning systems, the creation of benchmarks, open data practices, and stronger multidisciplinary collaboration.
DOI
https://doi.org/10.31223/X5C699
Subjects
Hydrology
Keywords
Artificial Intelligence, machine learning, flash floods, floods
Dates
Published: 2024-07-01 02:10
Last Updated: 2025-09-28 14:53
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
None