This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.1080/02626667.2025.2513482. This is version 1 of this Preprint.

Large-sample characterization of flooding events in India
Abstract
Effective flood management requires a robust understanding of past floods. In India, such understanding is largely limited to case studies because no standardized dataset of observed floods exists. We address this gap by presenting a national dataset of 7500 flooding events, developed by merging observed streamflow records with official flooding thresholds and augmented with multiple catchment-scale variables. Spatial analysis reveals high normalized flood magnitudes along the southwest coast, an area with intense rainfall and mountainous terrain. Temporally, 86% of floods occur during the southwest monsoon. Using a random forest model combined with the game-theoretic SHAP approach, we find that precipitation of the wettest month is the most influential predictor of flood magnitude. Grouped feature importance shows that climatology contributes 61% to model performance, while geomorphology accounts for 39%. This large-sample study goes beyond conventional case studies, providing a more robust understanding of flooding patterns and drivers across India.
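The step of merging streamflow records with flooding thresholds can be illustrated with a minimal sketch. It assumes a daily streamflow series and a single flooding threshold for one gauge, treats each unbroken run of values at or above the threshold as one event, and normalizes the peak by the threshold. The names `FloodEvent` and `extract_flood_events`, the event definition, and the normalization are illustrative assumptions, not the procedure used in the preprint.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FloodEvent:
    start: int              # index of the first day at or above the threshold
    end: int                # index of the last day at or above the threshold
    peak_flow: float        # maximum streamflow within the event
    normalized_peak: float  # peak_flow / threshold (assumed normalization)


def _make_event(flow: Sequence[float], start: int, end: int,
                threshold: float) -> FloodEvent:
    """Summarize one run of threshold exceedance as a FloodEvent."""
    peak = max(flow[start:end + 1])
    return FloodEvent(start, end, peak, peak / threshold)


def extract_flood_events(flow: Sequence[float],
                         threshold: float) -> List[FloodEvent]:
    """Return one event per unbroken run of flow >= threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    events: List[FloodEvent] = []
    start = None  # index where the current exceedance run began, if any
    for i, q in enumerate(flow):
        if q >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended on the previous day.
            events.append(_make_event(flow, start, i - 1, threshold))
            start = None

    # Close a run that is still open at the end of the record.
    if start is not None:
        events.append(_make_event(flow, start, len(flow) - 1, threshold))
    return events


if __name__ == "__main__":
    daily_flow = [10.0, 60.0, 80.0, 40.0, 55.0, 20.0]  # made-up values
    for event in extract_flood_events(daily_flow, threshold=50.0):
        print(event)
```

A real workflow would also need to handle missing observations and decide whether exceedances separated by only a few days belong to the same event; both choices are left out of this sketch.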
DOI
https://doi.org/10.31223/X58Q7S
Subjects
Engineering, Physical Sciences and Mathematics
Keywords
floods, India, Random Forest, data-driven hydrology, machine learning, streamflow
Dates
Published: 2025-05-31 19:12
Last Updated: 2025-05-31 19:12