This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
From Empirical Curves to AI-Derived Rainfall Thresholds for Landslide Initiation in Peninsular Malaysia
Authors
Abstract
Rainfall-induced landslides are a persistent hazard in Malaysia, yet existing rainfall thresholds remain largely based on empirical methods and often lack regional adaptability. This study develops machine learning (ML)-based rainfall thresholds for landslide initiation in Peninsular Malaysia. A dataset of rainfall events from 70 rainfall stations across Peninsular Malaysia, linked with 79 documented landslides, was analysed using key predictors such as event cumulative rainfall (ECR), maximum and mean intensity, duration, and antecedent rainfall windows (3–20 days). Two state-of-the-art gradient boosting algorithms, CatBoost and XGBoost, were trained to classify rainfall events as landslide-triggering or non-landslide-triggering. Model performance was evaluated using the confusion matrix, accuracy, precision, recall, F1-score, and ROC-AUC. In addition, SHAP explainability analysis was applied to assess the relative importance of rainfall metrics in threshold exceedance. CatBoost showed superior practical reliability, with a higher accuracy (0.83) and recall (0.67) than XGBoost, which achieved a higher ROC-AUC (0.876) but a substantially lower recall (0.33). These findings indicate that ML-derived rainfall thresholds for Peninsular Malaysia offer a more flexible and reliable basis for early warning systems, supporting landslide risk management in Malaysia.
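The evaluation metrics named in the abstract can all be read off a binary confusion matrix. The sketch below is only an illustration of the standard definitions, not the authors' code: the function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study. It shows why recall (the share of landslide-triggering events that are actually flagged) can differ sharply from accuracy when non-triggering events dominate.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: triggering events correctly flagged
    fp: non-triggering events wrongly flagged
    fn: triggering events missed
    tn: non-triggering events correctly left unflagged
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (NOT results from the study).
metrics = classification_metrics(tp=8, fp=4, fn=2, tn=36)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For an early warning system, a missed landslide (a false negative) is usually the costlier error, which is one reason a model with higher recall may be preferred even when another model ranks events better overall, as measured by ROC-AUC.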
DOI
https://doi.org/10.31223/X58F4W
Subjects
Physical Sciences and Mathematics
Keywords
Artificial Intelligence, rainfall threshold, landslide prediction, Peninsular Malaysia
Dates
Published: 2025-12-09 12:29
Last Updated: 2025-12-09 12:29
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
None
Data Availability (Reason not available):
Data cannot be publicly shared because some rainfall and landslide datasets are restricted under institutional and governmental policies.