This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Comparing Process-Based and Machine Learning Models for Streamflow Prediction in the Kaligandaki River Basin, Nepal
Authors
Abstract
Reliable daily streamflow prediction is critical for hydropower operations, flood risk management, and irrigation planning in monsoon-dominated Himalayan river basins. While both process-based and machine learning (ML) approaches have been used for such tasks, systematic comparisons that decompose the sources of performance differences remain scarce. This study evaluates seven configurations for streamflow prediction in the Kaligandaki River Basin, Nepal: a process-based SWAT model, three XGBoost (XGB) models, and three Random Forest (RF) models, with the ML models representing pure rainfall-runoff, observed-lag, and recursive-simulation scenarios. An NSE gap decomposition framework is applied to quantify two distinct components of the information value of lagged discharge: the watershed memory benefit and the recursive error propagation cost. During the five-year independent test period, SWAT achieved NSE = 0.851, while the pure rainfall-runoff models XGB-A and RF-A achieved NSE = 0.840. Observed-lag upper-bound models reached NSE values above 0.94. RF recursive simulation (RF-C) achieved NSE = 0.861, exceeding SWAT, whereas XGB recursive simulation showed no improvement (XGB-C: NSE = 0.838), revealing a strong algorithm-dependent sensitivity to recursive error propagation. Flow duration curve analysis shows that SWAT underestimates low flows (>60% exceedance probability) despite near-zero total PBIAS (−0.28%), reflecting compensating biases between high- and low-flow regimes. SHAP analysis identifies the Antecedent Precipitation Index (API) as the dominant precipitation predictor, with 30-day cumulative rainfall as the second-ranked feature, indicating that multi-week soil-moisture memory is the primary catchment-scale control on discharge in this large, monsoon-dominated basin. These results show that well-engineered pure rainfall-runoff models come close to SWAT in aggregate NSE (0.840 versus 0.851) while substantially outperforming it in distributional flow reproduction.
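For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how NSE, PBIAS, and an NSE gap decomposition might be computed. It is not the authors' code. The function names are hypothetical, the PBIAS sign convention (positive means underestimation, as commonly used in SWAT studies) is an assumption, and the decomposition shown (memory benefit = observed-lag NSE minus rainfall-only NSE; propagation cost = observed-lag NSE minus recursive NSE) is one plausible reading of the framework, which the abstract does not define. Only the net gains at the end use values reported in the abstract.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 equals the mean of obs."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def pbias(obs, sim):
    """Percent bias. ASSUMED convention: positive values mean underestimation."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)


def nse_gap_decomposition(nse_rainfall_only, nse_observed_lag, nse_recursive):
    """ASSUMED decomposition of the value of lagged discharge.

    memory_benefit:   gain from feeding observed lagged discharge
    propagation_cost: loss when the lags are the model's own predictions
    net_gain:         memory_benefit - propagation_cost
    """
    memory_benefit = nse_observed_lag - nse_rainfall_only
    propagation_cost = nse_observed_lag - nse_recursive
    return {
        "memory_benefit": memory_benefit,
        "propagation_cost": propagation_cost,
        "net_gain": memory_benefit - propagation_cost,
    }


# Net gain of recursive simulation over the rainfall-only model,
# using the test-period NSE values reported in the abstract.
rf_net_gain = round(0.861 - 0.840, 3)   # RF-C minus RF-A
xgb_net_gain = round(0.838 - 0.840, 3)  # XGB-C minus XGB-A
print(rf_net_gain, xgb_net_gain)  # prints 0.021 -0.002
```

Under this reading, the net gain does not depend on the observed-lag NSE, so it can be computed from the abstract alone: about +0.021 for RF and about −0.002 for XGB. Splitting it into benefit and cost requires the exact observed-lag NSE values, which the abstract reports only as "above 0.94".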
DOI
https://doi.org/10.31223/X55N3N
Subjects
Civil and Environmental Engineering, Earth Sciences, Engineering, Environmental Sciences, Hydrology, Physical Sciences and Mathematics, Water Resource Management
Keywords
Streamflow prediction, SWAT model, Machine learning hydrology, XGBoost, Random Forest, Himalayan river basin, Kaligandaki River, Hydrological modeling, NSE decomposition, Recursive forecasting
Dates
Published: 2026-06-04 05:57
Last Updated: 2026-06-04 05:57
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
None
Data Availability:
The data used in this study are available from publicly accessible sources and referenced datasets. Processed datasets and model configurations can be shared upon reasonable request.