Never train an LSTM on a single basin

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.



Frederik Kratzert, Martin Gauch, Daniel Klotz, Grey Nearing 


Machine learning (ML) plays an increasing role in the hydrological sciences, and certain time series modeling strategies have become especially popular for rainfall-runoff modeling. A large majority of studies that use this type of model do not follow best practices, and one mistake is particularly common: training deep learning models on small, homogeneous data sets (i.e., data from one or a small number of watersheds). In this position paper, we explain why it is not a good idea to train a Long Short-Term Memory (LSTM) model on data from a single watershed. Instead, deep learning streamflow models perform best when trained on a large amount of hydrologically diverse data.



Subjects: Artificial Intelligence and Robotics, Computer Sciences, Earth Sciences, Hydrology, Physical Sciences and Mathematics


Keywords: hydrology, LSTM, deep learning, machine learning


Published: 2023-12-05 18:03

Last Updated: 2023-12-05 23:03


CC BY Attribution 4.0 International