Hidden Stories: Topic Modeling in Hydrology Literature

This is a Preprint and has not been peer reviewed. This is version 5 of this Preprint.




Mashrekur Rahman, Jonathan Frame, Jimmy Lin, Grey Stephen Nearing


Hydrologic research generates large volumes of peer-reviewed literature across a number of evolving sub-topics. It is becoming increasingly difficult for scientists and practitioners to synthesize this full body of literature. This study explores topic modeling, a form of unsupervised learning, applied to 42,154 article abstracts from six high-impact (Impact Factor > 0.9) journals: Water Resources Research (WRR), Hydrology and Earth System Sciences (HESS), Journal of Hydrology (JH), Hydrological Processes (HP), Hydrological Sciences Journal (HSJ), and Journal of Hydrometeorology (JHM). The aim is to provide a high-level contextual analysis of hydrologic science literature since 1991. We used a hybrid objective-subjective approach to label a number of broad topics in this body of literature, and used these labeled topics to analyze topic trends, inter-topic relationships, and journal diversity. As an example of what we can learn from this type of analysis, results showed that data-driven research topics are gaining popularity, while some subsurface-related topics appeared to lose popularity within our journal set and time period. While no journal in our sample was completely homogeneous, JHM and WRR exhibited the most notable preferences for certain topics over others. The methods and outcomes of this paper are potentially beneficial to scientists and researchers who aim to gain a contextual understanding of the existing state of hydrologic science literature. In the long term, we see topic modeling as a tool to help increase the efficiency of literature reviews, science communication, and science-informed policy and decision making.




Artificial Intelligence and Robotics, Computer Sciences, Earth Sciences, Hydrology, Library and Information Science, Physical Sciences and Mathematics, Social and Behavioral Sciences


machine learning, Unsupervised learning, hydrology, natural language processing, topic modeling, Hydrology Literature, Science Communication


Published: 2020-05-25 12:19

Last Updated: 2021-07-14 16:35


Academic Free License (AFL) 3.0