Hidden Stories: Topic Modeling in Hydrology Literature

This is a Preprint and has not been peer reviewed. This is version 5 of this Preprint.

Authors

Mashrekur Rahman, Jonathan Frame, Jimmy Lin, Grey Stephen Nearing

Abstract

Hydrologic research generates large volumes of peer-reviewed literature across a number of evolving sub-topics, and it is becoming increasingly difficult for scientists and practitioners to synthesize this full body of literature. This study explores topic modeling, a form of unsupervised learning, applied to 42,154 article abstracts from six high-impact (Impact Factor > 0.9) journals: Water Resources Research (WRR), Hydrology and Earth System Sciences (HESS), Journal of Hydrology (JH), Hydrological Processes (HP), Hydrological Sciences Journal (HSJ), and Journal of Hydrometeorology (JHM). The aim is to provide a high-level contextual analysis of hydrologic science literature since 1991. We used a hybrid objective-subjective approach to label a number of broad topics in this body of literature, and used these labeled topics to analyze topic trends, inter-topic relationships, and journal diversity. As an example of what can be learned from this type of analysis, the results showed that data-driven research topics are gaining in popularity, while some subsurface-related topics appeared to lose popularity within our journal set and time period. While no journal in our sample was completely homogeneous, JHM and WRR exhibited the most notable preferences for certain topics over others. The methods and outcomes of this paper are potentially beneficial to researchers who aim to gain a contextual understanding of the existing state of hydrologic science literature. In the long term, we see topic modeling as a tool to help increase the efficiency of literature reviews, science communication, and science-informed policy and decision making.
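
To make the kind of analysis described above more concrete, the following is a minimal sketch of how topic trends and journal diversity might be computed once a topic model has assigned topic weights to each abstract. Everything in it is an assumption for illustration: the toy records, the topic labels, the helper names `mean_topic_weights` and `shannon_entropy`, and the choice of Shannon entropy as a diversity measure are hypothetical, and the abstract does not state which model or measures the authors actually used.

```python
import math
from collections import defaultdict

# Hypothetical document-topic weights, e.g. the output of a fitted topic model.
# Each record: (journal, year, {topic_label: weight}); weights sum to 1 per abstract.
docs = [
    ("WRR", 1995, {"groundwater": 0.7, "data-driven": 0.1, "precipitation": 0.2}),
    ("WRR", 2015, {"groundwater": 0.4, "data-driven": 0.5, "precipitation": 0.1}),
    ("JHM", 1995, {"groundwater": 0.0, "data-driven": 0.1, "precipitation": 0.9}),
    ("JHM", 2015, {"groundwater": 0.0, "data-driven": 0.3, "precipitation": 0.7}),
]


def mean_topic_weights(records, key_index):
    """Average topic weights over all abstracts sharing a key (journal or year)."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for record in records:
        key = record[key_index]
        counts[key] += 1
        for topic, weight in record[2].items():
            sums[key][topic] += weight
    return {
        key: {topic: total / counts[key] for topic, total in topics.items()}
        for key, topics in sums.items()
    }


def shannon_entropy(distribution):
    """Entropy in nats; higher values mean topics are spread more evenly."""
    return -sum(p * math.log(p) for p in distribution.values() if p > 0)


# Topic trend: mean weight of each topic per publication year.
trend = mean_topic_weights(docs, key_index=1)

# Journal diversity: entropy of each journal's mean topic distribution.
by_journal = mean_topic_weights(docs, key_index=0)
diversity = {journal: shannon_entropy(dist) for journal, dist in by_journal.items()}
```

In this toy data the mean weight of the "data-driven" topic rises between the two years, and the journal whose abstracts concentrate on one topic receives the lower entropy score. Real inputs would come from a topic model fitted to the full set of abstracts rather than from hand-written weights.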

DOI

https://doi.org/10.31223/osf.io/2sy7a

Subjects

Artificial Intelligence and Robotics, Computer Sciences, Earth Sciences, Hydrology, Library and Information Science, Physical Sciences and Mathematics, Social and Behavioral Sciences

Keywords

machine learning, Unsupervised learning, hydrology, natural language processing, topic modeling, Hydrology Literature, Science Communication

Dates

Published: 2020-05-25 03:19

Last Updated: 2021-07-14 07:35

License

Academic Free License (AFL) 3.0