This is version 1 of a preprint that has not been peer reviewed.
HydroScholar AI: A Collaborative Agent for End-to-End Automated Hydrological Research Lifecycle
Authors
Abstract
Hydrological research relies on multi-stage computational workflows that are often slow, fragmented across disparate tools, and inconsistently documented, limiting reproducibility. This study presents HydroScholar AI, an agentic, human-in-the-loop platform that consolidates the plan-to-paper research lifecycle into a single interactive automated framework. From a natural-language prompt, the system proposes a stepwise research plan for researcher approval, translates it into executable Python files within an integrated editor, provides debugging and re-execution support, generates visualizations, and drafts a manuscript. The workflow includes an automated provenance framework that generates and records the entire human-AI decision path, including prompts, approvals, iterative code edits, model identifiers, execution events, and file diffs, to support transparency and auditability. The system is demonstrated through an author-conducted case study: a five-year (2019-2023) daily streamflow analysis for USGS station 05454500 (Iowa River at Iowa City), computing annual mean flow, 7-day low flow, and peak-flow dates and producing a baseline manuscript of the study. The case study shows that consolidating planning, coding, execution, and drafting in one workspace enables progression from an initial prompt to a runnable analysis and baseline manuscript within a single auditable session, while the provenance framework renders the human-AI decision path fully traceable. Expert review remained essential for methodological choices, such as validation beyond missing-value checks, and for hydrologic interpretation of results. HydroScholar AI illustrates how agentic large language models can handle routine analytical tasks without displacing expert judgment, and how capturing the provenance of human-AI collaboration can strengthen reproducibility in computational hydrology.
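The three case-study metrics named in the abstract (annual mean flow, 7-day low flow, and peak-flow date) can be illustrated with a minimal standard-library sketch. This is not the authors' experiment script: the function name `annual_metrics`, the calendar-year grouping, the rule that discards any 7-day window containing a gap, and the synthetic input series are all assumptions made for illustration. A real analysis would load the daily record for USGS station 05454500 instead of the generated values.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def annual_metrics(daily_flow):
    """Summarise a {date: discharge} mapping into per-calendar-year metrics.

    Assumptions (not taken from the preprint): days whose value is None are
    treated as missing, and a 7-day window counts only if all seven
    consecutive days are present within the same calendar year.
    """
    by_year = defaultdict(dict)
    for day, q in daily_flow.items():
        if q is not None:
            by_year[day.year][day] = q

    results = {}
    for year, flows in sorted(by_year.items()):
        days = sorted(flows)

        # 7-day low flow: minimum of the means over complete 7-day windows.
        window_means = []
        for first in days:
            window = [first + timedelta(days=k) for k in range(7)]
            if all(d in flows for d in window):
                window_means.append(mean(flows[d] for d in window))

        results[year] = {
            "annual_mean": mean(flows.values()),
            "low_flow_7day": min(window_means) if window_means else None,
            "peak_date": max(days, key=lambda d: flows[d]),
            "n_days": len(days),  # simple completeness check
        }
    return results


# Synthetic one-year series (hypothetical values, not USGS data).
start = date(2021, 1, 1)
synthetic = {start + timedelta(days=i): 100.0 + (i % 30) for i in range(365)}
synthetic[date(2021, 6, 15)] = 900.0  # artificial peak

summary = annual_metrics(synthetic)
for year, metrics in summary.items():
    print(year, metrics)
```

Returning `n_days` alongside each metric makes incomplete years visible, which is the kind of missing-value check the abstract mentions. Further validation and the hydrologic interpretation of the results remain matters for expert review.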
DOI
https://doi.org/10.31223/X5SV05
Subjects
Computer Sciences, Earth Sciences, Environmental Sciences, Physical Sciences and Mathematics
Keywords
Agentic AI, Large language models, Hydrology workflows, Provenance, Human-in-the-loop, Integrated research environment, Computational reproducibility
Dates
Published: 2026-04-10 00:20
Last Updated: 2026-04-10 00:20
License
Creative Commons Attribution 4.0 International (CC BY 4.0)
Additional Metadata
Conflict of interest statement:
None
Data Availability:
The provenance log, experiment script, experiment plan, and requirements file from the Iowa River case study are provided as supplementary material appended to the manuscript to substantiate the operational metrics reported in Section 4.5 and to support auditability of the demonstrated workflow and downstream reproducibility of the analysis outputs.