A Comprehensive Evaluation of Multimodal Large Language Models in Hydrological Applications

Likith Kadiyala; Omer Mermer; Dinesh Jackson Samuel; Yusuf Sermet; Ibrahim Demir

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Likith Kadiyala, Omer Mermer, Dinesh Jackson Samuel, Yusuf Sermet, Ibrahim Demir

Abstract

Large Language Models (LLMs) combined with visual foundation models have advanced remarkably, reaching a level of intelligence comparable to human capabilities. In this study, we analyze the latest Multimodal LLMs (MLLMs), specifically Multimodal-GPT, GPT-4 Vision, Gemini, and LLaVA, focusing on their application in the hydrology domain. Hydrology is highly relevant for AI applications, including flood management and response, water level monitoring, agricultural water discharge, and water pollution management. We test these MLLMs on various hydrology-specific studies, evaluate their response generation, and assess their suitability for real-time systems. We deliberately selected complex real-world scenarios to explore the potential of MLLMs in addressing hydrological challenges. Additionally, we carefully designed prompts to enhance the models' visual inference capabilities and their ability to comprehend context from image data. The findings reveal effective human-computer interaction and point toward potential solutions for real-world hydrological inference systems that incorporate both textual and image data. Among the evaluated models, GPT-4 Vision is the top performer, showing the strongest proficiency in inferring visual data. The results highlight the understanding, reasoning, and decision-making capabilities that multimodal foundation models bring to hydrology. This research contributes insights into the potential applications of advanced AI models in addressing complex challenges within hydrological contexts.

DOI

https://doi.org/10.31223/X5TQ37

Subjects

Artificial Intelligence and Robotics, Environmental Monitoring, Hydrology

Keywords

Large Language Models (LLMs), hydrology, Intelligent Assistants, Multimodal LLMs

Dates

Published: 2024-05-25 02:12

Last Updated: 2024-05-25 09:12

License

CC BY Attribution 4.0 International
