Predicting Harmful Algal Blooms Using Ensemble Machine Learning Models and Explainable AI Technique: A Comparative Study

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Omer Mermer, Eddie Zhang, Ibrahim Demir

Abstract

Harmful Algal Blooms (HABs), driven by environmental pollution, pose significant threats to water quality, public health, and aquatic ecosystems. This study aims to enhance the prediction of HABs in Lake Erie, part of the Great Lakes system, by utilizing ensemble machine learning (ML) models coupled with explainable artificial intelligence (XAI) for interpretability. Using water quality data from 2013 to 2020, various physical, chemical, and biological parameters were analyzed to predict chlorophyll-a (Chl-a) concentrations, a proxy for algal blooms. The study employed multiple ensemble ML models, including Random Forest (RF), Deep Forest (DF), Gradient Boosting (GB), and XGBoost, and compared their performance against individual models such as Support Vector Machine (SVM), Decision Tree (DT), and Multi-Layer Perceptron (MLP). The findings reveal that ensemble models, particularly XGBoost and DF, achieve superior predictive accuracy, with R² values of 0.8517 and 0.8544, respectively. The application of SHapley Additive exPlanations (SHAP) provided insights into the relative importance of input features, identifying Particulate Organic Nitrogen (PON), Particulate Organic Carbon (POC), and Total Phosphorus (TP) as critical factors influencing Chl-a concentrations. This research demonstrates the effectiveness of integrating ensemble ML models with XAI to improve HAB prediction accuracy and interpretability. The results support the development of proactive water quality management strategies and highlight the potential of advanced ML techniques in environmental monitoring.
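
To make the feature-attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values computed by brute force over every feature ordering. It is not the authors' pipeline and does not use the SHAP library, which relies on faster approximations. The function `toy_chl_a`, its coefficients, the zero baseline, and the sample values are all hypothetical and chosen only for illustration; only the feature names PON, POC, and TP come from the abstract.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Averages each feature's marginal contribution over every feature
    ordering, so it is only feasible for a handful of features.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start with every feature at its baseline
        previous = predict(current)
        for name in order:
            current[name] = x[name]  # reveal this feature's actual value
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: totals[name] / len(orderings) for name in names}


# Hypothetical toy model: NOT the fitted model from the study.
def toy_chl_a(f):
    return (2.0 * f["PON"] + 1.0 * f["POC"] + 0.5 * f["TP"]
            + 0.1 * f["PON"] * f["TP"])


# Assumed baseline and sample values, for illustration only.
baseline = {"PON": 0.0, "POC": 0.0, "TP": 0.0}
sample = {"PON": 1.0, "POC": 2.0, "TP": 4.0}

phi = shapley_values(toy_chl_a, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the attributions sum to the difference between the prediction for the sample and the prediction for the baseline, and the PON-TP interaction term is split evenly between those two features. That additivity is the property that lets per-feature contributions be read as shares of a single prediction.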

DOI

https://doi.org/10.31223/X5370R

Subjects

Engineering, Environmental Engineering

Keywords

Ensemble machine learning, algal bloom, chlorophyll-a, explainable AI, water quality

Dates

Published: 2024-11-01 02:57

Last Updated: 2024-11-01 09:57

License

CC BY Attribution 4.0 International