Implementing machine learning to establish a relationship between coal ash spread and lined vs. unlined sites using publicly available data

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.





May Ming, Sutharsika Kumar Kalaiselvi


Fuel combustion in coal power plants produces a significant amount of waste, called coal ash, which is often stored in slurry basins. Because of poor maintenance and a lack of proper regulation, coal ash ponds are prone to contaminating nearby groundwater sources. Without a simple way to ascertain whether the drinking water and soil near a private residential area are contaminated, residents remain unaware of the environmental risk surrounding them. To address this issue, this study aims to establish a correlation between publicly available heavy metal concentrations in soil and the locations of coal ash sites. A user-friendly map will then be created so that residents can determine the risk level of their location. To establish the correlation, four regression models and two classification models were implemented. Of these, the Support Vector Machine (SVM) proved to be the most accurate in risk prediction, with the Mean Squared Error (MSE) falling as low as 0.01 in some cases. Running the models to compare the risks between lined and unlined coal ash ponds showed that contamination levels surrounding unlined ponds were significantly greater than those near lined ponds. The results of this study are intended to have a direct and positive impact on the community.
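For readers unfamiliar with the error metric quoted above, the following is a minimal sketch of how a Mean Squared Error is computed. It is not the authors' code: the function name `mean_squared_error` and the sample values are hypothetical, chosen only to illustrate the formula, and they are not data from the study.

```python
from statistics import mean


def mean_squared_error(y_true, y_pred):
    """Return the average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Hypothetical, made-up risk scores for illustration only (NOT data from the study).
observed = [0.10, 0.40, 0.35, 0.80]
predicted = [0.20, 0.30, 0.45, 0.70]

mse = mean_squared_error(observed, predicted)
print(f"MSE = {mse:.4f}")
```

In this made-up example every prediction is off by 0.1, so the MSE is approximately 0.01. An MSE is expressed in squared units of the target variable, so a value of this size is only meaningful relative to the scale of the quantity being predicted.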



Physical Sciences and Mathematics


Environmental sustainability, machine learning, environmental modeling, sustainability


Published: 2024-06-12 07:31

Last Updated: 2024-06-12 14:31


No Creative Commons license