A Performance Comparison of Unsupervised Machine Learning Algorithms for Clustering Water Depth Datasets at Urban Drainage Systems

Jiada Li; Simon Brewer

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Jiada Li, Simon Brewer

Abstract

As sensor measurements emerge in urban water systems, data-driven unsupervised machine learning algorithms have recently drawn tremendous interest for infrastructure monitoring, flow prediction, and pollutant warning. However, most of these algorithms have been applied to water distribution systems, and few studies have used unsupervised clustering analysis to group time-series hydraulic-hydrologic data from urban drainage systems. To improve the understanding of how clustering analysis contributes to detecting urban flooding events, this study compares the performance of K-means Clustering, Agglomerative Clustering, and Spectral Clustering in uncovering similarity in time-series water depth data and identifies the number of clusters that yields the best performance scores. The water depth datasets are simulated with a real-world SWMM model and then formatted for a clustering problem. Three standard performance evaluation scores, the SCI, CHI, and DBI, are used to assess clustering performance under six artificial rainfalls and two recorded storms. The results indicate that SCI and DBI are appropriate for assessing the performance of K-means Clustering and Agglomerative Clustering, while CHI only works for Spectral Clustering. Notably, the number of clusters is negatively related to the dataset length but less correlated with the dataset magnitude.
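To make the idea of choosing the number of clusters by a performance score concrete, the following is a minimal sketch and not the authors' workflow. It assumes that SCI refers to a silhouette-type score. It uses only K-means with a deterministic farthest-first initialization, and the "water depth" series are synthetic values rather than SWMM output. The names `farthest_first`, `kmeans`, `silhouette`, and `make_series` are hypothetical helpers introduced for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(points, k):
    """Deterministic initialization: repeatedly take the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(euclidean(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: euclidean(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the old center if a cluster empties
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient (assumed here to correspond to SCI)."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(euclidean(p, q))
        own = dists.pop(labels[i], [])
        if not own or not dists:
            scores.append(0.0)  # singleton cluster or only one cluster present
            continue
        a = sum(own) / len(own)                               # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in dists.values())      # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def make_series(level, n, rng):
    """Synthetic 4-step depth series around a given level (illustrative data only)."""
    return [tuple(level + rng.uniform(-0.05, 0.05) for _ in range(4)) for _ in range(n)]


rng = random.Random(42)
depths = make_series(0.2, 10, rng) + make_series(1.0, 10, rng) + make_series(2.0, 10, rng)

# Score each candidate cluster count and keep the one with the highest silhouette.
scores = {k: silhouette(depths, kmeans(depths, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

The same loop could be repeated with other clustering algorithms and other scores. Note that the direction of "best" depends on the score: a higher silhouette is better, whereas for a Davies-Bouldin-type index a lower value is better.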

DOI

https://doi.org/10.31223/osf.io/ycw3v

Subjects

Civil and Environmental Engineering, Civil Engineering, Engineering, Environmental Engineering

Keywords

Clustering analysis, Cluster number, Data features, SWMM modeling, Unsupervised Machine Learning

Dates

Published: 2020-04-30 08:07

License

GNU Lesser General Public License (LGPL) 2.1

Metrics

Views: 1099

Downloads: 686