This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Abstract
As sensor measurements emerge in urban water systems, data-driven unsupervised machine learning algorithms have recently drawn tremendous interest for infrastructure monitoring, flow prediction, and pollutant warning. However, most of them have been applied to water distribution systems, and few studies have used unsupervised clustering analysis to group time-series hydraulic-hydrologic data in urban drainage systems. To improve the understanding of how clustering analysis contributes to detecting urban flooding events, this study compared the performance of K-means Clustering, Agglomerative Clustering, and Spectral Clustering in uncovering similarity in time-series water depth and identified the number of clusters with the maximum performance scores. The water depth datasets were simulated with a real-world SWMM model and then formatted for a clustering problem. Three standard performance evaluation scores, the Silhouette Coefficient Index (SCI), the Calinski-Harabasz Index (CHI), and the Davies-Bouldin Index (DBI), were employed to assess the clustering performance under six artificial rainfalls and two recorded storms. The results indicate that SCI and DBI are appropriate for assessing the performance of K-means Clustering and Agglomerative Clustering, while CHI only works for Spectral Clustering. Notably, the number of clusters was found to be negatively related to the dataset length but less correlated with the dataset magnitude.
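The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn, assumes that SCI, CHI, and DBI correspond to `silhouette_score`, `calinski_harabasz_score`, and `davies_bouldin_score`, and assumes that each row of the input matrix is the water depth time series of one node. The function names `make_synthetic_depths` and `score_clusterings` are hypothetical, and the synthetic depth curves only stand in for SWMM output.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, SpectralClustering
from sklearn.metrics import (
    silhouette_score,
    calinski_harabasz_score,
    davies_bouldin_score,
)


def make_synthetic_depths(n_nodes=60, n_steps=48, seed=0):
    """Hypothetical stand-in for SWMM-simulated water depths.

    Each row is one node's depth time series. Nodes fall into three
    groups that differ only in the timing of the depth peak.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_steps)
    peak_times = [0.25, 0.50, 0.75]
    rows = []
    for i in range(n_nodes):
        peak = peak_times[i % 3]
        curve = np.exp(-((t - peak) ** 2) / 0.01)  # bell-shaped depth pulse
        rows.append(curve + rng.normal(0.0, 0.05, n_steps))  # sensor-like noise
    return np.array(rows)


def score_clusterings(X, k_values=range(2, 7)):
    """Fit three clustering algorithms for each k and record three scores."""
    results = []
    for k in k_values:
        models = {
            "kmeans": KMeans(n_clusters=k, n_init=10, random_state=0),
            "agglomerative": AgglomerativeClustering(n_clusters=k),
            "spectral": SpectralClustering(n_clusters=k, random_state=0),
        }
        for name, model in models.items():
            labels = model.fit_predict(X)
            if len(set(labels)) < 2:
                continue  # the scores are undefined for a single cluster
            results.append(
                {
                    "algorithm": name,
                    "k": k,
                    "silhouette": silhouette_score(X, labels),  # higher is better
                    "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
                    "davies_bouldin": davies_bouldin_score(X, labels),  # lower is better
                }
            )
    return results


if __name__ == "__main__":
    X = make_synthetic_depths()
    for row in score_clusterings(X):
        print(row)
```

Note that the silhouette and Calinski-Harabasz scores improve as they increase, whereas a lower Davies-Bouldin score indicates better separation, so the preferred number of clusters has to be read off each score in its own direction.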
DOI
https://doi.org/10.31223/osf.io/ycw3v
Subjects
Civil and Environmental Engineering, Civil Engineering, Engineering, Environmental Engineering
Keywords
Clustering analysis, Cluster number, Data features, SWMM modeling, Unsupervised Machine Learning
Dates
Published: 2020-04-30 09:07