A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.1038/s41597-023-01929-2. This is version 1 of this Preprint.

Authors

Daniel David Buscombe, Phillipe Wernette, Sharon Nicole Fitzpatrick, Jaycee Favela, Evan B Goldstein, Nicholas Enwright

Abstract

The world’s coastlines are spatially highly variable, coupled human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics requires frequent observation from imaging sensors on remote sensing platforms. Machine Learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe “Coast Train,” a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. The data include imagery that is diverse in space and time and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool designed specifically for rapid and reproducible Earth surface image segmentation. Our approach permits image annotation by multiple labelers, which in turn enables quantification of pixel-level agreement for individual images and for collections of images.
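As an illustration of what pixel-level agreement between labelers can mean in practice, the following is a minimal sketch and not the authors' method. It assumes that each labeler's annotation of an image is available as a 2D integer array of class indices with identical shape. The function names `pixel_agreement` and `per_class_iou`, the three-class toy arrays, and the choice of intersection-over-union as a per-class measure are all assumptions made for this example.

```python
import numpy as np


def pixel_agreement(label_a, label_b):
    """Fraction of pixels that two labelers assigned to the same class."""
    a = np.asarray(label_a)
    b = np.asarray(label_b)
    if a.shape != b.shape:
        raise ValueError("label images must have the same shape")
    return float(np.mean(a == b))


def per_class_iou(label_a, label_b, num_classes):
    """Intersection-over-union per class between two label images.

    Classes that neither labeler used are omitted from the result.
    """
    a = np.asarray(label_a)
    b = np.asarray(label_b)
    if a.shape != b.shape:
        raise ValueError("label images must have the same shape")
    ious = {}
    for c in range(num_classes):
        in_a = a == c
        in_b = b == c
        union = np.logical_or(in_a, in_b).sum()
        if union == 0:
            continue  # class absent from both annotations
        ious[c] = float(np.logical_and(in_a, in_b).sum() / union)
    return ious


# Hypothetical 3x3 annotations of the same image by two labelers
# (0, 1, 2 are placeholder class indices, not Coast Train classes).
labeler_1 = np.array([[0, 0, 1],
                      [0, 1, 1],
                      [2, 2, 1]])
labeler_2 = np.array([[0, 0, 1],
                      [0, 0, 1],
                      [2, 2, 2]])

overall = pixel_agreement(labeler_1, labeler_2)      # 7 of 9 pixels match
by_class = per_class_iou(labeler_1, labeler_2, 3)    # {0: 0.75, 1: 0.5, 2: 0.666...}
print(f"overall agreement: {overall:.3f}")
print("per-class IoU:", by_class)
```

For a collection of images, the same functions could be applied per image and the results averaged, or the matching-pixel counts could be pooled across images before dividing, depending on whether each image or each pixel should carry equal weight.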

DOI

https://doi.org/10.31223/X5Z06C

Subjects

Physical Sciences and Mathematics

Keywords

labeled data, human-in-the-loop machine learning, Image Segmentation, coastal landforms, shoreline environments

Dates

Published: 2022-09-01 14:40

License

CC BY Attribution 4.0 International

Additional Metadata

Conflict of interest statement:
None

Data Availability:
https://doi.org/10.5066/P91NP87I