A benchmark deep learning dataset for the classification of supraglacial lake drainage mechanism across the central-west Greenland Ice Sheet

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Joshua Harlan Rines, Ching-Yao Lai, Ellianna Abrahams, Michael G. Shahin, Niall Coffey, Eojin Lee, Laura A. Stevens

Abstract

Supraglacial lakes on the Greenland Ice Sheet drain through physically distinct pathways: hydrofracture, moulins, lateral stream routing, and crevasse fields. Each drainage mechanism carries distinct implications for ice sheet dynamics. Existing automated classifications reduce each lake's drainage behavior to a time series of scalar values representing the observed water surface area, and classify each lake by drainage rate (e.g., rapid vs. slow). This scalar reduction conflates physically different drainage mechanisms, which can be distinguished only by tracking each lake's full spatio-temporal evolution. Here we introduce a human-benchmarked, machine-learning-ready dataset that pairs full Sentinel-2 multispectral satellite imagery time series with human-expert labels for N=1679 supraglacial lakes in the central-west basin of the Greenland Ice Sheet during the 2018 (n=679) and 2019 (n=1000) melt seasons. The dataset is formatted as per-lake CF-1.8 NetCDF files, each containing: six Sentinel-2 reflectance bands at 10-meter spatial resolution and daily cadence over the 153-day melt season (1 May to 30 September); a per-pixel binary cloud mask; co-registered lake water masks (both static and dynamic); and the human-assigned drainage classification labels. We accompany the dataset with a baseline deep learning classifier, demonstrating the utility of the dataset both in deep learning workflows and in extending lake drainage classification from a rate-based to a mechanism-based scheme. The dataset is released through the Stanford Digital Repository under a CC BY 4.0 license, and the accompanying open-source sat-tile-stack preprocessing software is released under an MIT license.
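
The figures quoted in the abstract can be checked with a few lines of Python, and the point about scalar reduction can be illustrated with a toy case. The sketch below is not taken from the dataset or from sat-tile-stack: the function names and the two 4x4 masks are hypothetical, synthetic illustrations. It rebuilds the daily time axis for one melt season, sums the per-season lake counts, and shows two water masks that differ in shape yet reduce to the same surface area.

```python
from datetime import date, timedelta

# Pixel footprint implied by the 10-meter resolution stated in the abstract.
PIXEL_AREA_M2 = 10 * 10


def melt_season_days(year):
    """Return daily dates from 1 May to 30 September, inclusive."""
    start = date(year, 5, 1)
    end = date(year, 9, 30)
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]


def water_area_m2(mask):
    """Scalar reduction: water-pixel count times pixel area.

    `mask` is a list of rows of 0/1 values (hypothetical layout).
    """
    return sum(sum(row) for row in mask) * PIXEL_AREA_M2


# Lake counts per melt season, as reported in the abstract.
lakes_per_year = {2018: 679, 2019: 1000}
total_lakes = sum(lakes_per_year.values())

# Synthetic toy masks (NOT from the dataset): a compact lake and an
# elongated channel-like feature covering the same number of pixels.
mask_compact = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
mask_elongated = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

season_2018 = melt_season_days(2018)
same_area = water_area_m2(mask_compact) == water_area_m2(mask_elongated)
same_shape = mask_compact == mask_elongated

print(len(season_2018), "days in the 2018 melt season")
print(total_lakes, "lakes across both seasons")
print("same area:", same_area, "| same shape:", same_shape)
```

The two toy masks are indistinguishable once reduced to area, which is the information loss the abstract attributes to rate-based classification; the per-lake image time series retains the spatial pattern.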

DOI

https://doi.org/10.31223/X58N26

Subjects

Artificial Intelligence and Robotics, Earth Sciences, Glaciology, Numerical Analysis and Scientific Computing, Physical Sciences and Mathematics

Keywords

Greenland, Supraglacial lakes, Machine learning

Dates

Published: 2026-05-29 15:03

Last Updated: 2026-05-29 15:03

License

CC BY Attribution 4.0 International

Additional Metadata

Conflict of interest statement:
The authors declare no conflict of interest.

Data Availability:
https://doi.org/10.25740/sf350xp4038
