Pangeo Forge: Crowdsourcing Analysis-Ready, Cloud Optimized Data Production

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Charles Stern, Ryan Abernathey, Joseph J Hamman, Rachel Wegener, Chiara Lepore, Sean Harkins

Abstract

Pangeo Forge is a new community-driven platform that accelerates science by providing high-level recipe frameworks alongside cloud compute infrastructure for extracting data from provider archives, transforming it into analysis-ready, cloud-optimized (ARCO) data stores, and providing a human- and machine-readable catalog for browsing and loading. In abstracting the scientific domain logic of data recipes from cloud infrastructure concerns, Pangeo Forge aims to open a door for a broader community of scientists to participate in ARCO data production. A wholly open-source platform composed of multiple modular components, Pangeo Forge presents a foundation for the practice of reproducible, cloud-native, big-data ocean, weather, and climate science without relying on proprietary or cloud-vendor-specific tooling.

DOI

https://doi.org/10.31223/X5462G

Subjects

Earth Sciences, Environmental Sciences, Oceanography and Atmospheric Sciences and Meteorology

Dates

Published: 2021-10-01 05:52

License

CC BY Attribution 4.0 International

Additional Metadata

Data Availability (Reason not available):
There is no data associated with the manuscript.