Data Cubes for Earth System Research: Challenges Ahead

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.



David Montero Loaiza, Guido Kraemer, Anca Anghelea, Cesar Luis Aybar Camacho, Gunnar Brandt, Gustau Camps-Valls, Felix Cremer, Ida Flik, Fabian Gans, Sarah Habershon, Chaonan Ji, Teja Kattenborn, Laura Martínez-Ferrer, Francesco Martinuzzi, Martin Reinhardt, Maximilian Söchting, Khalil Teber, Miguel Mahecha


Progress in Earth system science is accelerating rapidly, driven by the increasing availability of multivariate datasets, often global, with moderate to high spatio-temporal resolution. Turning these data into knowledge presents interoperability, technical, analytical, and other challenges. Earth System Data Cubes (ESDCs) have emerged as essential tools, offering analysis-ready, cloud-optimised multivariate solutions. Coupled with advances in Artificial Intelligence (AI), these solutions have the potential to release a wealth of information from the vast amounts of data they contain. Applying AI methods to ESDCs promises to unpick the complexities of the Earth system by learning its underlying non-linearities and forecasting its spatio-temporal behaviour. However, naive applications of such methods might lead to wrong conclusions and predictions. In this perspective paper, we discuss the methodological and conceptual challenges that AI applications to ESDCs bring. A particular risk lies in applications that ignore intrinsic properties of the Earth system, such as spatio-temporal auto-correlation, and may therefore deliver apparently highly accurate but flawed predictions. Other applications may ignore known causal structures of Earth system dynamics. There are also technical challenges, such as devising adequate sampling strategies for ESDCs. Furthermore, documenting data cube provenance is essential to ensure end-to-end reproducible workflows. Effective visualisation tools are required to enable users to quickly navigate terabytes of data and develop an intuition for the spatio-temporal dynamics encoded in these cubes. Against this background, we aim to synthesise the main challenges and derive an agenda for advancing data science on data cubes to better understand global Earth system processes.



Subjects: Computer Sciences, Earth Sciences, Environmental Sciences


Keywords: Earth System Science, Artificial Intelligence, Data Cubes


Published: 2023-07-11 14:24

Last Updated: 2023-07-11 21:24


CC BY Attribution 4.0 International