EGMSpy: an open-source Python toolkit for scalable data handling, classification, clustering, and visualisation of Copernicus EGMS InSAR data

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.31223/X55B59. This is version 2 of this Preprint.

Comments

Comment #290 Filippo Catani @ 2026-05-09 11:38

Notice on code availability: as of today (May 9, 2026), the code is available under an MIT license at the address reported in the abstract. Please disregard the link provided in the section "Data and code availability", as the entire toolset is being transferred to the repository: https://github.com/fcatani/EGMSpy

The incorrect information in the paper is being corrected, and the correction will be part of the next version of the paper, which is under preparation.

Sorry about that.

Authors

Filippo Catani, Mario Floris, Lorenzo Nava, Caterina Palmieri, Rebecca Todde

Abstract

The Copernicus European Ground Motion Service (EGMS) provides millimetre-accuracy line-of-sight displacement measurements for over five billion coherent scatterers across Europe, derived from Sentinel-1 SAR interferometry over the period 2015–2023. Despite the unprecedented spatial coverage and measurement density of this dataset, no open-source integrated toolchain exists for processing, classifying, clustering, and interactively visualising EGMS data at the regional to continental scale without resorting to subsampling or proprietary software. In this paper, we introduce EGMSpy, a fully open-source Python pipeline that ingests raw EGMS L2b CSV tiles, compresses them into a federated split-GeoParquet database (metadata and time series stored separately, linked by a unique point identifier), applies a physics-informed hybrid rule/GMM time series classifier to assign each point to a geophysical deformation class, groups spatially and dynamically coherent deforming areas using a velocity-weighted three-dimensional DBSCAN algorithm, and serves the entire dataset to an interactive Leaflet.js web viewer via a Flask/DuckDB REST bridge capable of sub-100 ms viewport queries over hundreds of millions of points. The pipeline is demonstrated on the northern Italy EGMS L2b dataset (2019–2023), comprising 433 tile pairs (approx. 300 million coherent scatterers, 104 GB on disk), processed on a single workstation. Internal quality control confirms that the toolkit's OLS-derived velocity features are consistent with the official EGMS pre-computed velocities (RMSE = 0.096 mm/yr, bias = -0.001 mm/yr, r = 0.9993, n = 99,751). All code is released under the MIT licence and is available at https://github.com/fcatani/EGMSpy (repository in beta v1.0).
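
The quality-control figures quoted in the abstract (RMSE, bias, r) compare OLS-derived velocities against reference velocities. The sketch below illustrates that kind of check in plain Python on synthetic data. It is not taken from the EGMSpy code base: the function names `ols_velocity` and `agreement`, the use of decimal years as the time unit, the 6-day sampling, and the sign convention for bias (estimated minus reference) are all assumptions made for illustration.

```python
import math


def ols_velocity(times_yr, disp_mm):
    """Slope (mm/yr) of an ordinary least-squares line through a displacement series.

    Hypothetical helper: times are assumed to be decimal years,
    displacements are assumed to be in millimetres.
    """
    n = len(times_yr)
    t_mean = sum(times_yr) / n
    d_mean = sum(disp_mm) / n
    sxx = sum((t - t_mean) ** 2 for t in times_yr)
    sxy = sum((t - t_mean) * (d - d_mean) for t, d in zip(times_yr, disp_mm))
    return sxy / sxx


def agreement(estimated, reference):
    """Return (RMSE, bias, Pearson r) between two lists of velocities.

    Bias is assumed here to mean mean(estimated - reference).
    """
    n = len(estimated)
    diffs = [e - r for e, r in zip(estimated, reference)]
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    bias = sum(diffs) / n
    e_mean = sum(estimated) / n
    r_mean = sum(reference) / n
    cov = sum((e - e_mean) * (r - r_mean) for e, r in zip(estimated, reference))
    var_e = sum((e - e_mean) ** 2 for e in estimated)
    var_r = sum((r - r_mean) ** 2 for r in reference)
    return rmse, bias, cov / math.sqrt(var_e * var_r)


# Synthetic, noise-free example: three points sampled every 6 days for about a year.
times = [i * 6 / 365.25 for i in range(61)]
true_velocities = [-5.0, 0.0, 3.0]  # mm/yr, made-up values
offset_mm = 1.5  # constant offset; it should not affect the fitted slope
series = [[offset_mm + v * t for t in times] for v in true_velocities]

estimated = [ols_velocity(times, s) for s in series]
rmse, bias, r = agreement(estimated, true_velocities)
```

On noise-free linear series like these, the fitted slopes recover the input velocities up to floating-point error, so the RMSE and bias are close to zero and r is close to one. With real time series, the same three statistics summarise how closely a recomputed velocity tracks a reference product.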

DOI

https://doi.org/10.31223/X55B59

Subjects

Civil and Environmental Engineering, Earth Sciences, Environmental Monitoring, Geomorphology

Keywords

InSAR monitoring, EGMS, Big data analysis, Ground displacements, Natural hazards

Dates

Published: 2026-05-08 08:50

Last Updated: 2026-05-11 20:16

License

CC-BY Attribution-NonCommercial-ShareAlike 4.0 International

Additional Metadata

Conflict of interest statement:
No conflict of interest

Data Availability:
Raw data are available from the EGMS portal, subject to the restrictions and licences of the European Environment Agency and associated licensing bodies. The full code is available under the MIT licence at: https://github.com/fcatani/EGMSpy.
