This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Can AlphaEarth Foundations Redefine the Paradigm of Gridded Population Mapping? A Systematic Evaluation across 18 Global Cities and Large-Scale Mapping Applications
Authors
Abstract
Population mapping typically relies on census data and its update cycles, and is further constrained by manual feature engineering and limited cross-regional generalization. AlphaEarth Foundations (AEF) provides globally consistent, analysis-ready 64-dimensional annual surface embeddings, offering a new data foundation for reducing dependence on frequent census updates and enabling more transferable, data-driven mapping approaches. However, its suitability for modeling population distribution—a socio-economic variable that is not directly observable—remains insufficiently evaluated. This study investigates 18 cities distributed across six continents, integrating AEF data from 2017–2024 with WorldPop 100 m population grids from 2017–2020. We evaluate the performance of linear models, random forests, and deep learning models under four training strategies, four spatial scales, and four categories of auxiliary variables. Results show that AEF embeddings achieve strong explanatory power under random-split validation, with median R² values ranging from 0.82 to 0.96. However, spatial block cross-validation reveals a pronounced dependence on spatial autocorrelation, with R² decreasing by 0.46–0.67. Compared to cross-city training, single-city multi-year training yields more stable spatial generalization, while cross-city transfer is constrained by distributional shifts in AEF features. Among auxiliary variables, spatial structure factors provide the most substantial improvements in spatial generalization, whereas high-dimensional points-of-interest (POI) features may introduce redundancy and degrade performance in deep learning models. PCA and SHAP analyses further indicate that AEF dimensions exhibit a backbone–long-tail contribution pattern, where a small number of dominant dimensions coexist with many weak contributors; more complete utilization of the full 64-dimensional embedding is associated with improved spatial generalization. 
Based on the best-performing combination of strategies, we generate annual population maps for 2021–2024 at 1 km resolution globally and at 100 m resolution for the 18 cities. These results indicate that AEF-driven approaches can fill temporal gaps in WorldPop and support updates to high-resolution population distribution data.
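The contrast between random-split and spatial block validation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `block_id` and `spatial_block_folds`, the square block layout, the block size, and the round-robin assignment of blocks to folds are all assumptions made for illustration. The idea is only that every grid cell in a held-out block is excluded from training, so a model cannot score well simply by interpolating between neighbouring cells.

```python
from collections import defaultdict


def block_id(coord, block_size):
    """Map an (x, y) coordinate to the square block that contains it."""
    x, y = coord
    return (int(x // block_size), int(y // block_size))


def spatial_block_folds(coords, block_size, n_folds):
    """Yield (train_idx, test_idx) pairs that hold out whole spatial blocks.

    coords: list of (x, y) cell centres, in the same units as block_size.
    Block size and round-robin fold assignment are illustrative assumptions.
    """
    blocks = defaultdict(list)
    for i, coord in enumerate(coords):
        blocks[block_id(coord, block_size)].append(i)

    # Deterministic round-robin assignment of blocks to folds.
    fold_of_block = {key: n % n_folds for n, key in enumerate(sorted(blocks))}

    for fold in range(n_folds):
        train, test = [], []
        for key, members in blocks.items():
            (test if fold_of_block[key] == fold else train).extend(members)
        yield sorted(train), sorted(test)


# Hypothetical 4 x 4 grid of 100 m cells, grouped into 200 m blocks.
coords = [(c * 100 + 50, r * 100 + 50) for r in range(4) for c in range(4)]
folds = list(spatial_block_folds(coords, block_size=200, n_folds=2))
```

In a random split, neighbouring cells from the same block would typically land on both sides of the train/test boundary. Here each block appears on only one side, which is the property that exposes dependence on spatial autocorrelation.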
DOI
https://doi.org/10.31223/X52B40
Subjects
Computational Engineering
Keywords
Population mapping, AlphaEarth Foundations, WorldPop, global mapping, multi-source data fusion modeling
Dates
Published: 2026-05-14 13:53
Last Updated: 2026-05-14 13:53
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
None