AEF-Econ: Toward Plug-and-Play Socioeconomic Foundation Embeddings from AlphaEarth for Urban Remote Sensing

Shuyang Hou; Ziqi Liu; Haoyue Jiao; Lutong Xie; Yaxian Qing; Qingyang Xu; Zhangyan Xu; Xuefeng Guan; Huayi Wu

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Abstract

AlphaEarth Foundations (AEF) unify global remote sensing foundation embeddings through multimodal self-supervised learning, but their pretraining focuses on physical land-surface signals, limiting plug-and-play use in socioeconomic tasks. We integrate seven heterogeneous data streams across 36 Chinese cities over eight years—AEF embeddings, population, nighttime lights, remote sensing indices, points of interest (POIs), urban morphology, and cross-lingual text—and construct CHN-Econ, a socioeconomic benchmark with 16 labels in three categories. We conduct 31 controlled experiments along five axes: fusion architecture, self-supervised objective, text integration, embedding dimensionality, and normalization. Used alone as a linear probe, AEF achieves R² values of only 0.301 for cross-region and 0.160 for cross-tier evaluation. The five-axis ablated backbone improves these scores to 0.832 and 0.671, respectively, but reveals that low-dimensional semantic streams are consistently suppressed by high-dimensional streams under shared reconstruction. To address this bottleneck, we propose Capacity-Adaptive Reconstruction (CAR), replacing shared reconstruction with per-stream decoders and stream-level losses to mitigate inter-stream capacity competition. CAR further raises cross-region and cross-tier R² to 0.848 and 0.693, and restores collapsed labels from negative R² to a stable range. Using CAR, we infer 14.4 million pixels across 36 cities and eight years and release AEF-Econ, including 128d and 64d compressed versions. Self-diagnostics and case studies show that AEF-Econ captures cross-city hierarchies and intra-urban spatial organization under unsupervised settings, providing a socioeconomic remote sensing foundation embedding complementary to AEF physical embeddings.
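The capacity competition that motivates CAR can be illustrated with a small numerical sketch. It is not the authors' implementation: the stream names, the widths (64 and 1), and the equal per-stream weighting are assumptions chosen for illustration, and the abstract does not specify the exact form of the CAR loss. The sketch compares a shared reconstruction loss, computed over the concatenation of all streams, with a stream-level loss that averages one error term per stream.

```python
import numpy as np


def shared_reconstruction_loss(targets, recons):
    """MSE over the concatenation of all streams.

    Every dimension counts equally, so a wide stream contributes
    most of the terms and dominates the loss.
    """
    t = np.concatenate([targets[k] for k in targets], axis=1)
    r = np.concatenate([recons[k] for k in targets], axis=1)
    return float(np.mean((t - r) ** 2))


def stream_level_loss(targets, recons):
    """Mean of per-stream MSEs (assumed equal weights, for illustration).

    Each stream contributes one term regardless of its width.
    """
    per_stream = {
        k: float(np.mean((targets[k] - recons[k]) ** 2)) for k in targets
    }
    return float(np.mean(list(per_stream.values()))), per_stream


rng = np.random.default_rng(0)
n = 8  # hypothetical number of samples

# Hypothetical streams: one wide (64-d) and one narrow (1-d).
targets = {
    "aef": rng.normal(size=(n, 64)),
    "population": rng.normal(size=(n, 1)),
}

# A degenerate reconstruction: perfect on the wide stream,
# all zeros on the narrow one.
recons = {
    "aef": targets["aef"].copy(),
    "population": np.zeros((n, 1)),
}

shared = shared_reconstruction_loss(targets, recons)
balanced, per_stream = stream_level_loss(targets, recons)

print(f"shared loss:       {shared:.4f}")
print(f"stream-level loss: {balanced:.4f}")
print(f"per-stream errors: {per_stream}")
```

In this toy setup the shared loss divides the narrow stream's error by 65 (the total width), so a model can ignore that stream almost for free. The stream-level loss divides it by 2 (the number of streams), which keeps the failure visible to the optimizer. Per-stream decoders, as described in the abstract, would additionally give each stream its own reconstruction parameters.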

DOI

https://doi.org/10.31223/X51F6G

Subjects

Computational Engineering, Computer Sciences, Earth Sciences, Electrical and Computer Engineering, Engineering, Environmental Sciences, Geography, Human Geography, Nature and Society Relations, Remote Sensing, Social and Behavioral Sciences

Keywords

Remote Sensing Foundation Embedding, Multi-Source Self-Supervised Learning, Capacity-Adaptive Reconstruction, Cross-Domain Generalization, Socioeconomic Remote Sensing, AlphaEarth Foundations

Dates

Published: 2026-06-19 15:03

Last Updated: 2026-06-19 15:03

License

Creative Commons Attribution 4.0 International (CC BY 4.0)
