Hybrid Machine Learning for Integrating Pedological Knowledge into Digital Soil Mapping to Advance Next-Generation Earth System Models

This is a Preprint and has not been peer reviewed. This is version 3 of this Preprint.

Authors

Rodrigo Miranda, Rodolfo L. B. Nobrega, Estevão Silva, Jadson Silva, José Araújo Filho, Magna Moura, Alexandre Barros, Alzira Souza, Anne Verhoef, Wanhong Yang, Hui Shao, Raghavan Srinivasan, Feras Ziadat, Suzana Montenegro, Maria Araújo, Josiclêda Galvíncio

Abstract

Land surface and Earth System models require reliable soil maps to represent how the spatial variability of soil properties influences ecosystem fluxes and storages. However, mapping soils with conventional in situ survey protocols is time-consuming and costly. We addressed the outdated spatial information on soil physico-chemical properties for a tropical region in NE Brazil (~98,000 km²) that spans a ~700-km longitudinal gradient of contrasting topography, climate, and vegetation, by developing a novel hybrid machine learning framework and applying it to this region. The framework reduces prediction redundancies caused by high multicollinearity by using a recursive feature selection algorithm to choose its inputs. Its core consists of the Soil-Landscape Estimation and Evaluation Program (SLEEP) and a calibrated Gradient Boosting Model (GBM) capable of modeling the spatial distribution of soil properties at multiple, dynamic soil depths. Using SLEEP and GBM together allowed us to explain the spatial distribution of various soil properties and their environmental modulators. Model training and testing used six topographical, ten meteorological, and two vegetation properties, together with data from 223 soil profiles across the study area. Our models performed consistently, with spatial extrapolations yielding r² values of 0.79 to 0.98 and percent bias of -1.39% to 1.14%. Properties related to topography and climate were the dominant predictors of the number of soil layers, soil texture, and the sum of bases. Our framework is highly flexible and transferable to other tropical regions, while reducing capital investment and increasing accuracy compared with traditional mapping protocols.
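The two performance metrics quoted in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes that r² is the squared Pearson correlation between observed and predicted values and that percent bias is 100 × Σ(predicted − observed) / Σ(observed); sign conventions for percent bias vary between authors. The function names and the sample values are hypothetical and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance and variance sums (unnormalized; the 1/n factors cancel in the ratio)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def percent_bias(observed, predicted):
    """Percent bias, assumed here as 100 * sum(predicted - observed) / sum(observed)."""
    total_error = sum(p - o for o, p in zip(observed, predicted))
    return 100.0 * total_error / sum(observed)


# Hypothetical clay contents (%) at four test profiles, invented for illustration only
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"r2 = {r_squared(observed, predicted):.3f}")
print(f"percent bias = {percent_bias(observed, predicted):.2f}%")
```

Under this convention a positive percent bias means the model overestimates on average, so a range straddling zero, such as the one reported, indicates no systematic over- or underestimation across the mapped properties.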

DOI

https://doi.org/10.31223/X57P9W

Subjects

Environmental Monitoring, Soil Science, Statistical Models

Keywords

Gradient Boosting Model, Decision trees, SLEEP, Soil properties, Tropics, Pernambuco

Dates

Published: 2022-07-22 16:15

Last Updated: 2023-04-12 01:42

License

CC BY Attribution 4.0 International

Additional Metadata

Conflict of interest statement:
None.

Data Availability:
https://zenodo.org/record/5918544