Machine learning for digital soil mapping: applications, challenges and suggested solutions

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: http://doi.org/10.1016/j.earscirev.2020.103359. This is version 1 of this Preprint.

Authors

Alexandre M.J.-C. Wadoux, Budiman Minasny, Alex McBratney

Abstract

The uptake of machine learning (ML) algorithms in digital soil mapping (DSM) is transforming the way soil scientists produce their maps. Machine learning is currently applied to mapping soil properties or classes in much the same way as in other, unrelated fields of science. Mapping of soil, however, has unique aspects that require adaptations of the ML algorithms. These include, among others, the inclusion of pedological knowledge in the ML algorithm, accounting for the spatial structure present in soil data, and the desire to increase our scientific understanding of the distribution and genesis of soil from a calibrated ML model. Tackling these challenges is critical for machine learning to gain credibility and scientific consistency in soil science. In this article, we review the current applications of machine learning in digital soil mapping and suggest improvements. We found a growing interest in the use of ML in DSM. Most studies focus on obtaining accurate maps and disregard the characteristics of soil data, such as spatial autocorrelation. Only a few studies account for existing soil knowledge or quantify the uncertainty of the predicted maps. We then discuss the challenges related to the application of ML for soil mapping and offer solutions from existing studies in the natural sciences. The challenges are organized as follows: sampling, resampling, accounting for spatial information, multivariate mapping, uncertainty analysis, validation, integration of pedological knowledge, and interpretation of the models. We conclude that, for future developments, machine learning should incorporate three core elements: plausibility, interpretability, and explainability, which will encourage soil scientists to move beyond model prediction and towards explanation of soil processes.

DOI

https://doi.org/10.31223/osf.io/8eq6s

Subjects

Agriculture, Life Sciences, Other Life Sciences

Keywords

geostatistics, Random Forest, data mining, pedometrics, soil science, spatial data

Dates

Published: 2020-02-06 14:38

License

CC BY Attribution 4.0 International