This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: http://doi.org/10.1016/j.earscirev.2020.103359. This is version 1 of this Preprint.
Authors
Abstract
The uptake of machine learning (ML) algorithms in digital soil mapping (DSM) is transforming the way soil scientists produce their maps. Machine learning is currently applied to mapping soil properties or classes in much the same way as in other, unrelated fields of science. Mapping of soil, however, has unique aspects that require adaptations of the ML algorithms. These aspects include, but are not limited to, the inclusion of pedological knowledge in the ML algorithm, accounting for the spatial structure present in soil data, and the desire to increase our scientific understanding of the distribution and genesis of soil from a calibrated ML model. Tackling these challenges is critical for machine learning to gain credibility and scientific consistency in soil science. In this article, we review the current applications of machine learning in digital soil mapping and suggest improvements. We found a growing interest in the use of ML in DSM. Most studies focus on obtaining accurate maps and disregard the characteristics of soil data, such as spatial autocorrelation. Only a few studies account for existing soil knowledge or quantify the uncertainty of the predicted maps. We then discuss the challenges related to the application of ML for soil mapping and offer solutions from existing studies in the natural sciences. The challenges are organized as follows: sampling, resampling, accounting for spatial information, multivariate mapping, uncertainty analysis, validation, integration of pedological knowledge, and interpretation of the models. We conclude that, for future developments, machine learning should incorporate three core elements: plausibility, interpretability, and explainability, which will encourage soil scientists to move beyond model prediction and towards explanation of soil processes.
DOI
https://doi.org/10.31223/osf.io/8eq6s
Subjects
Agriculture, Life Sciences, Other Life Sciences
Keywords
geostatistics, Random Forest, data mining, pedometrics, soil science, spatial data
Dates
Published: 2020-02-06 06:38