This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Rapid Estimation of Soil Profile Arsenic Content and Identification of Substitute Indicators Using Random Forest: A Case Study of Nenjiang City, China
Authors
Abstract
Direct measurement of soil arsenic (As) requires complex analytical procedures, which motivates predictive methods based on readily measurable indicators. This study analyzed 50 geochemical indicators in 204 soil samples collected from 69 sampling sites across a 0–500 cm depth profile in Nenjiang County, Heilongjiang Province. A random forest (RF) algorithm was used to build a surrogate model that estimates As content from routine indicators. Model performance was evaluated with an independent test set, five-fold cross-validation, spatial block cross-validation, and vertical profile comparison plots. The results showed that: (1) As was enriched in the surface layer and fluctuated with depth, exhibiting vertical differentiation; (2) the RF model achieved a test-set R² of 0.68 and an RMSE of 2.21 μg/g, with a cross-validation R² of 0.66 ± 0.05, indicating a stable model; (3) feature importance analysis identified antimony (Sb) as the best substitute indicator (importance 0.669), followed by N, Pb, Mo, and TFe₂O₃, while depth contributed negligibly; and (4) comparison plots for typical profiles showed that estimated and measured As followed consistent trends. By relying on easily measurable indicators such as Sb, the model provides a reliable surrogate approach for estimating As content in soil profiles, supporting rapid contamination screening and the understanding of geochemical processes.
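The workflow summarized in the abstract (RF surrogate model, held-out test set, five-fold cross-validation, spatially grouped cross-validation, and feature importance ranking) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn and NumPy, runs on synthetic stand-in data rather than the study's measurements, and uses a hypothetical six-feature subset, hypothetical hyperparameters, and grouping by site as a simple proxy for spatial blocks.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import (GroupKFold, KFold, cross_val_score,
                                     train_test_split)

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's measurements):
# 204 samples drawn from 69 hypothetical sites.
n_samples, n_sites = 204, 69
site_id = rng.integers(0, n_sites, size=n_samples)
features = ["Sb", "N", "Pb", "Mo", "TFe2O3", "depth_cm"]  # assumed subset
X = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, len(features)))
X[:, 5] = rng.uniform(0, 500, size=n_samples)  # depth within 0-500 cm

# Hypothetical relationship chosen for illustration: As driven mainly by Sb.
y = 8.0 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(0.0, 1.0, size=n_samples)


def make_model():
    # Hyperparameters are assumptions; the abstract does not report them.
    return RandomForestRegressor(n_estimators=100, random_state=0)


# 1) Independent test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
model = make_model().fit(X_train, y_train)
pred = model.predict(X_test)
test_r2 = r2_score(y_test, pred)
test_rmse = float(np.sqrt(mean_squared_error(y_test, pred)))

# 2) Five-fold cross-validation on all samples.
kfold_r2 = cross_val_score(
    make_model(), X, y, scoring="r2",
    cv=KFold(n_splits=5, shuffle=True, random_state=0))

# 3) Grouped cross-validation: all samples from one site stay in the same
#    fold. A true spatial block scheme would group sites by coordinates.
group_r2 = cross_val_score(
    make_model(), X, y, groups=site_id, scoring="r2",
    cv=GroupKFold(n_splits=5))

# 4) Impurity-based feature importances, ranked from high to low.
importances = dict(zip(features, model.feature_importances_))
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

print(f"test R2={test_r2:.2f}, RMSE={test_rmse:.2f}")
print(f"5-fold R2={kfold_r2.mean():.2f} +/- {kfold_r2.std():.2f}")
print(f"grouped R2={group_r2.mean():.2f} +/- {group_r2.std():.2f}")
print("importance ranking:", [name for name, _ in ranked])
```

Keeping each site's samples in a single fold prevents depth intervals from the same profile from appearing in both training and validation data, which would otherwise tend to inflate the cross-validation score.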
DOI
https://doi.org/10.31223/X55R42
Subjects
Environmental Sciences
Keywords
Random Forest, Soil profile, Arsenic (As), Surrogate model, Antimony (Sb), Nenjiang County
Dates
Published: 2026-07-03 22:01
Last Updated: 2026-07-03 22:01
License
CC BY Attribution 4.0 International