Rapid Estimation of Soil Profile Arsenic Content and Identification of Substitute Indicators Using Random Forest: A Case Study of Nenjiang City, China

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Youtao Xin, Zhongkai Liang, Shaowen Li, Haonan Song, Guoqiang Zhao, Yanfeng Sun

Abstract

Direct measurement of soil arsenic (As) requires complex analytical procedures, which motivates predictive methods based on readily measurable indicators. This study systematically analyzed 50 geochemical indicators in 204 soil samples collected from 69 sampling sites across a 0–500 cm depth profile in Nenjiang County, Heilongjiang Province. A random forest (RF) algorithm was used to build a surrogate model for As content with routine indicators as inputs. Model performance was validated with an independent test set, five-fold cross-validation, spatial block cross-validation, and vertical profile comparison plots. The results showed that: (1) As was enriched in the surface layer and fluctuated with depth, exhibiting vertical differentiation; (2) the RF model achieved a test-set R² of 0.68 and an RMSE of 2.21 μg/g, with a cross-validation R² of 0.66 ± 0.05, indicating stable performance across folds; (3) feature importance analysis identified antimony (Sb) as the best substitute indicator (importance 0.669), followed by N, Pb, Mo, and TFe₂O₃, while depth contributed negligibly; and (4) comparative plots of typical profiles showed high consistency between estimated and measured trends. By using easily measurable indicators such as Sb, the model provides a reliable surrogate approach for estimating As content in soil profiles, supporting rapid contamination screening and geochemical process understanding.
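
The validation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It shows how R² and RMSE are computed from measured and estimated values, and one simple way to keep all samples from a sampling site in the same cross-validation fold. The function names, the toy numbers, and the site-grouping scheme are assumptions made for illustration; the abstract does not describe how the authors defined their spatial blocks, and this is not their implementation.

```python
import math
import random


def r2_score(measured, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, estimated):
    """Root-mean-square error, in the units of the inputs (e.g. ug/g)."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)


def site_grouped_folds(site_ids, n_folds=5, seed=0):
    """Return one fold index per sample; samples sharing a site share a fold.

    Assumption: grouping by site is a simplified stand-in for spatial
    blocking, which would normally group sites by geographic block.
    """
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    fold_of_site = {site: i % n_folds for i, site in enumerate(sites)}
    return [fold_of_site[s] for s in site_ids]


# Hypothetical toy values, not data from the study.
measured = [8.0, 10.0, 12.0, 14.0]
estimated = [9.0, 10.0, 11.0, 14.0]
print("R2:", r2_score(measured, estimated))
print("RMSE:", rmse(measured, estimated))

# Hypothetical site labels: several depth samples per site.
site_ids = ["A", "A", "B", "C", "C", "C"]
print("folds:", site_grouped_folds(site_ids, n_folds=2))
```

Keeping every depth sample from a site in one fold prevents a model from being scored on samples whose neighbours in the same profile it has already seen, which is the usual motivation for reporting a grouped or spatially blocked score alongside ordinary five-fold cross-validation.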

DOI

https://doi.org/10.31223/X55R42

Subjects

Environmental Sciences

Keywords

Random Forest, Soil profile, Arsenic (As), Surrogate model, Antimony (Sb), Nenjiang County

Dates

Published: 2026-07-03 22:01

Last Updated: 2026-07-03 22:01

License

CC BY Attribution 4.0 International
