Spatial prediction of soil organic matter based on feature screening and random forests
Received: June 16, 2025
Keywords: soil organic matter; feature screening; random forest; spatial prediction
Author Name  Affiliation  E-mail
JIAO Yangqing School of Earth and Environmental Sciences, Anhui University of Science and Technology, Huainan 232001, China  
ZHANG Shiwen School of Earth and Environmental Sciences, Anhui University of Science and Technology, Huainan 232001, China shwzhang@aust.edu.cn 
YAN Fang Beijing Centre for Cultivated Land Construction and Protection, Beijing 100101, China  
WANG Shengtao Beijing Centre for Cultivated Land Construction and Protection, Beijing 100101, China  
ZHAO Baoyu School of Earth and Environmental Sciences, Anhui University of Science and Technology, Huainan 232001, China  
Abstract:
      To investigate how different environmental variables affect the spatial distribution of soil organic matter predicted by a random forest model, this study combined and optimized several types of environmental variables, with the goal of minimizing the detrimental effect of redundant feature variables on the model. Topographic factors, climate factors, vegetation factors, soil properties, and anthropogenic factors were selected and screened to build random forest prediction models of soil organic matter from different combinations of environmental variables. Spearman correlation analysis and importance analysis were used to choose the best set of environmental variables. The results showed that the random forest model with anthropogenic factors and soil properties as input variables performed better, with a root mean square error of 4.387 g·kg⁻¹ and a coefficient of determination of 0.802. Prediction accuracy was lowest, with a coefficient of determination of 0.747, when climate factors alone were used as inputs. After feature screening to eliminate redundant features, the random forest model improved to a root mean square error of 2.785 g·kg⁻¹ and a coefficient of determination of 0.911. According to the correlation and importance analyses, topography was the primary determinant of soil organic matter in the study area. The random forest model outperformed the conventional ordinary kriging, regression kriging, and geographically weighted regression kriging models in prediction accuracy. Feature screening of the environmental variables can effectively increase the accuracy of the random forest prediction model.
With only four variables (elevation, slope, soil parent material, and mean annual precipitation), the spatial prediction accuracy of soil organic matter content in the study area still exceeded 0.8.
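The screening and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: it covers only a Spearman-based filter and the two reported accuracy metrics (root mean square error and coefficient of determination), and it leaves out the random forest model and its importance ranking, which would normally come from a machine-learning library. The function names, the 0.3 correlation threshold, and the toy values are all assumptions made for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def screen_features(features, target, min_abs_rho=0.3):
    """Keep features whose |Spearman rho| with the target meets the
    (hypothetical) threshold. `features` maps name -> list of values."""
    kept = {}
    for name, column in features.items():
        rho = spearman(column, target)
        if abs(rho) >= min_abs_rho:
            kept[name] = rho
    return kept


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Toy, made-up values (not data from the study).
som = [10, 12, 15, 20, 26]                  # soil organic matter, g/kg
candidates = {
    "elevation": [100, 150, 210, 300, 420],  # rises monotonically with som
    "unrelated": [2, 4, 5, 1, 3],            # weak rank association
}
selected = screen_features(candidates, som)
```

In this toy case only `elevation` passes the filter. In a fuller workflow, the retained variables would then be passed to a random forest regressor, and `rmse` and `r_squared` would be computed on held-out samples.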