GEOGRAPHY, ENVIRONMENT, SUSTAINABILITY

Individual And Pairwise Representativeness Of Sampling Points In Interpolation Tasks Of Heavy Metals Distribution In The Topsoil

https://doi.org/10.24057/2071-9388-2025-3240

Abstract

Optimizing environmental soil monitoring through the representative selection of a training subset for an artificial neural network (ANN) remains an unresolved problem in interpolating the distribution of metals in the topsoil. Soil survey data, often used as input for ANN modeling, are collected at irregularly spaced points. The input data are usually divided at random into training and test subsets in a ratio of 70% to 30% of the points, respectively. The individual and collective representativeness of local sampling points, with respect to the element content in the soil of a given area, is rarely considered when the training subset is formed. In this work, the representativeness of sampling points proved crucial for reducing the ANN error and for strengthening the correlation between model predictions on the test subset and field measurements when the points were part of the training subset. When evaluating pairwise representativeness, we found two types of effects: synergy and anti-synergy. Synergy appeared as an increase in model accuracy when the pair entered the training subset. Anti-synergy appeared as a decrease in the informativeness of the point pair for modeling. Different sampling locations carry different amounts of information and are of unequal importance for feature interpolation. The networks of pairwise representativeness by RMSE were found to have a scale-free structure.
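
As an illustration only, the random 70/30 split and the two quality measures named in the abstract (RMSE and the correlation between predictions and measurements on the test subset) could be sketched in Python as follows. The function names, the fixed seed, and the rule in `classify_pair_effect` (comparing the test RMSE with and without a pair of points in the training subset) are assumptions made for this sketch, not the authors' published procedure.

```python
import numpy as np


def random_split(n_points, train_fraction=0.7, seed=0):
    """Randomly split point indices into training and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_points)
    n_train = int(round(train_fraction * n_points))
    return order[:n_train], order[n_train:]


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((measured - predicted) ** 2)))


def pearson_r(measured, predicted):
    """Pearson correlation between measured and predicted values."""
    return float(np.corrcoef(measured, predicted)[0, 1])


def classify_pair_effect(rmse_without_pair, rmse_with_pair, tol=1e-12):
    """Hypothetical labeling rule (an assumption of this sketch):
    a pair is 'synergy' if adding it to the training subset lowers
    the test RMSE, and 'anti-synergy' if it raises the test RMSE."""
    if rmse_with_pair < rmse_without_pair - tol:
        return "synergy"
    if rmse_with_pair > rmse_without_pair + tol:
        return "anti-synergy"
    return "neutral"
```

In such a sketch, any interpolation model (an ANN in the paper) would be trained on the points indexed by the first array returned from `random_split`, and `rmse` and `pearson_r` would be computed on the points indexed by the second.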

About the Authors

Elena M. Baglaeva
Institute of Industrial Ecology UB RAS
Russian Federation

S. Kovalevskaya str., 20, Ekaterinburg, 620990



Aleksandr P. Sergeev
Institute of Industrial Ecology UB RAS
Russian Federation

S. Kovalevskaya str., 20, Ekaterinburg, 620990



Andrey V. Shichkin
Institute of Industrial Ecology UB RAS
Russian Federation

S. Kovalevskaya str., 20, Ekaterinburg, 620990



Alexander G. Buevich
Institute of Industrial Ecology UB RAS
Russian Federation

S. Kovalevskaya str., 20, Ekaterinburg, 620990




For citations:


Baglaeva E.M., Sergeev A.P., Shichkin A.V., Buevich A.G. Individual And Pairwise Representativeness Of Sampling Points In Interpolation Tasks Of Heavy Metals Distribution In The Topsoil. GEOGRAPHY, ENVIRONMENT, SUSTAINABILITY. 2025;18(1):6-13. https://doi.org/10.24057/2071-9388-2025-3240

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2071-9388 (Print)
ISSN 2542-1565 (Online)