Transferability of machine learning models for digital soil mapping

Applicant Professor Dr. Thomas Scholten

Subject Area Soil Sciences

Term since 2021

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 448762063

Project Description

Machine learning (ML) models have been very successful at learning complex spatial patterns of soil formation and soil properties and can predict soil properties at unobserved locations. By contrast, the ability to transfer what has been learned to other areas is far less developed; so far, models can be transferred to areas outside the immediate learning environment only to a very limited extent. As with empirical regressions, the rule sets of decision-tree methods such as Random Forest apply only to the value range covered by the training data. Advances in Deep Learning (DL), e.g. Convolutional Neural Networks and Transfer Learning, and combined approaches in Feature Selection (FS) offer new ways to limit dimensionality, especially for smaller data sets, to minimize overfitting to the training data, and to improve transfer to adjacent areas. In this proposal we take up these developments and aim to predict soil properties for areas outside the learning environment as well. To this end, we use environmental factors to create an area-specific parameterization of ML models based on geomorphometric, geological, landscape-ecological and climate parameters. Which parameters these are in detail, and how they relate to one another, will be determined by combining DL and FS methods for different test data sets in Germany (humid climate) and Iran (semi-arid to arid climate). In the next step, the models trained with the covariates selected in the environmental pattern analysis and with the soil profile data are transferred to untrained areas and validated against independent soil data. To assess the transfer performance of the ML models, the untrained areas are characterized by distance and similarity metrics that describe how comparable they are to the original training areas.
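The idea of characterizing an untrained area by its comparability with the training area can be illustrated with a minimal sketch. The two measures below (the share of target samples inside the training value range, and a standardized distance between area centroids) are assumptions chosen for illustration; the project description does not name the specific metrics it uses. All function names and covariate values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def range_coverage(train, target):
    """Fraction of target samples whose covariates all lie within the training min/max range."""
    n_cov = len(train[0])
    lows = [min(row[j] for row in train) for j in range(n_cov)]
    highs = [max(row[j] for row in train) for j in range(n_cov)]
    inside = sum(
        all(lows[j] <= row[j] <= highs[j] for j in range(n_cov))
        for row in target
    )
    return inside / len(target)


def standardized_centroid_distance(train, target):
    """Euclidean distance between area centroids, each covariate scaled by its training std."""
    n_cov = len(train[0])
    total = 0.0
    for j in range(n_cov):
        col_train = [row[j] for row in train]
        col_target = [row[j] for row in target]
        sd = pstdev(col_train) or 1.0  # guard against constant covariates
        total += ((mean(col_target) - mean(col_train)) / sd) ** 2
    return sqrt(total)


# Hypothetical covariates per sample: (elevation in m, mean annual precipitation in mm)
train_area = [(300.0, 800.0), (400.0, 900.0), (500.0, 1000.0), (600.0, 1100.0)]
similar_area = [(350.0, 850.0), (550.0, 1050.0)]
dissimilar_area = [(1500.0, 200.0), (1800.0, 150.0)]

coverage_similar = range_coverage(train_area, similar_area)
coverage_dissimilar = range_coverage(train_area, dissimilar_area)
dist_similar = standardized_centroid_distance(train_area, similar_area)
dist_dissimilar = standardized_centroid_distance(train_area, dissimilar_area)
```

A low coverage value or a large centroid distance flags a target area where tree-based rules would have to be applied outside the value range they were trained on.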
Finally, training data for the unknown areas will be added step by step in order to quantify how prediction accuracy develops and to assess the transfer properties of different ML methods. Training data are LUCAS data for Germany and soil profile data from the national SPDB database for Iran. The environmental parameters are derived from satellite data, digital elevation models, world climate data, and geological and land use maps. The soil properties to be tested are soil carbon content, soil texture, carbonate content and cation exchange capacity. Twelve ML methods are compared.
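The stepwise addition of target-area training data can be sketched as a simple learning curve. The example below is an illustration under stated assumptions: it uses synthetic one-dimensional data and a 1-nearest-neighbour predictor as a stand-in, not any of the ML methods or data sets of the project. All names and values are hypothetical.

```python
from math import sqrt


def predict_1nn(train, x):
    """Predict with the value of the nearest training sample (1-nearest-neighbour)."""
    nearest = min(train, key=lambda sample: abs(sample[0] - x))
    return nearest[1]


def rmse(train, validation):
    """Root mean squared error of the 1-NN predictor on independent validation samples."""
    errors = [(predict_1nn(train, x) - y) ** 2 for x, y in validation]
    return sqrt(sum(errors) / len(errors))


# Synthetic (covariate, soil property) pairs; the target area follows a different
# relationship and lies outside the covariate range of the source area.
source_samples = [(float(x), 2.0 * x) for x in range(10)]
target_pool = [(float(x), 0.5 * x) for x in (20, 22, 24, 26, 28)]
target_validation = [(float(x), 0.5 * x) for x in (21, 23, 25, 27)]

# Add target-area samples one at a time and record the validation error.
learning_curve = []
for n_added in range(len(target_pool) + 1):
    train = source_samples + target_pool[:n_added]
    learning_curve.append((n_added, rmse(train, target_validation)))
```

Each entry of `learning_curve` pairs the number of added target samples with the error on independent target data, so the curve shows how quickly a method benefits from local data after transfer.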

DFG Programme Research Grants
