Soil properties predicted on mid-infrared (MIR) spectroscopy measurements in North-Western Kurdistan region, Iraq

Soil information is valuable for many disciplines (e.g. agriculture, geomorphology, geology, archaeology) and can be used to produce maps or statistics on soil productivity. As part of the project CRC1070 ResourceCultures, we collected information on soil quality in the Dohuk province of the Kurdistan region of Iraq. In total, 561 samples were collected at 136 locations in 2017, 2018, 2022, and 2023. The samples were taken with an auger at five depth increments (0 - 10, 10 - 30, 30 - 50, 50 - 70 and 70 - 100 cm), then prepared and measured with mid-infrared (MIR) spectroscopy. A subset of 109 samples was analyzed in a laboratory for texture, pH, organic and total carbon, nitrogen, sulfur, electrical conductivity, bulk density and calcium carbonate. A Cubist model was used to predict these properties for the remaining samples from their MIR spectra. We then produced digital soil maps of these soil properties with machine learning methods (ensemble learning, linear regression, decision trees). Additionally, we mapped the soil depth using the information collected in the field. This dataset forms a unique regional database that can support any researcher in need of soil information.
For the MIR spectral predictions, the samples were air dried (35 - 45 °C) for 24 h, cleared of root fragments, sieved (< 2 mm), and ground below 1 µm with a planetary mill (Fritsch, Pulverisette 5/4, classic line). The MIR spectra were recorded in absorbance with a spectrometer (Bruker, Vertex 80v-mir) over wavenumbers from 375 - 4,995 cm-1, at a 4 cm-1 interval. Spectra between 350 - 499 cm-1 and 2,451 - 2,500 cm-1 were removed because low signal in these ranges interfered with prediction. Some spectra were transformed according to the literature for better prediction results.
The predicted values are based on a Cubist model computed on the raw spectra (pH, MWD), on Savitzky-Golay transformed spectra with a second-order polynomial and a window size of eleven (CaCO3, Clay), or on the Standard Normal Variate of Savitzky-Golay transformed spectra with the same polynomial order and window size (Nt, Ct, Corg, EC, Sand, Silt). pH is expressed as an absolute (unitless) value; MWD in mm; Nt, Ct, Corg, Sand, Silt and Clay in %; EC in µS/cm; and wavenumber in 1/cm, with the spectra given in absorbance.
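The preprocessing described above can be illustrated with a minimal NumPy sketch. It is not the authors' code, and several points are assumptions: the function and variable names are hypothetical, the Savitzky-Golay step is applied as smoothing only (the description gives no derivative order), window edges are handled by repeating the end values, and the order of operations (filter, then drop the excluded ranges, then Standard Normal Variate) is a guess. The wavenumber grid, the excluded ranges, the filter settings and the property-to-transform mapping are taken from the description.

```python
import numpy as np

# Ranges (cm-1) excluded from prediction, as stated in the description.
REMOVED_RANGES = [(350.0, 499.0), (2451.0, 2500.0)]

# Transform used per property, as stated in the description.
TRANSFORM_BY_PROPERTY = {
    "pH": "raw", "MWD": "raw",
    "CaCO3": "sg", "Clay": "sg",
    "Nt": "sg_snv", "Ct": "sg_snv", "Corg": "sg_snv",
    "EC": "sg_snv", "Sand": "sg_snv", "Silt": "sg_snv",
}


def make_wavenumbers(start=375.0, stop=4995.0, step=4.0):
    """Wavenumber grid in cm-1, end point included."""
    return np.arange(start, stop + step / 2, step)


def drop_ranges(wavenumbers, spectra, ranges):
    """Remove the columns whose wavenumber falls inside any (low, high) range."""
    keep = np.ones(wavenumbers.shape, dtype=bool)
    for low, high in ranges:
        keep &= ~((wavenumbers >= low) & (wavenumbers <= high))
    return wavenumbers[keep], spectra[:, keep]


def savitzky_golay(spectra, window=11, polyorder=2):
    """Savitzky-Golay smoothing of each row (no derivative: an assumption)."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse holds the least-squares weights that give
    # the fitted polynomial's value at the centre of the window.
    weights = np.linalg.pinv(design)[0]
    # Assumption: repeat the edge values so the output keeps its length.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    # np.convolve flips its kernel, so reverse the weights to get a correlation.
    return np.stack([np.convolve(row, weights[::-1], mode="valid") for row in padded])


def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def preprocess(wavenumbers, spectra, transform):
    """Apply 'raw', 'sg' or 'sg_snv'; the step order is an assumption."""
    if transform not in ("raw", "sg", "sg_snv"):
        raise ValueError(f"unknown transform: {transform}")
    if transform in ("sg", "sg_snv"):
        spectra = savitzky_golay(spectra)
    wavenumbers, spectra = drop_ranges(wavenumbers, spectra, REMOVED_RANGES)
    if transform == "sg_snv":
        spectra = snv(spectra)
    return wavenumbers, spectra
```

For a matrix `spectra` with one row per sample, `preprocess(make_wavenumbers(), spectra, TRANSFORM_BY_PROPERTY["Clay"])` would return the trimmed wavenumber axis together with the smoothed spectra. In practice a library routine such as `scipy.signal.savgol_filter` is the more usual choice for the filtering step; its default edge handling differs from the padding used here.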

Identifier
DOI https://doi.pangaea.de/10.1594/PANGAEA.973700
Related Identifier References https://doi.pangaea.de/10.1594/PANGAEA.973764
Metadata Access https://ws.pangaea.de/oai/provider?verb=GetRecord&metadataPrefix=datacite4&identifier=oai:pangaea.de:doi:10.1594/PANGAEA.973700
Provenance
Creator Bellat, Mathias; Glissmann, Benjamin; Rentschler, Tobias; Sconzo, Paola; Pfälzner, Peter; Brifkany, Bekas; Scholten, Thomas
Publisher PANGAEA
Publication Year 2024
Funding Reference German Research Foundation https://doi.org/10.13039/501100001659 Crossref Funder ID 215859406 https://gepris.dfg.de/gepris/projekt/240000619 A hunt for resources? Spatial models in the ResourceCultures at the northern periphery of Mesopotamia (B07)
Rights Creative Commons Attribution 4.0 International; https://creativecommons.org/licenses/by/4.0/
OpenAccess true
Representation
Resource Type Dataset
Format text/tab-separated-values
Size 585944 data points
Discipline Earth System Research
Spatial Coverage (west 42.420°E, south 36.781°N, east 43.038°E, north 37.207°N); Dohuk directorate, Iraq
Temporal Coverage Begin 2017-09-01T00:00:00Z
Temporal Coverage End 2023-10-03T00:00:00Z
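
The record lists the data format as text/tab-separated-values. The sketch below shows one way such a file might be read with the Python standard library. It assumes the downloaded text may start with a `/* ... */` description block ahead of the header row, as PANGAEA tab exports commonly do; the function name and the column names in the usage note are hypothetical and are not taken from the dataset.

```python
import csv
import io


def read_tab_text(text):
    """Parse tab-separated text into a list of dicts keyed by column name.

    Assumption: an optional leading /* ... */ block holds the dataset
    description and is skipped before the header row is read.
    """
    if text.lstrip().startswith("/*"):
        end = text.find("*/")
        if end == -1:
            raise ValueError("unterminated /* ... */ description block")
        text = text[end + 2:]
    text = text.lstrip("\r\n")
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)
```

Each returned row maps column names to strings, so a value would be converted explicitly when needed, e.g. `float(row["pH"])` for a hypothetical `pH` column.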