The main component of this data publication is a dataset of predicted daily concentrations of nitrate-nitrogen (NO3-N) and total phosphorus (TP) for 150 monitoring stations along 60 major German rivers. The dataset is intended to fill the gap in daily nutrient concentration data and so support a better understanding of nutrient transport from rivers to the seas. So far, nutrient concentrations have typically been sampled on a fortnightly basis, which can be insufficient for nutrient retention models that run at a daily time step. With the method and the datasets provided here, river basin managers can examine nutrient concentration and load patterns at a finer temporal resolution and adapt their management to improve water quality.
The dataset was produced with random forest (RF) models trained on NO3-N and TP concentrations measured between 2000 and 2019. The measurements were requested from the Federal States or river basins or, where available, downloaded from their official websites. The final models for NO3-N and TP use different sets of predictor variables, such as discharge, land use and day of the year.
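Day of the year is one of the predictor variables named above. As a minimal sketch of how such a predictor can be derived from a sampling date (the function name and the example date are illustrative assumptions and are not taken from the dataset or from the authors' code):

```python
from datetime import date


def day_of_year(sample_date: date) -> int:
    """Return the day of the year (1 to 365, or 366 in leap years)."""
    return sample_date.timetuple().tm_yday


# Hypothetical sampling date, for illustration only.
print(day_of_year(date(2019, 3, 1)))
```

How the day of the year was actually encoded in the RF models is not stated here, so this shows only the plain integer form.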
The download zip file contains the following:
Dataset as csv: Dataset of predicted daily nutrient concentrations for NO3-N and TP for 150 monitoring stations along 60 German rivers.
Figures as pdf: Comparison of predicted values based on different summary statistics (mean and mode) for annual cycles of NO3-N and TP concentrations and loads for 150 monitoring stations along 60 rivers in Germany.
Coding of monitoring stations as csv: A basic step of the analysis was to pair gauges with water quality stations. Each pair was assigned a code, which is used as the ID in the model. The coding file lists the names of the monitoring station and gauge for each ID, as well as the number of NO3-N concentrations, TP concentrations, discharge values and years used.
Variable importance as figure and explanation as csv: Several RF variants with different sets of variables were built, starting with 11 variables and iteratively assessing which were most important. The variables are explained in a csv file, and their importance for each variant is shown in the figure.
Random forests for TP and NO3-N as R.data: The best-performing random forests for NO3-N (variant 7) and TP (variant 1) are stored as R.data files for further use.
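The figures listed above refer to loads as well as concentrations. As a minimal sketch of the standard unit conversion from a daily concentration and a mean daily discharge to a daily load, assuming concentrations in mg/L and discharge in m^3/s (the units used in the csv files are not stated here, and the function name and example values are hypothetical):

```python
SECONDS_PER_DAY = 86_400


def daily_load_kg(conc_mg_per_l: float, discharge_m3_per_s: float) -> float:
    """Daily load in kg/day from concentration (mg/L) and discharge (m^3/s).

    1 mg/L equals 1 g/m^3, so concentration times discharge gives g/s.
    """
    grams_per_second = conc_mg_per_l * discharge_m3_per_s
    # Scale from g/s to g/day, then convert g to kg.
    return grams_per_second * SECONDS_PER_DAY / 1000.0


# Illustrative values only, not taken from the dataset.
print(daily_load_kg(2.0, 10.0))
```

Whether the loads shown in the figures were computed in exactly this way is an assumption; the sketch only illustrates the conversion.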