-
ozone-imputation-data
Data for the ozone imputation software (https://gitlab.jsc.fz-juelich.de/esde/machine-learning/ozone-imputation). See the README of this repository for usage and information.... -
Gridded data for the AQ-Bench dataset
This is a gridded version of the AQ-Bench variables. The AQ-Bench dataset and its documentation are available via Betancourt et al. (2021): "AQ-Bench: a benchmark dataset for... -
Data Samples for temperature forecasting by deep learning methods
Here we provide data samples (one year of data) so that users can quickly test the machine learning workflow code that is published on Zenodo... -
Bulk chemistry measurements and XRF spectra of sediments from the high latitu...
This dataset has no description
-
Quantified high-resolution TOC and CaCO3 contents of sediments from the high ...
This dataset has no description
-
Bulk chemistry measurements of sediments from the high latitude sectors of Pa...
This dataset has no description
-
SiDroForest: Synthetic Siberian Larch Tree Crown Dataset of 10.000 instances ...
This synthetic Siberian Larch tree crown dataset was created for upscaling and machine learning purposes as a part of the SiDroForest (Siberia Drone Forest Inventory) project.... -
XRF down-core scanning and bulk chemistry measurements of sediments from the ...
Thirty marine sediment cores were collected from the subarctic Northwest Pacific (cruise SO264 with R/V SONNE in 2018), the central Drake Passage (cruise PS97 with RV Polarstern... -
Catchment areas of PAP sediment traps at 3000m depth from 2000 to 2022
The gravitational pump plays a significant role in the carbon cycle by exporting sinking organic carbon from the surface to the deep ocean. The deep sediment time-series traps... -
Adaptive energy reference for machine-learning models of the electronic densi...
The electronic density of states (DOS) provides information regarding the distribution of electronic states in a material, and can be used to approximate its optical and... -
FINALES - Electrolyte optimization for maximum conductivity and for maximum c...
This study investigates an electrolyte system composed of lithium hexafluorophosphate (LiPF6), ethylene carbonate (EC) and ethyl methyl carbonate (EMC). For the assembly of full... -
Global Age Mapping Integration (GAMI)
GAMI is an updated dataset providing global forest age distributions for 2010 and 2020 with 100-meter resolution, improving upon the MPI-BGC forest age product. Utilizing... -
DIGIRES COVID-19 ML Dataset v.1
DIGIRES COVID-19 ML dataset v.1 is a tab-separated (.tsv) file prepared for training machine learning algorithms. The training dataset was compiled from various internet public... -
Acoustic Data Building Toolset
This folder contains data and software tools (in Python) that can be used in experiments with phoneme recognition in speech samples recorded in Polish. Acoustic data used here... -
Terminological dictionary of artificial intelligence
The terminological dictionary was compiled within the framework of the project Development of Slovene in the Digital Environment. It is an example collection of 413 terms from... -
WMT16 Quality Estimation Shared Task Training and Development Data
Training and development data for the WMT16 QE task. Test data will be published as a separate item. This shared task will build on its previous four editions to further examine... -
WMT18 Quality Estimation Shared Task Training and Development Data
Training and development data for the WMT18 QE task. Test data will be published as a separate item. This shared task will build on its previous six editions to further examine... -
Corpus of contemporary blogs
At the NLP Centre, text is currently divided into sentences with a tool that uses a rule-based system. In order to produce enough training data for machine learning, annotators... -
WMT16 APE Shared Task Data - Reference sentences
Training, development and test data consist of German sentences belonging to the IT domain and already tokenized. These sentences are the references of the data released for the... -
SnakeCLEF 2021
The dataset with 409,679 images belonging to 772 snake species from 188 countries and all continents (386,006 images with labels targeted for development and 23,673 images...