-
Data Samples for temperature forecasting by deep learning methods
Here we provide data samples (one year of data) so that users can quickly test the machine learning workflow code that is published on Zenodo... -
Catchment areas of PAP sediment traps at 3000m depth from 2000 to 2022
The gravitational pump plays a significant role in the carbon cycle by exporting sinking organic carbon from the surface to the deep ocean. The deep sediment time-series traps... -
Adaptive energy reference for machine-learning models of the electronic densi...
The electronic density of states (DOS) provides information regarding the distribution of electronic states in a material, and can be used to approximate its optical and... -
FINALES - Electrolyte optimization for maximum conductivity and for maximum c...
This study investigates an electrolyte system composed of lithium hexafluorophosphate (LiPF6), ethylene carbonate (EC) and ethyl methyl carbonate (EMC). For the assembly of full... -
Global Age Mapping Integration (GAMI)
GAMI is an updated dataset providing global forest age distributions for 2010 and 2020 with 100-meter resolution, improving upon the MPI-BGC forest age product. Utilizing... -
DIGIRES COVID-19 ML Dataset v.1
DIGIRES COVID-19 ML dataset v.1 is a tab-separated (.tsv) file prepared for training machine learning algorithms. The training dataset was compiled from various internet public... -
Acoustic Data Building Toolset
This folder contains data and software tools (in Python) that can be used in experiments with phoneme recognition in speech samples recorded in Polish. Acoustic data used here... -
Terminological dictionary of artificial intelligence
The terminological dictionary was compiled within the framework of the project Development of Slovene in the Digital Environment. It is an example collection of 413 terms from... -
WMT16 Quality Estimation Shared Task Training and Development Data
Training and development data for the WMT16 QE task. Test data will be published as a separate item. This shared task will build on its previous four editions to further examine... -
WMT18 Quality Estimation Shared Task Training and Development Data
Training and development data for the WMT18 QE task. Test data will be published as a separate item. This shared task will build on its previous six editions to further examine... -
Corpus of contemporary blogs
At the NLP Centre, text is currently divided into sentences with a tool that uses a rule-based system. In order to make enough training data for machine learning, annotators... -
WMT16 APE Shared Task Data - Reference sentences
Training, development and test data consist of German sentences that belong to the IT domain and are already tokenized. These sentences are the references of the data released for the... -
SnakeCLEF 2021
The dataset with 409,679 images belonging to 772 snake species from 188 countries and all continents (386,006 images with labels targeted for development and 23,673 images... -
WMT16 APE Shared Task Data
Training, development and test data (the same used for the Sentence-level Quality Estimation task) consist of English-German triplets (source, target and post-edit) belonging to... -
WMT17 Quality Estimation Shared Task Training and Development Data
Training and development data for the WMT17 QE task. Test data will be published as a separate item. This shared task will build on its previous five editions to further examine... -
WMT18 Quality Estimation Shared Task Test Data
Test data for the WMT18 QE task. Train data can be downloaded from http://hdl.handle.net/11372/LRT-2619. This shared task will build on its previous six editions to further... -
WMT17 Quality Estimation Shared Test Data
Test data for the WMT17 QE task. Train data can be downloaded from http://hdl.handle.net/11372/LRT-1974. This shared task will build on its previous five editions to further... -
Datasets for "EmbryoNet: Using deep learning to link embryonic phenotypes to ...
This is the data repository of the training and test data sets for EmbryoNet. The data is structured in multiple packages. EmbryoNet_Models (DOI 10.48606/31) contains the... -
Soil bulk density and soil depth from on-site observations in the North-Weste...
Soil information is valuable for many disciplines (e.g. agriculture, geomorphology, geology, archaeology) and can be used to produce maps or statistics on soil productivity. As... -
Datasets for "Uncovering developmental time and tempo using deep learning"
This is the data repository for training and testing the Twin Network. The imaging data repositories are divided into several packages based on independent experiments. The data...