-
Post-edited and error annotated machine translation corpus PE²rr 1.0
The PE²rr corpus contains source language texts from different domains along with their automatically generated translations into several morphologically rich languages, their... -
NeMo Neural Machine Translation service RSDO-DS4-NMT-API 1.0
Neural Machine Translation service for NeMo AAYN Base models. For more details about building such models, see the official NVIDIA NeMo documentation... -
Machine Translation datasets from the KAS corpus KAS-MT 1.0
The Machine Translation datasets KAS-MT 1.0 contain automatically sentence-aligned Slovene and English plain-text abstracts from KAS-Abs 2.0 (http://hdl.handle.net/11356/1449)... -
Parallel corpus EN-SL RSDO4 1.0
The RSDO4 parallel corpus of English-Slovene and Slovene-English translation pairs was collected as part of work package 4 of the Slovene in the Digital Environment project. It... -
Parallel corpus EN-SL RSDO4 2.0
The RSDO4 parallel corpus of English-Slovene and Slovene-English translation pairs was collected as part of work package 4 of the Slovene in the Digital Environment project. It... -
Neural Machine Translation model for Slovene-English language pair RSDO-DS4-N...
This Neural Machine Translation model for the Slovene-English language pair was trained following the NVIDIA NeMo NMT AAYN recipe (for details see the official NVIDIA NeMo NMT... -
WMT16 Quality Estimation Shared Task Training and Development Data
Training and development data for the WMT16 QE task. Test data will be published as a separate item. This shared task will build on its previous four editions to further examine... -
Czech image captioning, machine translation, and sentiment analysis (Neural M...
This submission contains trained end-to-end models for the Neural Monkey toolkit for Czech and English, solving three NLP tasks: machine translation, image captioning, and... -
Ptakopět data: the dataset for experiments on outbound translation
The dataset used for the Ptakopět experiment on outbound machine translation. It consists of screenshots of web forms with user queries entered. The queries are also available... -
APE Shared Task WMT17: Human Post-edits Test Data DE-EN
Human post-edited test sentences for the WMT 2017 Automatic post-editing task. The set consists of 2,000 tokenized English sentences from the IT domain. Source... -
Manually Ranked Translation Outputs
Manually ranked outputs of Czech-Slovak translations. Three annotators manually ranked outputs of five MT systems (Česílko, Česílko2, Google Translate and two Moses setups) on... -
Test Data DE-EN APE Shared Task WMT17
Test data for the WMT 2017 Automatic post-editing task (the same data used for the Sentence-level Quality Estimation task). They consist of German-English triplets (source and... -
ParaCrawl Corpus version 1.0
The January 2018 release of ParaCrawl is the first version of the corpus. It contains parallel corpora for 11 languages paired with English, crawled from a large number of... -
MCSQ Translation Models (en-de) (v1.0)
En-De translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). The models were... -
Automatic Paraphrases of Czech Reference Sentences for WMT11, 13 and 14
This dataset contains automatic paraphrases of Czech official reference translations for the Workshop on Statistical Machine Translation shared task. The data covers the years... -
Moses Web Demo
An interactive web demo of selected ÚFAL MT systems. -
Test Data EN-DE MT_NMT APE Shared Task WMT18
Test data for the WMT 2018 Automatic post-editing task. They consist of English-German pairs (source and target) belonging to the information technology domain and already... -
APE Shared Task WMT17: Human Post-edits Test Data EN-DE
Human post-edited test sentences for the WMT 2017 Automatic post-editing task. The set consists of 2,000 tokenized German sentences from the IT domain. Source... -
FAUST cs-en 0.5
This machine translation test set contains 2223 Czech sentences collected within the FAUST project (https://ufal.mff.cuni.cz/grants/faust, http://hdl.handle.net/11234/1-3308).... -
Khresmoi Summary Translation Test Data 2.0
This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech,...