- Corpus of Romanian Academic Genres ROGER
The corpus contains academic papers from eight disciplines, written by Romanian students in their native Romanian and in English as a second language (L2). The corpus was collected over a three-year period...
- Abstracts from the KAS corpus KAS-Abs 1.0
The KAS-Abs corpus contains 108,254 automatically identified Slovenian and/or English abstracts (30 million words) from 62,000 BSc/BA, MSc/MA, and PhD theses included in the KAS...
- Summarization datasets from the KAS corpus KAS-Sum 1.0
Summarization datasets were created from the text bodies in the KAS 2.0 corpus (http://hdl.handle.net/11356/1448) and the abstracts from the KAS-Abs 2.0 corpus...
- English-Slovene term candidates KAS-biterm 1.0
KAS-biterm is an automatically generated glossary of English terms with their translations into Slovene. The pairs, possibly with their English and Slovene acronyms, were...
- Machine Translation datasets from the KAS corpus KAS-MT 1.0
The Machine Translation datasets KAS-MT 1.0 contain automatically sentence-aligned Slovene and English plain-text abstracts from KAS-Abs 2.0 (http://hdl.handle.net/11356/1449)...
- Abstracts from the KAS corpus KAS-Abs 2.0
The KAS-Abs 2.0 corpus contains 125,202 automatically identified Slovenian and/or English abstracts from BSc/BA, MSc/MA, and PhD theses included in the KAS Corpus of Academic...