-
Subset of KoLaS (Commented Learner Corpus Academic Writing), Plain Text Version
For this upload, all Word files (.doc and .docx) in the original KoLaS corpus were converted to plain text. For more information... -
Corpus of Academic Slovene (BSc/BA theses) KAS-dipl 1.0
The KAS-dipl corpus of Slovene BSc/BA theses consists of almost 65,000 texts (3.5 million pages or 1.1 billion tokens) written between 2000 and 2018 and gathered from the digital... -
Corpus of Academic Slovene (MSc/MA theses) KAS-mag 1.0
The KAS-mag corpus of Slovene MSc/MA theses consists of almost 16,000 texts (1.36 million pages or 500 million tokens) written between 2000 and 2018 and gathered from the digital... -
Corpus of Academic Slovene (PhD theses) KAS-dr 1.0
The KAS-dr corpus of Slovene PhD theses consists of almost 1,600 texts (266 thousand pages or 100 million tokens) written between 2000 and 2018 and gathered from the digital libraries of... -
Corpus of academic Slovene KAS 1.0
The KAS corpus of Slovene academic writing consists of almost 65,000 BSc/BA, 16,000 MSc/MA and 1,600 PhD theses (82 thousand texts, 5 million pages or 1.7 billion tokens)... -
Abstracts from the KAS corpus KAS-Abs 2.0
The KAS-abs 2.0 corpus contains 125,202 automatically identified Slovenian and/or English abstracts from BSc/BA, MSc/MA, and PhD theses included in the KAS Corpus of Academic... -
English-Slovene term candidates KAS-biterm 1.0
KAS-biterm is an automatically generated glossary of English terms with their translations into Slovene. The pairs, possibly with their English and Slovene acronyms, were... -
Abstracts from the KAS corpus KAS-Abs 1.0
The KAS-abs corpus contains 108,254 automatically identified Slovenian and/or English abstracts (30 million words) from 62,000 BSc/BA, MSc/MA, and PhD theses included in the KAS... -
Summarization datasets from the KAS corpus KAS-Sum 1.0
Summarization datasets were created from the text bodies in the KAS 2.0 corpus (http://hdl.handle.net/11356/1448) and the abstracts from the KAS-Abs 2.0 corpus... -
Machine Translation datasets from the KAS corpus KAS-MT 1.0
The Machine Translation datasets KAS-MT 1.0 contain automatically sentence-aligned Slovene and English plain-text abstracts from KAS-Abs 2.0 (http://hdl.handle.net/11356/1449)... -
Corpus of academic Slovene KAS 2.0
The KAS corpus of Slovene academic writing consists of almost 65,000 BSc/BA, 16,000 MSc/MA and 1,600 PhD theses (82 thousand texts, 5 million pages or 1.5 billion tokens)... -
Czech Sociological Review 1993-2016
Selected research articles and essays published in Czech Sociological Review from 1993 to 2016. Originally Czech, non-translated material only. 522 documents in total. -
Beldeko Summary Corpus v1.1.0
The Beldeko (Belgisches Deutschkorpus) Summary Corpus is a learner corpus that consists of summaries written by advanced L2 German learners (CEF... -
Beldeko Summary Corpus v1.0.0
The Beldeko (Belgisches Deutschkorpus) Summary Corpus is a learner corpus that consists of summaries written by advanced L2 German learners (CEF... -
Commented Learner Corpus Academic Writing; Kommentiertes Lernendenkorpus akad...
Authentic texts written by students of the University of Hamburg as part of their studies. The students have various L1 languages and study various subjects; all of the texts... -
Commented Learner Corpus Academic Writing; Kommentiertes Lernendenkorpus akad...
Authentic texts written by students of the University of Hamburg as part of their studies. The students have various L1 languages and study various subjects; all of the texts... -
Commented Learner Corpus Academic Writing; Kommentiertes Lernendenkorpus akad...
Authentic texts written by students of the University of Hamburg as part of their studies. The students have various L1 languages and study various subjects; all of the texts... -
Commented Learner Corpus Academic Writing; Kommentiertes Lernendenkorpus akad...
Authentic texts written by students of the University of Hamburg as part of their studies. The students have various L1 languages and study various subjects; all of the texts... -
euroWiss - Linguistic Profiling of European Academic Education (Subcorpus 1) ...
Subcorpus 1 presents part of the euroWiss-Corpus covering communication in teaching/learning discourses in instruction at German and Italian universities, in the humanities as...