-
Ordlista för plan- och byggtermer
Multilingual glossary of planning and building terms. The glossary was produced by the Swedish Centre for Terminology (TNC) in close cooperation with the Swedish National... -
Skogsordlista
Multilingual glossary containing terms related to forestry. The glossary has been produced in cooperation with the Swedish Forest association. The glossary contains 3512 terms... -
Geoteknisk ordlista
Multilingual terminology glossary containing terms from the geotechnical field. The glossary has been created by the Swedish Centre for Terminology (TNC) in close cooperation... -
Geologisk ordlista
Multilingual terminology glossary containing terms from the geological field. The glossary contains 1745 terms in Swedish; some of the terms have been translated to Russian,... -
Data on Terminological Semantic Variation between the (US and British) Press ...
The data set contains three spreadsheets, two of which are displayed in a single Excel file. The first file, entitled « Cosine_Similarity_UN-Press », represents the cosine... -
Corpus of term-annotated texts RSDO5 1.1
The RSDO5 corpus was compiled in order to serve as a training set for automatic term identification. It consists of 12 texts with 250,000 words and almost 38,000 manually... -
Corpus of Academic Slovene (BSc/BA theses) KAS-dipl 1.0
The KAS-dipl corpus of Slovene BSc/BA theses consists of almost 65,000 texts (3.5 million pages or 1.1 billion tokens) written between 2000 and 2018 and gathered from the digital... -
Corpus of Academic Slovene (MSc/MA theses) KAS-mag 1.0
The KAS-mag corpus of Slovene MSc/MA theses consists of almost 16,000 texts (1,360 thousand pages or 500 million tokens) written between 2000 and 2018 and gathered from the digital... -
Corpus of Academic Slovene (PhD theses) KAS-dr 1.0
The KAS-dr corpus of Slovene PhD theses consists of almost 1,600 texts (266 thousand pages or 100 million tokens) written between 2000 and 2018 and gathered from the digital libraries of... -
Corpus of academic Slovene KAS 1.0
The KAS corpus of Slovene academic writing consists of almost 65,000 BSc/BA, 16,000 MSc/MA and 1,600 PhD theses (82 thousand texts, 5 million pages or 1.7 billion tokens)... -
Corpus of Informatics DSI 5.0
The DSI corpus is intended as a terminological resource for the field of informatics, especially for the development of the on-line terminological dictionary of informatics, Islovar... -
Pan-Latin Geothermal Energy Lexicon
The Pan-Latin Geothermal Energy Lexicon (Lessico panlatino dell’energia geotermica), developed within the Realiter network, contains the basic terms related to geothermal energy... -
Terminological multiword expressions lexicon
The Terminological Multiword Expressions Lexicon contains multiword terms extracted from various terminological sources. The entries were lemmatized and tagged according to the... -
ACTER (Annotated Corpora for Term Extraction Research) v1.5
ACTER (Annotated Corpora for Term Extraction Research) is a manually annotated dataset for term extraction, covering 3 languages (English, French, and Dutch), and 4 domains... -
ACTER (Annotated Corpora for Term Extraction Research) v1.4
ACTER (Annotated Corpora for Term Extraction Research) is an annotated dataset for term extraction. Terms and Named Entities have been manually annotated in specialised... -
ACTER (Annotated Corpora for Term Extraction Research) v1.3
ACTER (Annotated Corpora for Term Extraction Research) is an annotated dataset for term extraction. Terms and Named Entities have been manually annotated in specialised... -
Bilingual terminology extraction dataset KAS-biterm 1.0
The KAS-biterm bilingual term extraction dataset contains complete sentences selected from PhD theses from the KAS corpus of Slovene academic writing. Only sentences that have a... -
TermFrame: Terms, definitions and semantic annotations for karstology
The resource comprises several datasets of domain-specific data in three languages (English, Slovenian and Croatian), which can be used for various knowledge extraction or... -
English-Slovene term candidates KAS-biterm 1.0
KAS-biterm is an automatically generated glossary of English terms with their translations into Slovene. The pairs, possibly with their English and Slovene acronyms, were... -
English-Slovenian chess terminology database 1.0
The English-Slovenian chess terminology database (termbase) is a bilingual database containing 82 English and 109 Slovenian chess terms, which make up 77 entries. The terms are...