-
MariTerm v.1.2
This is an enriched version of the MariTerm maritime ontology, containing plug-ins to corresponding synsets inside IWN. The resource was created within the collaboration of the... -
Survey Data on Preferences of Lithuanian Cybersecurity Terminology
The data is provided in two files: one containing the questionnaire data and the other containing the respondents' data. The questionnaire data is in a TXT file, which includes... -
TED-ELH Parallel Corpus
The corpus contains parallel, aligned scripts of TED Talks in English, Lithuanian, and Hebrew. It contains spoken language data. -
Lithuanian Treebank ALKSNIS (2019-10-24)
ALKSNIS v3.0 consists of 3,643 syntactically annotated sentences in the PML (Prague Mark-up Language) format. The format allows researchers to visualise and edit... -
JABLONSKIS tagset v2
JABLONSKIS VERSION 2 is a Lithuanian standard morphological tagset that is based on the abbreviations of parts of speech and other grammatical categories commonly used in... -
Language Technology Research Bibliography for Lithuanian 2016-2020
A language technology bibliography for the Lithuanian language covering the period 2016-2020. The resource is in BibTeX format and contains: 1) 91 references of research... -
ORVELIT v3
ORVELIT v3 (Lith. Originalios ir Vertimų Lietuvių Kalbos Tekstynas) is a comparable monolingual corpus of original and translated Lithuanian consisting of four sub-corpora of... -
Lithuanian morphologically annotated corpus - MATAS
MATAS v0.2 - Morphologically Annotated Lithuanian Corpus (manually checked). Contains 4 parts: Documents (21%), Fiction (19%), Periodicals (36%), Scientific texts (24%). Wordform... -
Lithuanian keyboard for macOS users
This keyboard driver allows easy access to Lithuanian letters via the conventional keyboard layout, a.k.a. „Lithuanian letters instead of numbers“. An essential new feature of this... -
Corpus of the Contemporary Lithuanian Language
The Corpus of the Contemporary Lithuanian Language, which comprises 208 million words, is a collection of texts designed to represent current Lithuanian. The corpus has been... -
Colloc -- A Tool for Automatic Identification of Multiword Expressions
Colloc -- a tool for automatic identification of multiword expressions (MWE) is freely available for online use at http://resursai.mwe.lt/atpazintuvas. As material for training... -
Pedagogic Corpus of Lithuanian
The Pedagogic Corpus of Lithuanian is a monolingual specialized corpus, prepared for learning and teaching Lithuanian in a foreign language classroom. The pedagogic corpus... -
English-Lithuanian Parallel Cybersecurity Corpus - DVITAS v2.0
English-Lithuanian parallel corpus DVITAS v2 includes original English texts on cybersecurity and their Lithuanian translations aligned on the sentence level. Version 1 of the... -
DIGIRES COVID-19 ML Dataset v.1
DIGIRES COVID-19 ML dataset v.1 is a tab-separated (.tsv) file prepared for training machine learning algorithms. The training dataset was compiled from various internet public... -
Lemmatised Wordlist of 1 m. Corpus of Contemporary Lithuanian
The lemmatised wordlist of the 1 m. word Lithuanian corpus. The structure of the tab-delimited text file (dazninis.txt): Headword, Part of Speech, Wordform, Frequency of Occurrence. The... -
Lithuanian speech-to-text Transcriber
The automatic speech-to-text transcriber for Lithuanian is a containerized application implemented as 17 containers. It covers four areas: administrative, legal, medical and... -
English-Lithuanian Parallel Cybersecurity Corpus - DVITAS
English-Lithuanian parallel corpus DVITAS includes original English texts on cybersecurity and their Lithuanian translations aligned on the sentence level. The corpus was... -
Read Speech Corpus (7G)
The corpus of read Lithuanian speech „7G“ was compiled in 2015-2016. The corpus consists of 352 audio recordings with a total duration of over 7 hours. Seven different speakers... -
Corpus of Discourse on Crime
The specialised "Corpus of Discourse on Crime" is synchronic, monolingual, and unannotated, and consists of two subcorpora. Subcorpus 1: all texts on crime, published in criminal columns on... -
The Scottish Gaelic Linguistic Toolkit
A linguistic analyser for tagging, lemmatisation and parsing of Scottish Gaelic texts. Morphological and syntactic analyses are available directly from the webpage (through the...