4 datasets found

Keywords: POS tagged

  • Corpus of the Contemporary Lithuanian Language

    The Corpus of the Contemporary Lithuanian Language, which comprises 208 million words, is a collection of texts designed to represent current Lithuanian. The corpus has been...
  • Lithuanian morphologically annotated corpus - MATAS v1.0

    MATAS corpus (version 1.0). Description: manually checked, morphologically annotated corpus MATAS. Formats: 1. CoNLL-U (CONLLU, conllu); 2. SketchEngine - tab delimited word per...
  • DELFI.lt corpus

    DELFI.lt is a corpus of articles published by the news portal DELFI.lt from March 2014 to November 2016. Metadata was also collected with the articles: author, title, date,...
  • Lithuanian morphologically annotated corpus - MATAS

    MATAS v0.2 - Morphologically Annotated Lithuanian Corpus (manually checked). Contains 4 parts: Documents (21%), Fiction (19%), Periodicals (36%), Scientific texts (24%). Wordform...
You can also access this registry using the API (see API Docs).