-
Szeged Corpus 1.0
written, monolingual, general, manually POS annotated reference corpus; 1,247,546 tokens; MSD tagset, XML (TEIxLite) files -
Szeged Corpus 2.0
written, monolingual, general, manually POS annotated reference corpus; 1,459,288 tokens; MSD tagset, XML (TEI P4) files -
Copenhagen Dependency Treebanks versions 1-3
Parallel treebanks with annotation of syntax, discourse, coreference, morphology, and semantics. Version 3 also includes the Danish Dependency Treebank (version 1) and the...