Segakorpus: Doktoritööd (Corpus of Estonian scientific texts)

A text corpus containing 5 million words of Estonian scientific texts: PhD dissertations (2.3 million words) and scientific articles. Markup: TEI P5 XML; encoding: UTF-8. More info at http://www.cl.ut.ee/korpused/segakorpus/doktoritood/

Identifier
PID http://hdl.handle.net/11297/1-00-0000-0000-0000-0002-4
Metadata Access https://metashare.ut.ee/oai_pmh/?verb=GetRecord&metadataPrefix=olac&identifier=c5f0fd7258e211e2a6e4005056b40024979168bd6780454f980729788272c9f2
Provenance
Publisher CLARIN
Contributor Kadri Muischnek, korpus.info[at]ut.ee
Publication Year 2022
Rights CLARIN_ACA-NC; Restrictions of Use: academic-nonCommercialUse, attribution; User Nature: academic
OpenAccess true
Contact info(at)keeleressursid.ee
Representation
Language Estonian
Resource Type Text
Size 5,000,000 words in total; 2,297,030 words of PhD dissertations
Discipline Linguistics