FAUST 0.5

Syntactic (including deep-syntactic, or tectogrammatical) annotation of noisy user-generated sentences. The annotation was made on the Czech-English and English-Czech Faust Dev/Test sets. The English data consists of manual annotations of English reference translations of Czech source texts. These texts were translated independently by two translators. After some necessary cleaning, 1000 segments were randomly selected for manual annotation. Both reference translations were annotated, giving 2000 annotated segments in total. The Czech data consists of manual annotations of Czech reference translations of English source texts. These texts were translated independently by three translators. After some necessary cleaning, 1000 segments were randomly selected for manual annotation. All three reference translations were annotated, giving 3000 annotated segments in total.

Faust is part of PDT-C 1.0 (http://hdl.handle.net/11234/1-3185).

Identifier
PID http://hdl.handle.net/11234/1-3308
Related Identifier http://hdl.handle.net/11234/1-3185
Related Identifier https://arxiv.org/abs/2006.03679
Related Identifier https://ufal.mff.cuni.cz/grants/faust
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-3308
Provenance
Creator Hajič, Jan; Mareček, David; Fučíková, Eva; Cinková, Silvie; Štěpánek, Jan; Mikulová, Marie
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2011
Funding Reference info:eu-repo/grantAgreement/EC/FP7/247762
Rights Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0); http://creativecommons.org/licenses/by-nc/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language English; Czech
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline Linguistics
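
The Metadata Access entry above is an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from the handle suffix of the PID. The endpoint, the `oai_dc` prefix and the `oai:lindat.mff.cuni.cz:` identifier pattern are copied from that entry; the function name `oai_get_record_url` is hypothetical and introduced only for illustration. The sketch builds the URL and does not send any request.

```python
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access entry of this record.
OAI_ENDPOINT = "http://lindat.mff.cuni.cz/repository/oai/request"


def oai_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a handle such as '11234/1-3308'.

    Hypothetical helper: the identifier pattern is inferred from the single
    Metadata Access URL in this record, not from repository documentation.
    """
    identifier = f"oai:lindat.mff.cuni.cz:{handle}"
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",  # keep ':' and '/' unescaped, as in the record's URL
    )
    return f"{OAI_ENDPOINT}?{query}"


url = oai_get_record_url("11234/1-3308")
print(url)
```

Passing a different handle, for example `11234/1-3185` for PDT-C 1.0, would produce the corresponding request under the same assumption about the identifier pattern.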