LiFR-Law. Corpus of Paraphrased Czech Administrative Texts with Reading Comprehension for Readability Studies

PID

LiFR-Law is a corpus of Czech legal and administrative texts with measured reading comprehension and a subjective expert annotation of diverse textual properties based on the Hamburg Comprehensibility Concept (Langer, Schulz von Thun, Tausch, 1974). It has been built as a pilot data set to explore the Linguistic Factors of Readability (hence the LiFR acronym) in Czech administrative and legal texts, modeling their correlation with actually observed reading comprehension. The corpus is comprised of 18 documents in total; that is, six different texts from the legal/administration domain, each in three versions: the original and two paraphrases. Each such document triple shares one reading-comprehension test administered to at least thirty readers of random gender, educational background, and age. The data set also captures basic demographic information about each reader, their familiarity with the topic, and their subjective assessment of the stylistic properties of the given document, roughly corresponding to the key text properties identified by the Hamburg Comprehensibility Concept.

Identifier
PID http://hdl.handle.net/11234/1-5020
Related Identifier http://hdl.handle.net/11234/1-5225
Related Identifier https://ufal.mff.cuni.cz/grants/lifr
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-5020
Provenance
Creator Cinková, Silvie; Chromý, Jan; Šamánková, Jana; Hořeňovská, Karolína; Kettnerová, Václava; Kolářová, Veronika; Kubištová, Hana; Panevová, Jarmila
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2023
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); http://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language Czech
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline Linguistics