Semantic hypergraph corpus SemCRO 1.0

This corpus can be used to build and evaluate methods for extracting and presenting knowledge based on a semantic hypergraph. The corpus consists of 184 simple, complex and dependently complex sentences. All sentences are annotated at the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation, syntactic dependencies, named entities, and semantic roles. The resource also includes a representation of a subset of 176 sentences in the form of a semantic hypergraph, which can be used to evaluate knowledge extraction methods for Croatian. The sentences in this corpus are taken from the textbook:

Hudeček, L., Mihaljević, M., Sršen, J. and Čamagajevac, S. (2017). Hrvatska Školska Gramatika. Zagreb: Institut za hrvatski jezik i jezikoslovlje. https://gramatika.hr/impresum/

Identifier
PID http://hdl.handle.net/11356/1377
Related Identifier https://www.acnltutor.net/
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1377
Provenance
Creator Vasić, Daniel; Žitko, Branko; Gašpar, Angelina; Ljubešić, Nikola; Štrkalj Despot, Kristina; Merkler, Danijela
Publisher University of Mostar; University of Split; Jožef Stefan Institute
Publication Year 2020
Rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0); https://creativecommons.org/licenses/by-sa/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Croatian
Resource Type corpus
Format application/zip; text/plain; charset=utf-8; downloadable_files_count: 1
Discipline Linguistics
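
Example: reading this record over OAI-PMH

The Metadata Access URL above is an OAI-PMH GetRecord request for Dublin Core (oai_dc) metadata. The following is a minimal sketch, not an official client: it rebuilds that URL from its parts and pulls Dublin Core fields out of a response. The function names build_getrecord_url and dc_fields are hypothetical, and SAMPLE_RESPONSE is a hand-written stand-in that reuses values from this record; it is not the server's actual output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint and identifier are taken from the Metadata Access line of this record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"
RECORD_ID = "oai:www.clarin.si:11356/1377"

# Standard Dublin Core element namespace, in ElementTree's {uri} form.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_getrecord_url(identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",  # keep ':' and '/' readable, as in the URL listed in the record
    )
    return f"{OAI_ENDPOINT}?{query}"


def dc_fields(xml_text):
    """Collect Dublin Core elements from an OAI-PMH response into a dict of lists."""
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(DC_NS) and element.text:
            name = element.tag[len(DC_NS):]
            fields.setdefault(name, []).append(element.text.strip())
    return fields


# Hand-written stand-in for a response (assumption: illustrative only).
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Semantic hypergraph corpus SemCRO 1.0</dc:title>
          <dc:identifier>http://hdl.handle.net/11356/1377</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
"""

url = build_getrecord_url(RECORD_ID)
fields = dc_fields(SAMPLE_RESPONSE)
```

To query the live repository, the text returned by urllib.request.urlopen(url) can be passed to dc_fields in place of SAMPLE_RESPONSE; the fields actually returned depend on the server.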