Dialogue act annotated spoken corpus GORDAN 1.0 (transcription)

The GORDAN 1.0 corpus contains authentic spoken communication data annotated for dialogue acts according to the GORDAN 1.0 dialogue act annotation scheme, which is included in the data. The corpus data were selected from existing Slovene speech corpora: GOS (http://hdl.handle.net/11356/1040), Gos Videolectures (http://hdl.handle.net/11356/1223) and BERTA. Four criteria were taken into account in the selection: public/non-public, interactive/monologic, channel and intention. The data total 1 hour of recordings (6,909 words). The selected data were annotated with the Event function of the Transcriber 1.5.1 tool. Annotation drew on multimodal data, that is, listening to the audio or, where available, watching the video recording.

This resource contains only the annotated transcriptions of the corpus; the audio and video recordings are available at http://hdl.handle.net/11356/1292.

Identifier
PID http://hdl.handle.net/11356/1291
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1291
Provenance
Creator Verdonik, Darinka
Publisher Faculty of Electrical Engineering and Computer Science, University of Maribor
Publication Year 2020
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type corpus
Format application/zip; text/plain; charset=utf-8; downloadable_files_count: 1
Discipline Linguistics
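
The Metadata Access address listed above is an OAI-PMH GetRecord request. As a minimal sketch, the Python below rebuilds that address from its parts and reads Dublin Core fields out of a response. The endpoint and identifier are copied from this record; the function names (build_getrecord_url, dublin_core_fields, fetch_record) are illustrative assumptions and not part of any CLARIN.SI tooling, and the response is assumed to be a standard oai_dc payload.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Endpoint and identifier copied from the "Metadata Access" line of this record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"
OAI_IDENTIFIER = "oai:www.clarin.si:11356/1291"

# Standard Dublin Core element namespace, assumed for oai_dc payloads.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_getrecord_url(identifier=OAI_IDENTIFIER, prefix="oai_dc"):
    """Return the OAI-PMH GetRecord URL for a single record."""
    query = urlencode(
        {"verb": "GetRecord", "metadataPrefix": prefix, "identifier": identifier},
        safe=":/",  # leave ':' and '/' unescaped, matching the URL in the record
    )
    return f"{OAI_ENDPOINT}?{query}"


def dublin_core_fields(xml_text):
    """Collect Dublin Core elements from a response into {name: [values]}."""
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(DC_NS) and element.text:
            name = element.tag[len(DC_NS):]
            fields.setdefault(name, []).append(element.text.strip())
    return fields


def fetch_record(url, timeout=30):
    """Download the response body as text; requires network access."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URL; call fetch_record(...) explicitly to go online.
    print(build_getrecord_url())
```

Passing the text returned by fetch_record(build_getrecord_url()) to dublin_core_fields would give a dictionary keyed by element name (for example "creator" or "language"), provided the repository still serves the record at this address.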