The GORDAN 1.0 corpus contains authentic spoken-communication data annotated for dialogue acts according to the GORDAN 1.0 dialogue act annotation scheme, which is included with the data. The corpus data were selected from existing Slovene speech corpora: GOS (http://hdl.handle.net/11356/1040), Gos Videolectures (http://hdl.handle.net/11356/1223) and BERTA. Four criteria guided the selection: public/non-public, interactive/monologic, channel and intention. The data total 1 hour of recordings (6,909 words). The selected data were annotated using the Event function of the Transcriber 1.5.1 tool. Annotators worked from multimodal data, listening to the audio or watching the video recording where available.
This resource contains only the annotated transcriptions of the corpus; the audio and video recordings are available at http://hdl.handle.net/11356/1292.