Corpus of metaphorical expressions in spoken Slovene language G-KOMET 1.0


G-KOMET is a corpus of metaphorical expressions in spoken Slovene. It extends the hand-annotated written corpus of metaphorical expressions KOMET with transcriptions of speech and conversation covering 50,000 lexical units. The corpus contains samples from the Gos corpus of spoken Slovene and includes a balanced set of transcriptions of informative, educational, entertaining, private, and public discourse. It contains three kinds of hand-annotated items: metaphor-related words, i.e. linguistic expressions that people can potentially interpret as metaphors; idioms, i.e. multi-word units in which at least one word is used metaphorically; and metonymies, i.e. expressions used to stand for something else.

The annotation scheme is based on the MIPVU metaphor identification procedure. This protocol was modified and adapted to the specifics of the Slovene language and of spoken language. The corpus is annotated for the following relations to metaphor: indirect metaphor, direct metaphor, borderline cases and metaphor signals. In addition, the corpus introduces a new ‘frame’ tag, which gives information about the concept to which an annotated expression refers. This conceptual frame makes it possible to search for figurative expressions within a specific context category (e.g. time, spatial orientation, emotions etc.). Metonymies are further categorized by their specific metonymic mapping. G-KOMET allows an objective and systematic analysis of metaphorical expressions, metaphors and metonymies in various Slovene texts.
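The frame-based search described above can be illustrated with a minimal Python sketch. It does not reflect the actual file format of G-KOMET, which is not specified here: the Annotation class, its field names, the by_frame helper, and the sample records are all hypothetical and use placeholder text instead of real corpus data. Only the category and frame labels are taken from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    """One annotated expression (hypothetical structure, not the corpus format)."""
    text: str                    # the annotated expression (placeholder text here)
    category: str                # e.g. "indirect metaphor", "direct metaphor"
    frame: Optional[str] = None  # conceptual frame, e.g. "time", "emotions"


def by_frame(annotations: List[Annotation], frame: str) -> List[Annotation]:
    """Return the annotations whose conceptual frame equals `frame`."""
    return [a for a in annotations if a.frame == frame]


# Invented sample records for illustration only.
sample = [
    Annotation("expression A", "indirect metaphor", "time"),
    Annotation("expression B", "direct metaphor", "emotions"),
    Annotation("expression C", "borderline case", "time"),
    Annotation("expression D", "metaphor signal"),
]

time_related = by_frame(sample, "time")
print([a.text for a in time_related])
```

Keeping the frame as an optional field reflects the assumption that not every annotated item needs a frame; whether that holds for G-KOMET would have to be checked against the corpus documentation.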

Metadata Access
Creator Antloga, Špela; Donaj, Gregor
Publisher Faculty of Electrical Engineering and Computer Science, University of Maribor
Publication Year 2022
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); PUB
OpenAccess true
Contact info(at)
Language Slovenian; Slovene
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline Linguistics