SemSms - Semantic Database for Skolt Sami


This SQLite database contains Skolt Sami lemmas and their frequencies in a large corpus. The lemmas are linked to each other by the syntactic relations in which they co-occurred in the corpus, and the frequency of each syntactic relation between two words is recorded. This makes it possible to see, for example, how frequently the word for "dog" has appeared as the subject of the verb for "bark".
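
The kind of lookup described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (words, relations, lemma, frequency and so on), the relation label "subj", and the toy rows with their counts are all hypothetical and do not reflect the actual SemSms schema or data. English placeholders stand in for Skolt Sami lemmas.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a HYPOTHETICAL schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE words (
            id INTEGER PRIMARY KEY,
            lemma TEXT,
            pos TEXT,
            frequency INTEGER      -- how often the lemma occurs in the corpus
        );
        CREATE TABLE relations (
            word1_id INTEGER,      -- dependent, e.g. the subject noun
            word2_id INTEGER,      -- head, e.g. the verb
            relation TEXT,         -- syntactic relation label
            frequency INTEGER      -- how often this pair occurred in this relation
        );
        """
    )
    # Invented counts, for illustration only.
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?, ?)",
        [(1, "dog", "N", 120), (2, "bark", "V", 45), (3, "cat", "N", 98)],
    )
    conn.executemany(
        "INSERT INTO relations VALUES (?, ?, ?, ?)",
        [(1, 2, "subj", 30), (3, 2, "subj", 1)],
    )
    conn.commit()
    return conn


def relation_frequency(conn, lemma1, lemma2, relation):
    """Return how often lemma1 stood in `relation` to lemma2 (0 if never)."""
    row = conn.execute(
        """
        SELECT r.frequency
        FROM relations AS r
        JOIN words AS w1 ON w1.id = r.word1_id
        JOIN words AS w2 ON w2.id = r.word2_id
        WHERE w1.lemma = ? AND w2.lemma = ? AND r.relation = ?
        """,
        (lemma1, lemma2, relation),
    ).fetchone()
    return row[0] if row else 0


if __name__ == "__main__":
    conn = build_toy_db()
    print(relation_frequency(conn, "dog", "bark", "subj"))
```

Storing the relation frequency on the link between two lemmas, rather than on either lemma, is what lets a single join answer "how often did this word appear with that word in this relation".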

This database was translated from SemFi using Giellatekno XML dictionaries.

For a detailed description of the structure, see https://www.kaggle.com/mikahama/semfi-finnish-semantics-with-syntactic-relations
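
Since the database is a SQLite file, its tables and columns can also be inspected directly after download. The following is a minimal sketch using only Python's standard library. The file name in the usage comment is a placeholder, not the actual name of the distributed file, and the function makes no assumption about which tables exist.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for the SQLite file at db_path.

    Note: sqlite3.connect creates an empty file if db_path does not exist,
    so check the path first when working with a downloaded database.
    """
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()


# Usage (placeholder file name):
# for table, columns in list_tables("semsms.db").items():
#     print(table, columns)
```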

An easy programmatic interface is provided in UralicNLP: https://github.com/mikahama/uralicNLP/wiki/Semantics-(SemFi,-SemUr)

Cite as

Hämäläinen, Mika. (2018). Extracting a Semantic Database with Syntactic Relations for Finnish to Boost Resources for Endangered Uralic Languages. In The Proceedings of Logic and Engineering of Natural Language Semantics 15 (LENLS15).

Identifier
PID http://hdl.handle.net/11304/775369f0-0a33-4ca1-8c5e-9debe29dc0a4
Metadata Access https://b2share.eudat.eu/api/oai2d?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:b2share.eudat.eu:b2rec/06f59dbf9ac3431ea06ff35e03531e4c
Provenance
Creator Hämäläinen, Mika
Publisher CLARIN
Publication Year 2020
Rights info:eu-repo/semantics/openAccess; CC BY 4.0
OpenAccess true
Representation
Discipline Linguistics