-
PolEmo 2.0 Sentiment Analysis Dataset for CoNLL
PolEmo 2.0: Corpus of Multi-Domain Consumer Reviews, evaluation data for the article presented at CoNLL. Citation: @inproceedings{kocon-etal-2019-multi, title = "Multi-Level... -
Word embeddings for Polish (KGR10, Fasttext binary) kgr10_fasttext_bin_v1
Distributional language model (binary) for Polish trained on KGR10 using Fasttext (vector dimension: 100). -
Khresmoi Summary Translation Test Data 2.0
This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech,... -
Khresmoi Query Translation Test Data 2.0
This package contains data sets for development and testing of machine translation of medical queries between Czech, English, French, German, Hungarian, Polish, Spanish and... -
Grammatik des Polnischen
This handbook contains a comprehensive description of modern standard Polish and is aimed at learners of Polish as a foreign language,... -
Replication Data for: Accusative of Negation in ‘Borderland’ Polish
These are the data for a journal article, "Accusative of Negation in 'Borderland' Polish". The abstract of the article is below. The data consist of the annotated list of... -
SimDiK
Data from the SimDiK project. -
Hamburg Corpus of Polish in Germany (HamCoPoliG)
This corpus version is deprecated in favor of version 0.2. -
EXMARaLDA Demo corpus 1.1
A selection of short audio and video recordings in various languages to be used for instruction or demonstration of the EXMARaLDA system. The EXMARaLDA Demo Corpus is a small... -
Hamburg Corpus of Polish in Germany (HamCoPoliG)
Audio recordings of German/Polish bilingual and Polish monolingual adults (16-46 years). Recordings of semi-spontaneous data (3 topics) and renarration of a picture story. The... -
Hamburg Corpus of Polish in Germany (HamCoPoliG)
Original Data: Audio recordings of German/Polish bilingual and Polish monolingual adults (16-46 years). Recordings of semi-spontaneous data (3 topics) and renarration of a... -
Community Interpreting Database Pilot Corpus (ComInDat)
Audio and video recordings of various types of community interpreted discourse (doctor-patient communication, simulated doctor-patient communication, courtroom communication) in...