MWE Kraszewski
Józef Ignacy Kraszewski -
Corpus of Russian Local Press of the Millennium Period (1996-2006)
Corpus of Russian Local Press of the Millennium Period (1996-2006): selected archives (spanning 1995/1996 to 2006) of 280 local newspapers from... -
MWE Żuławski
Jerzy Żuławski -
MWE Marrene
Waleria Marrené-Morzkowska -
expose 1990-2014
Exposé of the Ministry of Foreign Affairs (MSZ), 1990-2014 -
KPWr EVENTS (Attributes and Relations)
Documents from the Polish Corpus of Wrocław University of Technology, manually annotated with attributes for EVENT instances and with relations between EVENT instances -
MWE Krzemieniecka
Hanna Krzemieniecka -
MWE Domańska
Antonina Domańska -
Cleaned Polish Oscar corpus (32M lines)
Cleaned Polish Oscar corpus (part: 32M lines, 3.35 GB). The data was prepared with a few cleaning heuristics: - remove sentences shorter than - remove non-Polish sentences... -
Polish Spatial Texts (PST) 1.0
Texts derived from Polish travel blogs, manually annotated with spatial expressions. A spatial expression is a text fragment which describes a relative location of two or more... -
1000 Novels Corpus
Corpus of literary texts intended as a benchmark collection for text categorization. It contains 1000 novels written in Polish or translated into Polish by various authors. Each... -
MWE Sienkiewicz
Henryk Sienkiewicz -
Big data language model stemmed with BPE in ARPA format
Big data language model stemmed with BPE in ARPA format -
MWE Kossak
Zofia Kossak -
Big data language model with part of speech tags stemmed in RAW format
Big data language model with part of speech tags stemmed in RAW format -
PolEmo 1.0 + MultiEmo-Test 1.0 Multilingual Sentiment Analysis Dataset for KE...
PolEmo 1.0 + MultiEmo-Test 1.0: Corpus of Multi-Domain Consumer Reviews. Test dataset from PolEmo 1.0 was translated to eight different languages: Dutch, English, French,... -
MWE Berent
Wacław Berent -
MWE Bęczkowska
Wanda Grot-Bęczkowska -
MWE Nałkowska
Zofia Nałkowska -
Big data language model stemmed in ARPA format
Big data language model stemmed in ARPA format.