Finnish Semantic Relatedness Model
A semantic model that captures the relatedness of Finnish words as word vectors. It can be used in various tasks such as metaphor interpretation. For...
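In word-vector models like this one, the relatedness of two words is typically scored as the cosine similarity of their embedding vectors. A minimal sketch, using made-up toy vectors (real models have hundreds of dimensions; the Finnish words and values here are purely illustrative):

```python
import math

def cosine_similarity(u, v):
    # Relatedness score: cosine of the angle between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors (hypothetical values, not from the actual model).
koira = [0.9, 0.1, 0.3]   # "dog"
kissa = [0.8, 0.2, 0.4]   # "cat"
auto  = [0.1, 0.9, 0.2]   # "car"

# A relatedness model should score "dog"/"cat" higher than "dog"/"car".
print(cosine_similarity(koira, kissa) > cosine_similarity(koira, auto))  # True
```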
Lithuanian Word embeddings
GloVe-type word vectors (embeddings) for Lithuanian. The Delfi.lt corpus (~70 million words) and StanfordNLP were used for training. The training consisted of several stages: 1)...
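GloVe vectors are commonly distributed in a plain-text format, one word per line followed by its vector components separated by spaces. A minimal sketch of parsing one such line (the Lithuanian word and values below are made up for illustration; check the actual distribution format of this resource):

```python
def parse_glove_line(line):
    # GloVe text format: "word dim1 dim2 ... dimN", space-separated.
    parts = line.rstrip().split(" ")
    return parts[0], [float(x) for x in parts[1:]]

# Illustrative line with invented values.
word, vec = parse_glove_line("namas 0.12 -0.45 0.78")
print(word, len(vec))  # namas 3
```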
LitLat BERT
Trilingual BERT-like (Bidirectional Encoder Representations from Transformers) model trained on Lithuanian, Latvian, and English data. State-of-the-art tool representing...
PoLitBert_v32k_cos1_2_50k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
KGR10 FastText Polish word embeddings
Distributional language model (both textual and binary) for Polish (word embeddings) trained on the KGR10 corpus (over 4 billion words) using Fasttext with the following variants...
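fastText represents each word as the sum of vectors for its character n-grams (by default n = 3 to 6, with the word wrapped in boundary markers), which is what lets models like these produce vectors even for out-of-vocabulary Polish word forms. A minimal sketch of the n-gram extraction step only, not of the full model:

```python
def char_ngrams(word, min_n=3, max_n=6):
    # fastText wraps each word in '<' and '>' boundary markers, then
    # extracts all character n-grams of length min_n..max_n; the word's
    # vector is the sum of the vectors of these n-grams (plus the word itself).
    token = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams

# Polish "kot" ("cat"), restricted to 3- and 4-grams for readability.
print(char_ngrams("kot", min_n=3, max_n=4))
# ['<ko', 'kot', 'ot>', '<kot', 'kot>']
```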
PoLitBert_v50k_linear_50k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
PoLitBert_v32k_cos1_5_50k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
PoLitBert_v32k_tri_50k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
EWBST tests for English
Submission contains tests generated for the EWBST test of English word embedding models. Tests were created with Princeton WordNet and plWN English synsets.
PoLitBert_v32k_linear_50k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
PoLitBert_v32k_tri_125k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
PoLitBert_v32k_linear_125k - Polish RoBERTa model
Polish RoBERTa model trained on Polish Wikipedia, Polish literature, and Oscar.
TimeAssign
TimeAssign is a program that recognizes temporal expressions and assigns TimeML labels to words in Polish text using a Bi-LSTM-based neural network and wordform embeddings.
Word embeddings for Polish (KGR10, Fasttext binary) kgr10_fasttext_bin_v1
Distributional language model (binary) for Polish trained on KGR10 using Fasttext (vector dimension: 100).
Word embeddings CLARIN.SI-embed.sr 2.0
CLARIN.SI-embed.sr contains word embeddings induced from the srWaC and MaCoCu-sr web corpora. The embeddings are based on the skip-gram model of fastText trained on...
ELMo embeddings models for seven languages
ELMo language model (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on large monolingual corpora for 7 languages: Slovenian, Croatian,...
Word embeddings CLARIN.SI-embed.sl 2.0
CLARIN.SI-embed.sl contains word embeddings induced from a large collection of Slovene texts composed of existing corpora of Slovene, e.g. GigaFida, Janes, KAS, slWaC, MaCoCu-sl,...
Word embeddings CLARIN.SI-embed.bg 1.0
CLARIN.SI-embed.bg contains word embeddings for Bulgarian induced from the MaCoCu-bg web crawl corpus (http://hdl.handle.net/11356/1515). The embeddings are based on the...
Word embeddings CLARIN.SI-embed.hr 1.0
CLARIN.SI-embed.hr contains word embeddings induced from a large collection of Croatian texts composed of the Croatian web corpus hrWaC and a 400-million-token collection...
CroSloEngual BERT 1.1
Trilingual BERT (Bidirectional Encoder Representations from Transformers) model trained on Croatian, Slovenian, and English data. State-of-the-art tool representing...