-
ENIAMtoolkit (2017-03-06)
ENIAMtoolkit is a collection of libraries that: perform tokenization, lemmatization, and part-of-speech tagging; detect MWEs and abbreviations; split text into sentences; LCG... -
The CLASSLA-Stanza model for UD dependency parsing of standard Bulgarian 2.1
The model for UD dependency parsing of standard Bulgarian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the UD-parsed portion of... -
Training corpus SETimes.SR 1.0
The SETimes.SR training corpus contains 86 726 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation, syntactic... -
Training corpus ssj500k 2.3
The ssj500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation.... -
Trankit model for SST 2.15
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the SST treebank... -
The Trankit model for linguistic processing of written and spoken Slovenian 1.2
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the concatenation... -
The CLASSLA-Stanza model for UD dependency parsing of standard Croatian 2.1
The model for UD dependency parsing of standard Croatian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the UD-parsed portion of the... -
Trankit model for SST 2.15 1.1
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the SST treebank... -
The CLASSLA-Stanza model for JOS dependency parsing of standard Slovenian 2.0
This model for JOS dependency parsing of standard Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SUK training corpus... -
Trankit model for linguistic processing of spoken Slovenian
This is a retrained Slovenian spoken-language model for the Trankit v1.1.1 library (https://pypi.org/project/trankit/). It can predict sentence segmentation, tokenization,... -
The Trankit model for linguistic processing of standard written Slovenian 1.1
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the reference SSJ... -
The CLASSLA-Stanza model for UD dependency parsing of standard Slovenian 2.0
This model for UD dependency parsing of standard Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SUK training corpus... -
Training corpus SUK 1.0
The SUK training corpus contains about 1 million tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation, with... -
The CLASSLA-StanfordNLP model for UD dependency parsing of standard Croatian
The model for UD dependency parsing of standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the... -
The Trankit model for linguistic processing of standard Slovenian
This is a retrained standard Slovenian model for the Trankit v1.1.1 library (https://pypi.org/project/trankit/). It can predict sentence segmentation, tokenization,... -
The CLASSLA-Stanza model for UD dependency parsing of standard Serbian 2.1
The model for UD dependency parsing of standard Serbian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SETimes.SR training... -
Training corpus hr500k 1.0
The hr500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation and... -
Training corpus SUK 1.1
The SUK training corpus contains about 1 million tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation, with... -
ReLDI tag+lemma+parse web service for WebLicht
WebLicht (https://weblicht.sfs.uni-tuebingen.de/) registry entry for a web service comprising tokenisation, PoS tagging, lemmatisation, and dependency parsing. Tool source files... -
The CLASSLA-StanfordNLP model for UD dependency parsing of standard Slovenian
The model for UD dependency parsing of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the...