-
The CLASSLA-StanfordNLP model for UD dependency parsing of standard Slovenian
The model for UD dependency parsing of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the... -
The CLASSLA-StanfordNLP model for UD dependency parsing of standard Bulgarian...
The model for UD dependency parsing of standard Bulgarian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the... -
Training corpus ssj500k 1.4
The ssj500k training corpus contains 500,000 words, manually annotated on the levels of tokenization, sentence segmentation, morphosyntactic tagging, lemmatisation, named... -
The CLASSLA-StanfordNLP model for JOS dependency parsing of standard Slovenia...
The model for JOS dependency parsing of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the... -
The Trankit model for linguistic processing of spoken and written Slovenian 1.1
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the concatenation... -
Training corpus ssj500k 2.2
The ssj500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation.... -
Training corpus ssj500k 2.1
The ssj500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation.... -
Training corpus ssj500k 2.0
The ssj500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation.... -
The CLASSLA-StanfordNLP model for UD dependency parsing of standard Serbian
The model for UD dependency parsing of standard Serbian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the... -
Serbian linguistic training corpus SETimes.SR 2.0
The SETimes.SR training corpus contains around 100,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation,... -
Training corpus ssj500k 1.3
The ssj500k training corpus is based on two training corpora built within the JOS project (https://nl.ijs.si/jos/). It contains the jos100k corpus and additional material from... -
Croatian linguistic training corpus hr500k 2.0
The hr500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation and... -
MSTperl delexicalized parser transfer scripts and configuration files
This is a set of MSTperl parser configuration files and scripts for delexicalized parser transfer. They were used in the work reported in arXiv:1506.04897... -
MSTperl parser
MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser)... -
CoNLL 2017 and 2018 Shared Task Blind and Preprocessed Test Data
CoNLL 2017 and 2018 shared tasks: Multilingual Parsing from Raw Text to Universal Dependencies This package contains the test data in the form in which they ware presented to... -
Open SDP 1.2
The original SDP 2014 and 2015 data collections were made available under task-specific ‘evaluation’ licenses to registered SemEval participants. In mid-2016, all original data... -
MSTperl parser (2015-05-19)
MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser)... -
Slavic Forest, Norwegian Wood (scripts)
Tools and scripts used to create the cross-lingual parsing models submitted to VarDial 2017 shared task (https://bitbucket.org/hy-crossNLP/vardial2017), as described in the... -
Slavic Forest, Norwegian Wood (models)
Trained models for UDPipe used to produce our final submission to the Vardial 2017 CLP shared task (https://bitbucket.org/hy-crossNLP/vardial2017). The SK model was trained on... -
Open SDP
The original SDP 2014 and 2015 data collections were made available under task-specific ‘evaluation’ licenses to registered SemEval participants. In mid-2016, all original data...