CLARIN - Repositories

exerc corp

Creating a corpus for the DSpace workshops

KPWr annotation guidelines - named entities

Named entity annotation guidelines describing the process of manually annotating documents in the Polish Corpus of Wrocław University of Technology (KPWr)

Liner2.5 rc3

A framework for multitask sequence labeling designed for natural language processing tasks.

Verb in plWordNet 4.0 (Guidelines)

The PDF document contains the guidelines for describing verbs in the Polish part of plWordNet.

Blogs_2018

Texts from book blogs

Acoustic Data Building Toolset

This folder contains data and software tools (in Python) that can be used in experiments with phoneme recognition in speech samples recorded in Polish. Acoustic data used here...

MWE Kuncewiczowa

Maria Kuncewiczowa

Toki

Toki is a configurable tokeniser, i.e. a module for segmentation of running text into tokens (word-like units) and sentences.

Novels_Dabrowska_Dzikie_ziele

Text of Maria Dąbrowska's "Dzikie ziele" ("Wild Herb") from the collection "Pisma wybrane" ("Selected Writings"): stories, excerpts, dramas, songs for children.

CEN

The Corpus of Economic News (CEN) contains 797 documents from Polish Wikipedia annotated with 65 categories of proper names in CCL format...

Big data language model with part of speech tags stemmed in ARPA format

TaKIPI

TaKIPI is a tagger for Polish, i.e. a tool that assigns morpho-syntactic tags to words in a text. The tagger assumes a morpho-syntactic description of IPI PAN...

Poliqarp for DjVu - a demonstration (open Virtual Appliance)

A server for DjVu corpora

WordnetLoom

WordnetLoom is a wordnet editor application built to support the construction of the largest Polish wordnet, plWordNet. WordnetLoom provides two means of...

NELexicon2

NELexicon2 is an extended version of the gazetteer of proper names, containing over 2.3 million unique strings. NELexicon has been enriched with the following resources: diminutives...

MWE Korzeniowski

Józef Korzeniowski

DG-POLFIE: POLFIE and Malt-based syntactic parser

DG-POLFIE is a prototype parser that tries to merge parse fragments generated by POLFIE using the Polish Dependency Parser. DG-POLFIE aims to improve the coverage of the POLFIE...

Elita władzy

The power elite in the Poznań and Kalisz voivodeships under Sigismund III

MACA

Utilities are simple programs referencing the corresponding API functions, hence similar functionality may be easily obtained by using the libraries.

Periphraser

Periphraser is a tool for storing and presenting a knowledge base of conventionalized periphrastic nominal expressions (i.e. phrases headed by a noun) together with their...

4,412 datasets found