4,412 datasets found

  • exerc corp

    Building a corpus for the DSpace workshops
  • KPWr annotation guidelines - named entities

    Named entity annotation guidelines describing the process of manually annotating documents in the Polish Corpus of Wrocław University of Technology (KPWr)
  • Liner2.5 rc3

    A framework for multitask sequence labeling designed for natural language processing tasks.
  • Verb in plWordNet 4.0 (Guidelines)

    The PDF document contains the guidelines for describing verbs in the Polish part of plWordNet.
  • Blogs_2018

    Texts from book blogs
  • Acoustic Data Building Toolset

    This folder contains data and software tools (in Python) that can be used in experiments with phoneme recognition in speech samples recorded in Polish. Acoustic data used here...
  • MWE Kuncewiczowa

    Maria Kuncewiczowa
  • Toki

    Toki is a configurable tokeniser, i.e. a module for segmentation of running text into tokens (word-like units) and sentences.
  • Novels_Dabrowska_Dzikie_ziele

    Text of Maria Dąbrowska's "Dzikie ziele" ("Wild Herb") from the collection "Selected Writings: Stories, Excerpts, Dramas, Songs for Children".
  • CEN

    The Corpus of Economic News (CEN) contains 797 documents from Polish Wikipedia annotated with 65 categories of proper names in ccl format...
  • Big data language model with part of speech tags stemmed in ARPA format

    Big data language model with part of speech tags stemmed in ARPA format
  • TaKIPI

    TaKIPI is a tagger for Polish, i.e. a tool that assigns morpho-syntactic tags to words in a text. The tagger assumes a morpho-syntactic description of IPI PAN...
  • WordnetLoom

    WordnetLoom is a wordnet editor application built for the construction of the largest Polish wordnet, plWordNet. WordnetLoom provides two means of...
  • NELexicon2

    NELexicon2 is an extended version of a gazetteer of proper names, containing over 2.3 million unique strings. NELexicon has been enriched with the following resources: diminutives...
  • MWE Korzeniowski

    Józef Korzeniowski
  • DG-POLFIE: POLFIE and Malt-based syntactic parser

    DG-POLFIE is a prototype parser that tries to merge parse fragments generated by POLFIE using the Polish Dependency Parser. DG-POLFIE aims to improve the coverage of the POLFIE...
  • Elita władzy

    The power elite in the Poznań and Kalisz voivodeships under Sigismund III
  • MACA

    Utilities are simple programs referencing the corresponding API functions, hence similar functionality may be easily obtained by using the libraries.
  • Periphraser

    Periphraser is a tool for storing and presenting a knowledge base of conventionalized periphrastic nominal expressions (i.e. phrases headed by a noun) together with their...