Dataset - B2FIND

Big data language model stemmed with BPE in ARPA format

POLFIE Bank, an LFG structure bank of Polish: pol-nkjp1m-pargram-dev

The pol-nkjp1m-pargram-dev structure bank was created using POLFIE: an LFG grammar of Polish. This structure bank contains sentences from the NKJP1M subcorpus of NKJP which were...

Big data language model with part of speech tags stemmed in RAW format

Polish-Lithuanian Parallel Corpus "2"

New upgraded version of the Polish-Lithuanian Parallel Corpus (http://hdl.handle.net/11321/309) with extra files and features (including General, Medical, Technical, Legal,...

Big Data language model - subword - SYLLABED - RAW

Big data language model based on syllables in RAW format

POLFIE Bank, an LFG structure bank of Polish: pol-składnica-pargram

The pol-składnica-pargram structure bank was created using POLFIE: an LFG grammar of Polish. This structure bank contains FULL type sentences from Składnica, which were in turn...

Big data language model stemmed in ARPA format

Big data language model stemmed in ARPA format.

Polish Parliamentary Corpus

The Polish Parliamentary Corpus (PPC) is a large collection of linguistically analysed documents from the proceedings of the Polish Parliament (Sejm and Senate). The corpus files are...

Big Data language model - subword - BPE - RAW

Big data language model based on subword units obtained with byte pair encoding (BPE), in RAW format

Dependency parsing models for Polish

PDB-based parsing models are trained on the current version of the Polish Dependency Bank with the publicly available parsing systems MaltParser, MateParser, and UDPipe.

Polish-Bulgarian Parallel Corpus

Big data language model with part of speech tags stemmed in ARPA format

Chunker WS

Chunker-WS provides shallow parsing of Polish. The parser may be run against plain text (input format: text, then it runs WCRFT for tagging) or already tagged input (other input...

MorphoDiTa-based tagger for Polish language

MorphoDiTa-based tagger for Polish language. It is a tool for morphosyntactic unification for the Polish language, according to the NKJP tagset.

Krokodyl: A hybrid dependency parser of Polish

Krokodyl is an experimental hybrid deep dependency parser of Polish. Krokodyl has been developed at the Institute of Computer Science, Polish Academy of Sciences (IPI PAN)...

Smyrna

Smyrna is a tool for building and searching your own Polish corpora from HTML files.

POLFIE: an LFG grammar of Polish

POLFIE is an LFG grammar of Polish implemented in the XLE system (Xerox Linguistic Environment). POLFIE has been developed at the Institute of Computer Science, Polish Academy...

Polish-Russian Parallel Corpus

ChunkRel WS

ChunkRel-WS is a prototype service for recognition of three syntactic relations between chunks. The service may be run against plain text (input format: text), then the...

Polish-Lithuanian Parallel Corpus

52 datasets found