Dataset - B2FIND

PELCRA PARL corpus

The corpus comprises 50 sampled recordings (12 hours) and manual transcriptions (ca. 101 000 word tokens) of parliamentary data.
Mowa Wrocławia lat 80-tych - corpus

The corpus comprises spoken data collected in Wrocław in the 1980s. The data were retrieved from tapes and digitised.
PELCRA EMO corpus

The corpus comprises 30 focused structured interviews (17 hours and ca. 200000 word tokens) centred on the topic of emotions. The corpus has bibliographic, morphosyntactic and...
PELCRA LUZ corpus

The corpus comprises 25 semi-scripted interviews (15 hours, ca. 165000 word tokens) with speakers of Polish on a range of topics.

4 datasets found