The Danish Parliament Corpus 2009 - 2017, v2, w. subject annotation

The Danish Parliament Corpus 2009 - 2017, v2, w. subject area annotation contains transcripts of parliamentary speeches from the Danish Parliament, Folketinget, sessions 20091 to 20161 (6/10 2009 – 7/9 2017), downloaded from the Danish Parliament's FTP server: ftp://oda.ft.dk. The corpus has extensive metadata about the MPs (name, gender, age, role, title, party affiliation), the timing of the speeches, and a subject area annotation for each agenda item. The information on age and gender was added from external sources, and the subject area annotation was added semi-automatically to each speech on the basis of manual annotation of the agenda titles. The corpus is organized into tab-separated txt files, one file per meeting, with one zip file per session.

The Danish Parliament Corpus 2009 - 2017 follows the license for Open Data, which states the following: "The Danish Parliament grants a world-wide, free, non-exclusive and otherwise unrestricted right of use of the data in the Danish Parliament's open data catalogue. The data can be freely:
• copied, distributed and published,
• adapted and combined with other material,
• exploited commercially and non-commercially."

In accordance with the copyright act, the speeches can be distributed without the consent of the speaker, but only in a way that clearly states the author/speaker of each text/speech. Furthermore, the Danish Parliament must be acknowledged as the source.
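The file layout described above (one zip file per session, one tab-separated txt file per meeting) can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function name read_session_zip is hypothetical, UTF-8 encoding is assumed, and no column names or header row are assumed because the description does not specify them.

```python
import csv
import io
import zipfile


def read_session_zip(zip_path):
    """Yield (file_name, rows) for every .txt file in one session archive.

    Each row is a list of strings, one per tab-separated field.
    Assumption: the files are UTF-8 encoded.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".txt"):
                continue  # skip anything that is not a meeting file
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # QUOTE_NONE keeps quotation marks in the speeches as plain text
                reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
                yield name, list(reader)
```

A call such as `for name, rows in read_session_zip("20091.zip"):` (the archive name here is only an illustration) would then give the rows of each meeting file in turn. Whether the first row is a header has to be checked against the downloaded files.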

Identifier
PID http://hdl.handle.net/20.500.12115/44
Related Identifier http://lrec-conf.org/workshops/lrec2018/W2/pdf/3_W2.pdf
Related Identifier http://ceur-ws.org/Vol-2364/15_paper.pdf
Metadata Access http://repository.clarin.dk/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:repository.clarin.dk:20.500.12115/44
Provenance
Creator Hansen, Dorte Haltrup; Navarretta, Costanza
Publisher Centre for Language Technology, NorS, University of Copenhagen
Publication Year 2021
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); PUB; http://creativecommons.org/licenses/by/4.0/
OpenAccess true
Contact info(at)clarin.dk
Representation
Language Danish
Resource Type corpus
Format text/plain; application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; application/zip; text/plain; charset=utf-8; downloadable_files_count: 13
Discipline Linguistics