-
German causal language annotations and lexicon (verbs, nouns, prepositions) (DE)
Annotations of causal verbs, nouns and prepositions in context, plus a lexicon file for causal verbs, nouns and prepositions. -
Negative Sampling for Learning Knowledge Graph Embeddings
Reimplementation of four KG factorization methods and six negative sampling methods. Abstract: Knowledge graphs are large, useful, but incomplete knowledge repositories. They... -
Topological Field Labeler for German
This resource contains the code of the topological labeler used in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited". For this tool, labeling... -
Genre-sensitive Neural Situation Entity classifier (DE, EN)
This is a classifier for situation entity types as described in Becker et al., 2017. These clause types depend on a combination of syntactic-semantic and contextual features. We... -
Pre-trained POS tagging models for German social media
Pre-trained POS tagging models for the HunPos tagger (Halácsy et al. 2007), the biLSTM-char-CRF tagger (Reimers & Gurevych 2017), and Online-Flors (Yin et al. 2015).... -
ACL word segmentation correction
The data in this collection consists of two parallel directories, one ("raw") containing the raw text of 18850 articles from the ACL 2013/02 collection, the other... -
Encoder-Decoder Model for Semantic Role Labeling
Abstract (Daza & Frank 2019): We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations... -
AMR parse quality prediction [Source Code]
Accuracy prediction for AMR parsing predicts 33 accuracy metrics for a given sentence and its (automatic) AMR parse. Abstract (Opitz and Frank, 2019): Semantic proto-role... -
tweeDe
A German UD Twitter treebank, with >12,000 tokens from 519 tweets, annotated in the Universal Dependencies framework -
Tool for Extracting PP Attachment Disambiguation Dataset
This resource contains code to extract a PP attachment disambiguation dataset as described in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment... -
Affixoid Dataset (DE)
The dataset contains the manual annotations for the COLING 2018 submission "Distinguishing affixoid formations from compounds" by Josef Ruppenhofer, Michael Wiegand, Rebecca... -
3D Micro-Mapping of Subsidence Stations [Source Code and Data]
This dataset comprises the source code to reproduce the 3D micro-mapping tool for plane adjustment at subsidence stations. In this project, users adjust a plane (height and... -
Neural Rerankers for Dependency Parsing
This resource contains code for different types of neural rerankers (RCNN, RCNN-shared and GCN) from the paper: Do and Rehbein (2020). "Neural Reranking for Dependency Parsing:... -
Real-World PP Attachment Disambiguation Dataset
This resource contains a German dataset for real-world PP attachment disambiguation. The creation, analysis and experimental results of the dataset are described in the paper: Do... -
Lexicon of Abusive Words (EN)
This gold standard contains a bootstrapped lexicon of abusive words. The lexicon comprises a large set of English negative polar expressions annotated as either abusive or not. -
Sentiment Compound Data (DE)
This dataset contains gold standards that are required for building a classifier that automatically extracts opinion (noun) compounds. -
A harmonised testsuite for social media POS tagging (DE)
A harmonised POS testsuite of web data, CMC and Twitter microtext, with word forms and STTS POS tags (+ some additional CMC-specific tags). UD POS tags have been automatically... -
Cataloging Cultural Objects (CCO) – The CCO Commons examples in VRA Core 4 XML
“Cataloging Cultural Objects - a Guide to Describing Cultural Works and Their Images” (CCO) provides a data content standard for catalogers of cultural heritage. It is a... -
DeModify
deModify consists of 3631 instances, each with three annotations obtained through CrowdFlower. An instance is a short story in which a modifier is annotated with respect to its... -
The MSC Data Set
From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015)...