-
Sense Similarity
Sense and Similarity: A Study of Sense-level Similarity Measures -
Domain-specific context-sensitive semantic verb relations
This is a data set of semantic verb relations in English from the domain of everyday educational topics. The data set consists of 12,403 pairs of propositions which have been... -
CLEVR-Hans7
A compositionally complex data set for investigating confounders and explainability. -
Hierarchy Identification
This page lists the data sets and experiments presented in the paper "Hierarchy Identification for Automatically Generating Table-of-Contents". -
Quality Flaw Prediction in Wikipedia
Dataset to extract reliable training instances from Wikipedia -
German Relatedness Datasets
The datasets on this page were obtained by asking human subjects to assign a similarity or relatedness judgment to a number of German word pairs. The datasets have been used to... -
Difficulty Prediction for Language Tests
This collection includes various resources for predicting the difficulty of language proficiency tests. -
Context-Aware Representations for Knowledge Base Relation Extraction
We provide a subcorpus of Wikipedia that was annotated with Wikidata relations using a distant supervision procedure. The corpus contains two types of annotations: entities and... -
Insufficiently Supported Arguments in Argumentative Essays
This corpus includes 1029 arguments taken from argumentative essays. Each argument is annotated as “insufficient” if its premises do not provide enough evidence for accepting or... -
Supplementary materials: Mining Legal Arguments in Court Decisions
Pre-trained transformer models; accompanying materials to the paper and its GitHub repository -
CLEVR-Hans3
A compositionally complex data set for investigating confounders and explainability. -
Availability Test
How do you set up e-mail request access? This dataset is intended to answer that question. -
Relation Classification
Semantic relatedness data -
Darmstadt Service Review Corpus
The Darmstadt Service Review Corpus consists of consumer reviews annotated with opinion-related information at the sentence and expression levels. -
Multilingual UKP Sentential Argument Mining Corpus
This dataset is an extension of the original UKP Sentential Argument Mining Corpus which includes 25,492 sentences over eight controversial topics. Each sentence was annotated... -
From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains
This dataset has no description. -
Email Disentanglement
Enron Threads Corpus and Enron Crowdsourced Dataset -
YAGS "Yahoo! Annotated Gold Standard"
This folder contains the data files and scripts to compile the YAGS "Yahoo! Annotated Gold Standard" annotated with FrameNet frames and roles as published. To compile the... -
Yeast Cells in Microstructures Dataset
Yeast cell instance-segmentation dataset of the paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC, 2023]. https://arxiv.org/abs/2304.07597 We... -
Simple–complex Sentence Pairs
The simple–complex sentence pair dataset created in the paper.