-
Manometer Validation Set
The validation set used in the paper Industrial Manometer Detection and Reading for Autonomous Inspection Robots presented at ECMR 2021. The dataset contains (augmented) images... -
Whittle Networks datasets
Datasets for the paper "Whittle Networks: A Deep Likelihood Model for Time Series." -
DIP-SumEval: A Data Set of Human Summary Evaluations
This repository contains the summaries and evaluations from the paper 'A Dataset for the Analysis of Text Quality Dimensions in Summarization Evaluation' presented at LREC 2020.... -
Medical Concept Embeddings via Labeled Background Corpora
This entry contains the resources used in and resulting from Eneldo Loza Mencía, Gerard de Melo and Jinseok Nam, Medical Concept Embeddings via Labeled Background Corpora, in:... -
Concept Map Summaries
This benchmark corpus for concept-map-based multi-document summarization was introduced in the following publication: Tobias Falke and Iryna Gurevych. Bringing Structure into... -
Subjective Verbs Lexicons
The English lexicons on this page originate from the works below: * Biber, D., Johansson, S., Leech, G., Conrad, S., and Finegan, E. (1999). Longman Grammar of Spoken and... -
Actor-critic Instance Segmentation
Most approaches to visual scene analysis have emphasised parallel processing of the image elements. However, one area in which the sequential nature of vision is apparent, is... -
UKP Sentential Argument Mining Corpus
The UKP Sentential Argument Mining Corpus includes 25,492 sentences over eight controversial topics. Each sentence was annotated via crowdsourcing as either a supporting... -
Wikipedia Text Segmentation
For corpus generation, we extracted top-level sections of featured articles and concatenated their textual contents to a pure-text corpus file. The content of a section is... -
Spelling Difficulty Prediction
Extracted spelling errors from various corpora. -
BWS Argument Similarity Corpus
The BWS Argument Similarity Corpus includes 3,400 sentence pairs over 8 controversial topics, with 425 argument pairs per topic. Each argument-pair was annotated via... -
Re-rating Studies
A Reflective View on Text Similarity -
Predictive Whittle Networks for Time Series
Dataset for the paper "Predictive Whittle Networks for Time Series". Use with code at: https://github.com/ml-research/PWN -
OSS-Net trained models
Trained OSS-Net models of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data". -
Wikipedia Article Feedback
The corpus lists article IDs of biographies of living and dead people, rated as above average or below average along four categories (trustworthy, objective, well written,... -
Text Reuse Annotations
Text Reuse Detection Using a Composition of Text Similarity Measures -
German-English Modality Verbclasses
This is a semantic classification of more than 600 German lexical verbs and their English translation introduced in the paper: Judith Eckle-Kohler. Verbs Taking Clausal and... -
Turk Bootstrap Word Sense Inventory (TWSI) 2.0
Turk Bootstrap Word Sense Inventory (TWSI) 2.0. This lexical resource, created by a crowdsourcing process using Amazon Mechanical Turk (http://www.mturk.com), encompasses a... -
EUR-Lex Dataset
The EUR-Lex text collection is a collection of documents about European Union law. It contains many different types of documents, including treaties, legislation, case-law and... -
Wikipedia Edit Category Corpus
For the corpus itself, please refer to/cite: Johannes Daxenberger and Iryna Gurevych (2012). "A Corpus-Based Study of Edit Categories in Featured and Non-Featured Wikipedia...