-
Constrained C-Test Generation via Mixed-Integer Programming (Supplementary Ma...
This work proposes a novel method to generate C-Tests, a variant of cloze tests (gap-filling exercises) in which only the last part of a word is turned into a gap. In... -
Dataset for color terms, 2012
This dataset comprises adjective-noun phrases with color terms. -
AMR parse quality prediction [Source Code]
Accuracy prediction for AMR parsing predicts 33 accuracy metrics for a given sentence and its (automatic) AMR parse. Abstract (Opitz and Frank, 2019): Semantic proto-role... -
NLP in Diagnostic Texts from Nephropathology [Research Data]
This data set contains all annotated topic word tables from the work "NLP in Diagnostic Texts from Nephropathology", as well as all pre-processed and tf-idf-vectorized text... -
WikiEvents Dataset from January 2020 to December 2022
WikiEvents is a knowledge-graph-based dataset for NLP and event-related machine learning tasks. This dataset includes RDF data in JSON-LD about events between January 2020 and... -
Propositional Claim Detection (NLP Dataset)
This is a natural language processing (NLP) training dataset. Models trained on this data are intended to classify claims that... -
Combining text and vision in compound semantics: Towards a cognitively plausi...
In the current state-of-the-art distributional semantics model of the meaning of noun-noun compounds (such as chainsaw, butterfly, home phone), CAOSS (Marelli... -
Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals
This is the resource for the dataset and models released as a part of our EMNLP 2023 paper "Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals" -
Data Linking Workshop 2023: Computer Vision and Natural Language Processing –...
The humanities meet computer science to create new synergies using computer vision and natural language processing. Aim & Scope: Historians are increasingly using... -
Annotation Curricula to Implicitly Train Non-Expert Annotators
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning,... -
Wikipedia Discussion Corpora
Various annotated Wikipedia resources -
Opinion Mining Corpus on German Tweets about the Covid-19 Pandemic
The UKP Covid-19 Twitter Corpus includes 2,785 tweets annotated by student annotators and 200 expert-annotated tweets in German. Each tweet was annotated as either a supporting... -
From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains
This dataset has no description
-
ChunkRel WS
ChunkRel-WS is a prototype service for recognition of three syntactic relations between chunks. The service may be run against plain text (input format: text), then the... -
Cinderella - tool for Clustering and Classifications of Texts in Polish
A system for clustering and classifying texts in Polish. Source code. -
WebStylo
Web-based, open stylometry system based on Multilevel Text Analysis. Runs cluto and stylo (R system) clustering methods. Based on Natural Language Processing Workflow... -
Chunker WS
Chunker-WS provides shallow parsing of Polish. The parser may be run against plain text (input format: text, then it runs WCRFT for tagging) or already tagged input (other input... -
Movie Title Puns
Context: The data is based on the following paper on pun generation: Hämäläinen, M., & Alnajjar, K. (2019). Modelling the Socialization of Creative Agents in a... -
CorpusExplorer
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks...