33 datasets found

Keywords: Twitter

  • CMC shortening corpus Janes-Kratko 1.0

    Janes-Kratko is a corpus of Slovene tweets manually annotated with shortening phenomena according to the supplied typology covering different types of spelling, lexical and...
  • xLiMe Twitter Corpus XTC 1.0.1

    The xLiMe Twitter Corpus contains tweets in German, Italian and Spanish manually annotated with part-of-speech, named entities, and message-level sentiment polarity. In total,...
  • Slovenian Twitter dataset 2018-2020 1.0

    The dataset covers Slovenian-language Twitter activity from 2018 to 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication...
  • The Twitter user dataset for discriminating between Bosnian, Croatian, Monten...

    The Twitter-HBS dataset consists of Twitter users, their tweets, and the label of their predominantly used language - Bosnian, Croatian, Montenegrin, or Serbian. Among the...
  • Slovenian Twitter hate speech dataset IMSyPP-sl

    A hand-labeled training (50,000 tweets labeled twice) and evaluation set (10,000 tweets labeled twice) for hate speech on Slovenian Twitter. The data files contain tweet IDs,...
  • Dataset of European Parliament roll-call votes and Twitter activities MEP 1.0

    The resource consists of two datasets related to Members of the 8th European Parliament (MEPs). The first one is a dataset of 2,535 roll-call votes of MEPs until 2016-03-01. The...
  • Brexit stance annotated tweets

    The corpus contains over 4.5 million tweets (tweet IDs) automatically labeled by a machine learning program with stance regarding Brexit: Positive (supporting Brexit), Negative...
  • Twitter sentiment for 15 European languages

    The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora for the corresponding 15 European languages....
  • Tweets about impact investing

    The corpus contains 668,529 tweets (tweet IDs) relevant to "impact investing", accompanied by sentiment labels given by an automated sentiment classifier. Impact investing...
  • Dictionary of Twitterese Janes-Dict 1.0

    The Dictionary of Twitterese 1.0 is the first attempt at a lexicographic description of non-standard Slovene as found on Twitter. Version 1.0 contains 1,002 entries, of which...
  • CorpusExplorer

    Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks...
  • Opinio

    A corpus of Twitter data drawn from two sources: French-speaking Belgian political accounts and a sample of accounts from the French-speaking Belgian population....
  • Under His Thumb. The Effect of President Donald Trump's Twitter Messages on t...

    Does President Trump’s use of Twitter affect financial markets? The president frequently mentions companies in his tweets and, as such, tries to gain leverage over their...