The dataset covers Slovenian-language Twitter activity from 2018 to 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication dates, and automatically assigned hate speech labels (acceptable, inappropriate, offensive, violent) with
The dataset is the basis for the following two papers:
- "Retweet communities reveal the main source of hate speech" -
- "Community evolution in retweet networks" -