The DIPROMATS 2024 dataset supports the analysis of propaganda on social media, covering both the techniques used and the underlying narratives. It contains manually annotated tweets in Spanish and English, labeled at three levels of analysis: binary detection of propaganda, classification into three groups of propaganda techniques, and fine-grained categorization into seven specific techniques. It also includes a multi-class, multi-label classification task for identifying propaganda narratives associated with international actors.
Language(s)
Spanish
English
Dataset description link
Year
2024
Annotations
Each instance is assigned three labels: (i) propaganda or non-propaganda, (ii) group of propaganda techniques used, and (iii) specific techniques used.
Format
JSON
Data access
Registration form
Publication
Moral et al. (2024). Overview of DIPROMATS 2024: Detection, Characterization and Tracking of Propaganda in Messages from Diplomats and Authorities of World Powers. Procesamiento del Lenguaje Natural, 73: 347-358.
NLP Topic
Number of units
9591
Type of units
Tweets
Training set size
6714
Test set size
2877
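The figures above can be sanity-checked with a short sketch. It confirms that the training and test sizes add up to the total number of units and reports the resulting split shares. It also illustrates one possible in-memory layout for the three annotation levels. The record layout (`AnnotatedTweet` and its field names) and the consistency rule in `is_consistent` are assumptions made for illustration only; the actual JSON schema is not described in this card and may differ.

```python
from dataclasses import dataclass, field

# Figures taken from the dataset card.
TOTAL_UNITS = 9591
TRAIN_SIZE = 6714
TEST_SIZE = 2877


def split_fractions(train: int, test: int) -> tuple[float, float]:
    """Return the train and test shares of the combined set."""
    total = train + test
    return train / total, test / total


@dataclass
class AnnotatedTweet:
    """Hypothetical record layout; the real JSON field names may differ."""
    text: str
    is_propaganda: bool  # level 1: binary label
    technique_groups: list[str] = field(default_factory=list)  # level 2
    techniques: list[str] = field(default_factory=list)  # level 3


def is_consistent(tweet: AnnotatedTweet) -> bool:
    """Assumed rule: only propaganda tweets carry technique labels."""
    if not tweet.is_propaganda:
        return not tweet.technique_groups and not tweet.techniques
    return True


if __name__ == "__main__":
    # The two partitions should account for every unit.
    assert TRAIN_SIZE + TEST_SIZE == TOTAL_UNITS
    train_share, test_share = split_fractions(TRAIN_SIZE, TEST_SIZE)
    print(f"train {train_share:.1%}, test {test_share:.1%}")
    # prints "train 70.0%, test 30.0%"
```

The split works out to roughly 70/30. The label strings passed to `AnnotatedTweet` would come from the dataset's own label inventory, which is not reproduced here.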