EXIST-2023-ES

The EXIST 2023 Spanish corpus is a collection of tweets labelled with information related to sexism: whether the tweet is sexist, the author's intention in writing it, and the type of sexism it expresses.

Language(s)
Spanish
English
Year
2023
Domain
Social
Text types
Tweets
Annotations
binary label indicating whether a tweet expresses sexism, multiclass labels for the type of sexism and the intention of the author
Format
json
Data access
Registration

Publication
Plaza, L. et al. (2023). Overview of EXIST 2023 – Learning with Disagreement for Sexism Identification and Characterization. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, vol 14163. Springer, Cham. https://doi.org/10.1007/978-3-031-42448-9_23
NLP Topic
Number of units
4653
Type of units
Tweets
Training set size
3194
Test set size
969
Development set size
490
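
As a quick consistency check on the figures above, the three splits should add up to the total number of units. The following is a minimal sketch in Python that uses only the sizes listed on this page; the names `SPLIT_SIZES`, `TOTAL_UNITS` and `split_shares` are illustrative assumptions and are not part of the dataset or its distribution files.

```python
# Split sizes copied from this page; the names are illustrative only.
SPLIT_SIZES = {"train": 3194, "dev": 490, "test": 969}
TOTAL_UNITS = 4653


def split_shares(sizes):
    """Return each split's share of all units as a fraction between 0 and 1."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


if __name__ == "__main__":
    # The listed splits should account for every unit in the corpus.
    assert sum(SPLIT_SIZES.values()) == TOTAL_UNITS
    for name, share in split_shares(SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")
```

The same check can be repeated against the downloaded files once access has been granted, by counting the entries in each split.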

If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, and attach a copy of the article if it is not openly available.