Homo-MEX | Portal ODESIA

The HOMO-MEX dataset is designed for the detection and classification of LGBT+phobic hate speech in Mexican Spanish. It is organized around three tasks: detecting LGBT+phobia in tweets and phrases; identifying the type of phobia expressed; and detecting LGBT+phobic hate speech in song lyrics.

Language(s)

Spanish (Mexico)

Dataset description link

https://www.codabench.org/competitions/2229/

Year

2024

Annotations

Each instance carries one label, which depends on the task: either whether the instance is LGBT+phobic or not, or, where applicable, the type of phobia it expresses.

Format

csv

Data access

Registration form

Data link

https://www.codabench.org/competitions/2229/

Publication

Gómez-Adorno et al. (2024). Overview of HOMO-MEX at IberLEF 2024: Hate Speech Detection Towards the Mexican Spanish speaking LGBT+ Population. Procesamiento del Lenguaje Natural, 73: 393-405.

Publication link

http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/download/6626/4018

NLP Topic

hate speech detection

Number of units

18200

Type of units

Tweets

Training set size

14560

Test set size

3640
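
The sizes listed above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that only uses the three figures from this card; the constant and function names are illustrative assumptions and are not part of the dataset or its distribution.

```python
# Figures copied from this dataset card; the names are hypothetical.
TOTAL_UNITS = 18200
TRAIN_SIZE = 14560
TEST_SIZE = 3640


def split_fractions(train: int, test: int) -> tuple[float, float]:
    """Return the (train, test) shares of the combined set."""
    total = train + test
    return train / total, test / total


# The two partitions should account for every unit in the dataset.
assert TRAIN_SIZE + TEST_SIZE == TOTAL_UNITS

train_frac, test_frac = split_fractions(TRAIN_SIZE, TEST_SIZE)
print(f"train: {train_frac:.0%}, test: {test_frac:.0%}")
```

Under these figures the training and test sets together cover all 18200 units, in an 80/20 proportion.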
