Datasets
Below is information about Spanish text datasets created for solving NLP tasks. These are collections of texts, generally enriched with annotations.
- The EA-MT dataset
  Spanish, English, Arabic, German, French, Italian, Korean, Chinese | Published in 2025 | 57611.00MB | machine translation
- TA1C 2024
  News | Spanish | Published in 2025 | 3,500 | 3500.00MB | Tweets | clickbait detection
- SatirA 2025
  Diverse | Spanish | Published in 2025 | 8,000 | 8000.00MB | satire detection
- Rest-Mex 2025
  Spanish (Mexico) | Published in 2025 | 297,217 | 297217.00MB | Reviews | sentiment analysis
- IC-UNED-RC-ES_v1
  Diverse | Spanish | Published in 2025 | 2,630 | 2630.00MB | language comprehension
- Spa-DataBench
  Diverse | Spanish | Published in 2025 | 300 | 300.00MB | question answering
- PolyHope-2025 V2
  Social | Spanish, English | Published in 2025 | 29,957 | 29957.00MB | Tweets | sentiment analysis
- PastReader-2025
  Spanish | Published in 2025 | 12,195 | 12195.00MB | automatic transcription
- MiSonGyny-2025
  Spanish | Published in 2025 | 2,631 | 2631.00MB | hate detection
- HOMO-LAT-2025
  Social | Spanish (Argentina), Spanish (Bolivia), Spanish (Chile), Spanish (Colombia), Spanish (Dominican Republic), Spanish (Mexico), Spanish (Peru), Spanish (Uruguay) | Published in 2025 | 7,100 | 7100.00MB | hate detection
- DIMEMEX-2025
  Social | Spanish (Mexico) | Published in 2025 | 3,000 | 3000.00MB | hate detection
- CLEAR-2025
  News | Spanish | Published in 2025 | 3,000 | 3000.00MB | text simplification
- MentalRiskES-2025
  Health | Spanish | Published in 2025 | 32,342 | 32342.00MB | profiling
- semeval-2025-task11-emotions-es
  Diverse | Spanish, English | Published in 2025 | 3,875 | sentiment analysis
- MultiClaim-2025-es
  Social | Spanish, English | Published in 2025 | 7,581 | Social media posts | fake news detection
If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, along with a copy of the article if it is not openly available.

