SMM4H-tweets-es-2022

Classification of Spanish tweets that mention COVID-19 symptoms, with a focus on self-reports. The annotated data for this task is a manually curated set of native Spanish-language tweets. The task is a three-way classification problem: participants must distinguish personal symptom mentions (self-reports) from symptoms reported by others (non-personal reports) and from references to external sources (literature/news mentions).
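
As a minimal sketch, the three classes can be represented as an enumeration. The identifiers and string values below are hypothetical, chosen for illustration; the label encoding actually used in the released data files is not stated here and may differ.

```python
from enum import Enum


class SymptomMention(Enum):
    """The three classes of the task (names and values are illustrative assumptions)."""

    SELF_REPORT = "self-report"
    NON_PERSONAL_REPORT = "non-personal report"
    LITERATURE_NEWS = "literature/news mention"


def is_self_report(label: SymptomMention) -> bool:
    # True only for personal symptom mentions; the other two classes
    # cover symptoms reported by others and external sources.
    return label is SymptomMention.SELF_REPORT
```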

Language(s)
Spanish
Year
2022
Domain
Health
Text types
Tweets
Annotations
Self-reports, non-personal reports, and literature/news mentions

Number of units
20481
Type of units
Tweets
Training set size
10052 tweets
Test set size
6851 tweets
Development set size
3578 tweets
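
The split sizes above can be checked against the total number of units with a short sketch. The sizes are taken from this card; the dictionary keys and variable names are illustrative assumptions.

```python
# Split sizes as listed on this card (number of tweets).
split_sizes = {
    "train": 10052,
    "test": 6851,
    "dev": 3578,
}

# Total number of units as listed on this card.
total_units = 20481

# The three splits should account for every unit.
assert sum(split_sizes.values()) == total_units

# Share of each split as a fraction of the whole dataset.
proportions = {name: size / total_units for name, size in split_sizes.items()}

for name, share in proportions.items():
    print(f"{name}: {share:.1%}")
```

The training split is roughly half of the data, and the test split is about twice the size of the development split.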

If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, along with a copy of it if it is not published openly.