SMM4H-tweets-es-2022

Classification of Spanish tweets that mention COVID-19 symptoms. The annotated data for this task is a manually curated set of tweets originally written in Spanish. The task is a three-way classification problem: participants must distinguish personal symptom mentions (self-reports) from symptoms reported by others (non-personal reports) and from references to external sources (literature/news mentions).

Language(s)

Spanish

Year

2022

Domain

Health

Text types

Tweets

Annotations

Self-reports, non-personal reports, and literature/news mentions

NLP Topic

Text classification

Number of units

20481

Type of units

Tweets

Training set size

10052 tweets

Test set size

6851 tweets

Development set size

3578 tweets
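
The split sizes listed above can be cross-checked against the total number of units. The following is a minimal sketch, not part of the dataset distribution: the names `SPLITS`, `TOTAL_UNITS`, and `split_shares` are hypothetical and introduced here only for illustration, and the figures are copied from this card.

```python
# Split sizes as listed on this card (number of tweets per split).
# The dictionary keys are illustrative names, not official file names.
SPLITS = {"train": 10052, "dev": 3578, "test": 6851}

# "Number of units" as listed on this card.
TOTAL_UNITS = 20481


def split_shares(splits):
    """Return each split's share of the total, as a percentage rounded to 0.1."""
    total = sum(splits.values())
    return {name: round(100 * n / total, 1) for name, n in splits.items()}


total = sum(SPLITS.values())
shares = split_shares(SPLITS)

# The three splits should add up to the listed number of units.
print("splits sum to listed total:", total == TOTAL_UNITS)
for name, pct in shares.items():
    print(f"{name}: {SPLITS[name]} tweets ({pct}%)")
```

Under these figures, roughly half of the tweets are in the training set, about a third in the test set, and the remainder in the development set.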

If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, along with a copy of it if it is not published openly.