MINT is a new multilingual intimacy analysis dataset covering 13,372 tweets in 10 languages: English, French, Spanish, Italian, Portuguese, Korean, Dutch, Chinese, Hindi, and Arabic. Tweets are annotated on a 1-5 Likert scale.
Language(s)
Spanish
English
Dataset description link
Year
2023
Domain
General
Text types
Tweets
Data access
Registration
Publication
Jiaxin Pei, Vítor Silva, Maarten Bos, Yozen Liu, Leonardo Neves, David Jurgens, and Francesco Barbieri. 2023. SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 2235–2246, Toronto, Canada. Association for Computational Linguistics.
Publication link
NLP Topic
Number of units
1991
Type of units
Tweets
Size - additional information
Intimacy level annotated from 1 to 5