MultiClaim-2025-es | Portal ODESIA

This dataset consists of fact-checks, social media posts, and the pairings between them. It contains 205,751 fact-checks in 39 languages and 28,092 social media posts in 27 languages. All posts were previously reviewed by professional fact-checkers, who also assigned the appropriate fact-checks to them. There are 31,305 fact-check-to-post pairs, and each post is paired with at least one fact-check. Of these pairs, 26,774 are monolingual and 4,212 are crosslingual. The dataset introduces crosslingual previously fact-checked claim retrieval (PFCR) as a new task.

Language(s)

Spanish

English

Dataset description link

https://zenodo.org/record/7737983

Year

2025

Domain

Social

Text type

Social media posts

Annotations

social media post-claim pairs

Format

csv

Annotation guidelines link

https://aclanthology.org/2023.emnlp-main.1027.pdf

Data access

Registration

Data access link

https://zenodo.org/record/7737983

Publication

Matúš Pikuliak, Ivan Srba, Robert Moro, Timo Hromadka, Timotej Smoleň, Martin Melišek, Ivan Vykopal, Jakub Simko, Juraj Podroužek, and Maria Bielikova. 2023. Multilingual Previously Fact-Checked Claim Retrieval. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 16477–16500, Singapore. Association for Computational Linguistics.

Publication link

https://aclanthology.org/2023.emnlp-main.1027/

NLP Topic

fake news detection

Number of units

7581

Training set size

6313

Test set size

576

Development set size

692
