AuTexTification 2023

The AuTexTification dataset consists of texts written by humans and by large language models (LLMs) in five domains: tweets, reviews, how-to articles, news, and legal documents.

Language(s)
Spanish
English
Year
2023
Domain
General
Legal
News
Data access
Registration

Publication
Areg Mikael Sarvazyan, José Ángel González, Marc Franco-Salvador, Francisco Rangel, Berta Chulvi, Paolo Rosso (2023). Overview of AuTexTification at IberLEF 2023: Detection and Attribution of Machine-Generated Text in Multiple Domains. Procesamiento del Lenguaje Natural, no. 71, September 2023, pp. 275-288.
Number of units
52191
Type of units
Samples of texts
Size - additional information

Whether the text is model-generated, and the model it is attributed to

If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, along with a copy of it if it is not published openly.