What is the ODESIA portal?
The ODESIA web portal gathers information about the state of the art in natural language processing (NLP) in Spanish. It provides information about forums, competitions, datasets and tasks.
What information can I find in the ODESIA portal?
The ODESIA portal answers questions such as: what tasks exist for NLP in Spanish; what the state-of-the-art (SOTA) results are for a given task in Spanish; what tasks have been addressed in relation to a specific NLP topic; what datasets exist for NLP in Spanish; what datasets are available for a given task, NLP topic, domain, type of text or language variety; what competitions have been organized for Spanish; what the main forums that organize NLP competitions in Spanish are; and how many competitions have been organized per year.
What is a forum?
A forum in the ODESIA portal is an abstract entity that refers to the umbrella under which several competitions are organized. In some cases it is described as an evaluation campaign, such as IberLEF, or as a series of workshops, such as SemEval.
What is a competition?
A competition (or shared task) in the ODESIA portal is an event that usually takes place within the framework of a forum. In a competition, one or more NLP tasks are proposed. The competition organizers provide some of the scientific elements needed to develop and evaluate NLP systems, such as a task definition, datasets with relevant annotations, and the evaluation method. An example of a competition is “BioASQ 2023: Large-scale Biomedical Semantic Indexing and Question Answering”, organized within the forum CLEF 2023.
What is a task?
An NLP task is a scientific activity proposed by the organizers of a competition with the aim of solving a specific NLP problem within the framework of that competition. The organizers are responsible for defining the tasks (sometimes called subtasks) of the competition and for providing one or more datasets, with their partitions, which participants use to develop their systems. The organizers also determine which metrics are used to evaluate each task and provide the evaluation software. Thus, in addition to the dataset and problem definition, the evaluation metric is an important piece of information about a task. Participants in a competition develop systems and submit solutions per task, obtaining evaluation results for each.
What is a dataset?
An NLP dataset is a collection of texts, usually provided with annotations and sometimes with additional multimodal data. The competition organizers provide datasets for each edition of the competition. Usually, a single dataset is provided for all tasks in a competition, although sometimes more than one is provided, depending on whether the tasks require different annotations, the languages and linguistic varieties addressed in the competition, the type of texts, and so on. Organizers generally provide several partitions of the dataset (training, development, test).
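As an illustration, the relationships between forums, competitions, tasks and datasets described above could be modeled as follows. This is a minimal sketch, not the portal's actual data model; all class and field names, as well as the example values, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    # Partition names, e.g. "training", "development", "test".
    partitions: list[str] = field(default_factory=list)


@dataclass
class Task:
    name: str
    # Name of the metric chosen by the organizers to evaluate the task.
    metric: str
    datasets: list[Dataset] = field(default_factory=list)


@dataclass
class Competition:
    name: str
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Forum:
    name: str
    competitions: list[Competition] = field(default_factory=list)


def tasks_in_forum(forum: Forum) -> list[str]:
    """Return the names of all tasks proposed under a forum, in order."""
    return [task.name for comp in forum.competitions for task in comp.tasks]


# Hypothetical example: one forum, one competition, two tasks sharing a dataset.
shared = Dataset("example-dataset", ["training", "development", "test"])
forum = Forum(
    "Example Forum",
    [
        Competition(
            "Example Competition",
            [
                Task("example-task-1", "accuracy", [shared]),
                Task("example-task-2", "F1", [shared]),
            ],
        )
    ],
)
print(tasks_in_forum(forum))
```

The same dataset object is attached to both tasks here, reflecting the common case in which one dataset serves all tasks in a competition.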
What is an NLP system?
An NLP system is a concrete implementation of an NLP tool or model that solves one or more tasks and obtains a specific score. Participants in a competition develop one or more systems to solve one or more tasks. Given an input, the system maps it to an output according to the task description. Each system receives a score, which is used to rank it.
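To make the ranking step concrete, the sketch below orders the systems submitted to a task by their score. It is only an illustration under stated assumptions: the `Submission` class, the `rank_systems` function and the example scores are hypothetical, and whether a higher or lower score is better depends on the metric chosen for the task.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    system: str   # name of the participating system
    task: str     # task the system was submitted to
    score: float  # value of the task's evaluation metric


def rank_systems(submissions, task, higher_is_better=True):
    """Return (position, system, score) tuples for one task, best first."""
    relevant = [s for s in submissions if s.task == task]
    # sorted() is stable, so tied systems keep their submission order.
    ordered = sorted(relevant, key=lambda s: s.score, reverse=higher_is_better)
    return [(pos, s.system, s.score) for pos, s in enumerate(ordered, start=1)]


# Hypothetical submissions to two tasks.
submissions = [
    Submission("system-a", "example-task", 0.5),
    Submission("system-b", "example-task", 0.9),
    Submission("system-c", "other-task", 0.7),
]
for position, system, score in rank_systems(submissions, "example-task"):
    print(position, system, score)
```

For error-based metrics, where lower values are better, the same function can be called with `higher_is_better=False`.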
What is an NLP topic?
An NLP topic is a label that groups NLP tasks by the type of NLP problem that they solve. All tasks and datasets are assigned one or more NLP topics. Examples of NLP topics are chatbots, processing events, discourse processing, processing humor, entity linking, text simplification and parsing.
What is an abstract NLP task?
An abstract NLP task refers to the type of NLP task from the point of view of the machine learning problem to be solved. The following types are defined: classification, sequence labeling, regression, clustering, correlation, and diversification.
Do I need to register to view this information?
No, all information contained on the portal is public and does not require registration.