This task, the first of its kind, addresses the lack of Spanish-language resources for named entity recognition (NER) of genomic variants. It is based on an expert-curated corpus that covers mutations and variant-related entities (genes, diseases, and symptoms). The goal is to improve the training of NER models in a low-resource domain and to overcome the limitations of current tools, which rely on regular expressions. Because NER datasets for variants are scarce even in English, this work is an important step for the field. Inspired by precision medicine and biocuration, it also aims to drive NLP research in Spanish.
Task results
| System | Micro-F1 |
|---|---|
| ander.martinez | 0.8210 |
| VictorMov | 0.7935 |
| ELiRF-VRAIN | 0.7349 |
| Milimeter98 | 0.5483 |
| orlandxrf | 0.5301 |
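
For readers unfamiliar with the metric in the table above, the following is a minimal sketch of how a micro-averaged F1 score is commonly computed for NER. It is not the task's official scorer. It assumes exact-match evaluation, where a predicted span counts as correct only if its document, offsets, and label all match a gold span. The `Span` representation, the function name `micro_f1`, and the labels in the example are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span representation: (doc_id, start_offset, end_offset, label).
Span = Tuple[str, int, int, str]


def micro_f1(gold: Iterable[Span], pred: Iterable[Span]) -> float:
    """Micro-averaged F1 under an assumed exact-match criterion.

    Counts are pooled over all entity types before precision and
    recall are computed, so frequent types weigh more than rare ones.
    """
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)

    tp = len(gold_set & pred_set)   # predicted and present in gold
    fp = len(pred_set - gold_set)   # predicted but not in gold
    fn = len(gold_set - pred_set)   # in gold but not predicted

    if tp == 0:
        return 0.0  # also covers the empty-input case

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; the labels are made up for this sketch.
gold = [
    ("doc1", 0, 7, "GENE"),
    ("doc1", 20, 28, "VARIANT"),
    ("doc2", 5, 12, "DISEASE"),
]
pred = [
    ("doc1", 0, 7, "GENE"),
    ("doc1", 20, 28, "GENE"),      # right span, wrong label
    ("doc2", 5, 12, "DISEASE"),
]

score = micro_f1(gold, pred)
print(f"{score:.4f}")
```

In this example two of the three predictions match gold exactly, so precision and recall are both 2/3. Under exact matching, the mislabeled span counts as both a false positive and a false negative.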

