GenoVarDis: NER in Genomic Variants and related Diseases

This competition addresses the lack of Spanish-language resources for named entity recognition (NER) of genomic variants, and it is the first competition of its kind. It is based on an expert-curated corpus covering mutations and variant-related entities (genes, diseases, and symptoms). The goal is to improve the training of NER models in a low-resource domain and to overcome the limitations of current regular-expression-based tools. Because NER datasets for genomic variants are scarce even in English, the corpus fills a gap that extends beyond Spanish. Motivated by applications in precision medicine and biocuration, the competition also aims to advance NLP research in Spanish.