MedProcNER/ProcTEMIST corpus 2023

A dataset of 1,000 clinical case reports manually annotated for mentions of clinical procedures by multiple clinical experts. The case reports were selected by clinical experts and cover a range of medical specialties, including oncology, odontology, urology, and psychiatry. They are the same documents used for DisTEMIST, the corpus and shared task on diseases, and so build towards a collection of fully annotated texts for clinical concept recognition and normalization. In addition to the text annotations, the mentions in the corpus have been normalized to SNOMED CT.

Language(s)
Spanish
Year
2023
Domain
Health
Text types
Clinical notes
Annotations
Clinical procedures, SNOMED CT normalization
Data access
Public

Publication
Nentidis, A. et al. (2023). Overview of BioASQ 2023: The Eleventh BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, vol 14163. Springer, Cham. https://doi.org/10.1007/978-3-031-42448-9_19
License
CC-BY-4.0
Number of units
1,000
Training set size
750
Test set size
250

If you have published a result that improves on those listed, send a message to odesia-comunicacion@lsi.uned.es stating the result and the DOI of the article, and attach a copy of the article if it is not openly accessible.