A dataset of 1,000 clinical case reports in which mentions of clinical procedures were manually annotated by multiple clinical experts. The case reports were selected by clinical experts and cover a range of medical specialties, including oncology, odontology, urology, and psychiatry, among others. They are the same documents used for DisTEMIST, the corpus and shared task on diseases, so the two resources together build towards a collection of fully annotated texts for clinical concept recognition and normalization. In addition to the text annotations, the mentions in the corpus have been normalized to SNOMED CT.
Language(s)
Spanish
Dataset description link
Year
2023
Domain
Health
Text types
Clinical notes
Annotations
clinical procedures, SNOMED CT normalization
Annotation guide link
Data access
Public
Data link
Publication
Nentidis, A. et al. (2023). Overview of BioASQ 2023: The Eleventh BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, vol 14163. Springer, Cham. https://doi.org/10.1007/978-3-031-42448-9_19
Publication link
License
CC-BY-4.0
NLP Topic
Number of units
1000
Training set size
750
Test set size
250