Language(s)
Spanish
Dataset description link
Year
2020
Text types
Clinical case reports
Annotations
CIE-10 (Spanish ICD-10) codes for diagnoses and procedures, with textual evidence
Format
Plain text in UTF-8 encoding; annotations in tab-separated format
Data access
Public
Data link
License
Creative Commons Attribution 4.0 International
NLP Topic
Number of units
1000
Type of units
Documents
Tokens
411067
Sentences
16684
Documents
1000
Training set size
500 docs
Test set size
250 docs
Development set size
250 docs
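
The Format entry above says annotations are tab-separated and documents are UTF-8 plain text. A minimal sketch of reading such annotations follows. The column layout (document identifier, label type, code, evidence text), the name `ASSUMED_COLUMNS`, the function `read_annotations`, and the sample rows are all assumptions made for illustration and are not taken from the corpus; check the dataset's own documentation for the real layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column layout: an assumption, not the corpus specification.
ASSUMED_COLUMNS = ("doc_id", "label_type", "code", "evidence")


def read_annotations(handle):
    """Group tab-separated annotation rows by document identifier."""
    by_doc = defaultdict(list)
    # QUOTE_NONE keeps quote characters in the evidence text untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        record = dict(zip(ASSUMED_COLUMNS, row))
        by_doc[record["doc_id"]].append(record)
    return dict(by_doc)


# Made-up rows with placeholder codes, for illustration only.
sample = (
    "doc-001\tdiagnosis\tcode-A\tdolor abdominal\n"
    "doc-001\tprocedure\tcode-B\tradiografía de tórax\n"
    "doc-002\tdiagnosis\tcode-C\tfiebre\n"
)

annotations = read_annotations(io.StringIO(sample))
for doc_id, records in annotations.items():
    print(doc_id, [r["code"] for r in records])
```

For a file on disk, the same function could be given `open(path, encoding="utf-8", newline="")`, where `path` is whatever location the annotations were downloaded to.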