MEDDOPROF

Language(s): Spanish
Year: 2021
Domain: Health
Text types: Clinical records from journals
Annotations: occupations, occupation holders, SNOMED-CT codes for occupations
Format: UTF-8 text files with the annotations as separate files in brat standoff format (.ann files)
Data access: Open access
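
The brat standoff format mentioned above stores each text-bound annotation as a tab-separated line: an ID, then the label and character offsets, then the covered text. The following is a minimal sketch of reading such lines in Python, not the dataset's official loader. The sample line, the label `PROFESION`, and the sentence in `doc_text` are made up for illustration and should be checked against the real files.

```python
from dataclasses import dataclass


@dataclass
class TextBound:
    ann_id: str                   # e.g. "T1"
    label: str                    # entity type as written in the .ann file
    spans: list[tuple[int, int]]  # (start, end) character offsets, end exclusive
    text: str                     # text covered by the annotation


def parse_ann(content: str) -> list[TextBound]:
    """Parse text-bound (T) annotations from the content of a brat .ann file."""
    entities = []
    for line in content.splitlines():
        if not line.startswith("T"):
            continue  # skip notes (#), relations (R), attributes (A), and so on
        ann_id, type_and_offsets, text = line.split("\t", 2)
        label, offsets = type_and_offsets.split(" ", 1)
        # Discontinuous annotations separate their fragments with ';'
        spans = [tuple(int(n) for n in frag.split()) for frag in offsets.split(";")]
        entities.append(TextBound(ann_id, label, spans, text))
    return entities


# Hypothetical document text and annotation content, for illustration only
doc_text = "Paciente, enfermera de profesión."
sample = "T1\tPROFESION 10 19\tenfermera\n#1\tAnnotatorNotes T1\thypothetical note\n"

for ent in parse_ann(sample):
    start, end = ent.spans[0]
    print(ent.ann_id, ent.label, doc_text[start:end])
```

Offsets refer to characters in the companion text file, so the text should be read as UTF-8 and sliced as a Python string rather than as bytes.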

Number of units: 1844
Type of units: Documents
Tokens: 1291186
Sentences: 58627
Documents: 1844
Training set size: 1500 docs / 49114 sentences / 1075655 tokens
Test set size: 344 docs / 9513 sentences / 215531 tokens
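
As a quick consistency check, the training and test figures above can be summed and compared with the reported totals. This is only a sketch; the `SplitSize` type is a helper introduced here for illustration, and the numbers are copied from the listing above.

```python
from typing import NamedTuple


class SplitSize(NamedTuple):
    docs: int
    sentences: int
    tokens: int


# Figures copied from the dataset card
train = SplitSize(docs=1500, sentences=49114, tokens=1075655)
test = SplitSize(docs=344, sentences=9513, tokens=215531)
reported_total = SplitSize(docs=1844, sentences=58627, tokens=1291186)

# Field-by-field sum of the two splits
total = SplitSize(*(a + b for a, b in zip(train, test)))

print("splits match reported totals:", total == reported_total)
print(f"test share of documents: {test.docs / total.docs:.1%}")
```

The two splits add up exactly to the reported totals, and the test set holds roughly 19% of the documents.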

If you have published a result that improves on those in the list, please send a message to odesia-comunicacion@lsi.uned.es stating the result and the DOI of the article, and attach a copy of the article if it is not openly available.