The task consists of detecting and labeling semantically ambiguous and complex entities in short, low-context settings. Complex NEs, such as the titles of creative works (movie, book, song, and software names), are not simple nouns and are harder to recognize. They can take the form of any linguistic constituent, such as an imperative clause (“Dial M for Murder”), and do not look like traditional NEs (person names, locations, organizations).
The task is performed on the MULTICONER dataset (Malmasi et al., 2022). MULTICONER provides data from three domains (Wikipedia sentences, questions, and search queries) across 11 different languages, which are used to define 11 monolingual subsets of the shared task. Additionally, the dataset has multilingual and code-mixed subsets.
The following named entities are tagged: names of people, locations or physical facilities, corporations and businesses, all other groups, consumer products, and titles of creative works such as movies, songs, and books.
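To make the labeling scheme concrete, the sketch below shows how the six entity classes might be represented and how tagged token sequences can be turned into entity spans. It assumes a BIO encoding and the short label names PER, LOC, CORP, GRP, PROD, and CW; neither the encoding nor the label names are given in this section, and the helper `bio_to_spans` is a hypothetical name used only for illustration.

```python
from typing import List, Tuple

# Assumed short names for the six entity classes described above.
LABELS = {
    "PER": "names of people",
    "LOC": "locations or physical facilities",
    "CORP": "corporations and businesses",
    "GRP": "all other groups",
    "PROD": "consumer products",
    "CW": "titles of creative works",
}


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, label) pairs from a BIO-tagged token sequence."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_label is not None:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag: close the open entity, if any.
            if current_label is not None:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = []
            current_label = None

    # Close an entity that runs to the end of the sequence.
    if current_label is not None:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# A short, low-context query containing a creative-work title.
example_tokens = ["who", "directed", "dial", "m", "for", "murder"]
example_tags = ["O", "O", "B-CW", "I-CW", "I-CW", "I-CW"]
print(bio_to_spans(example_tokens, example_tags))
```

In this example the imperative clause is recovered as a single CW span, which illustrates why such entities are difficult: none of the individual tokens resembles a traditional name.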