Why Aren’t We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts

Szymański, Piotr, Lukasz Augustyniak, Mikolaj Morzy, Adrian Szymczak, Krzysztof Surdyk, and Piotr Żelasko. 2023. “Why Aren’t We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts.” Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1746–61. https://doi.org/10.18653/v1/2023.acl-long.98.

Notes

NER models are difficult to apply to ASR transcripts because the transcriptions are not fully accurate.
ASR models can introduce the following errors:
- Insertion
- Substitution
- Deletion
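
The three ASR error types above are the standard word-level edit operations, so they can be counted by aligning a reference transcript with the ASR output. The following is a minimal sketch using a plain Levenshtein alignment; the function name `count_asr_errors` and the whitespace tokenization are assumptions made for illustration, not something taken from the paper.

```python
def count_asr_errors(reference: str, hypothesis: str) -> dict:
    """Count word-level insertions, substitutions and deletions.

    Hypothetical helper: aligns the reference transcript with the ASR
    hypothesis using Levenshtein distance over whitespace-split words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    n, m = len(ref), len(hyp)

    # dp[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j - 1] + cost,  # match or substitution
                dp[i - 1][j] + 1,         # deletion (word missing in ASR output)
                dp[i][j - 1] + 1,         # insertion (extra word in ASR output)
            )

    # Walk back through the table to recover which edits were used.
    counts = {"insertion": 0, "substitution": 0, "deletion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                if cost:
                    counts["substitution"] += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts


if __name__ == "__main__":
    print(count_asr_errors("call john smith", "call jon smith"))
```

Dividing the total number of edits by the number of reference words would give the usual word error rate.
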
NER models can introduce the following errors:
- Hallucination
- Replacement
- Omission
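
To make the three NER error types concrete, here is a rough sketch that compares gold and predicted entities. The definitions used are working assumptions for these notes, not quoted from the paper: a hallucination is a predicted entity with no gold counterpart, a replacement is an entity found at the right span with the wrong label, and an omission is a gold entity that was not predicted. The function name `classify_ner_errors` and the span-to-label dictionary format are also hypothetical, and the sketch assumes the gold and ASR token positions have already been aligned.

```python
def classify_ner_errors(gold: dict, predicted: dict) -> dict:
    """Group NER errors into hallucinations, replacements and omissions.

    Both arguments map a token span (start, end) to an entity label.
    Assumed definitions (illustrative only):
      hallucination: predicted span that has no gold entity
      replacement:   span present in both, but with different labels
      omission:      gold span that the model did not predict
    """
    errors = {"hallucination": [], "replacement": [], "omission": []}
    for span, label in predicted.items():
        if span not in gold:
            errors["hallucination"].append(span)
        elif gold[span] != label:
            errors["replacement"].append(span)
    for span in gold:
        if span not in predicted:
            errors["omission"].append(span)
    return errors


if __name__ == "__main__":
    gold = {(1, 3): "PERSON", (5, 6): "LOCATION"}
    predicted = {(1, 3): "ORGANIZATION", (8, 9): "DATE"}
    print(classify_ner_errors(gold, predicted))
```

Exact span matching is the simplest choice here; a more forgiving comparison based on span overlap would need a separate matching step.
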

In-text annotations