Why Aren’t We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts
Szymański, Piotr, Lukasz Augustyniak, Mikolaj Morzy, Adrian Szymczak, Krzysztof Surdyk, and Piotr Żelasko. 2023. “Why Aren’t We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts.” Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1746–61. https://doi.org/10.18653/v1/2023.acl-long.98.
Notes
- NER models are hard to apply to ASR transcripts because the transcriptions are not fully accurate
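- ASR models can introduce the following errors:
  - Insertion: the transcript contains a word that was not spoken
  - Substitution: a spoken word is transcribed as a different word
  - Deletion: a spoken word is missing from the transcript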
- NER models can introduce the following errors:
  - Hallucination
  - Replacement
  - Omission
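
The two error families above can be illustrated with a small sketch. It is not the paper's implementation. `count_asr_errors` counts word-level insertions, substitutions, and deletions with a standard edit-distance alignment between a reference transcript and an ASR hypothesis. `classify_ner_errors` applies one plausible reading of the NER categories, which these notes do not define: a hallucination is an entity tag predicted where the gold annotation has none, an omission is a gold entity left untagged, and a replacement is a gold entity tagged with a different class. Both function names, the tag labels, and the assumption that gold and predicted tags are already aligned token by token are hypothetical choices made for illustration.

```python
def count_asr_errors(reference, hypothesis):
    """Count word-level ASR errors via a Levenshtein alignment (illustrative sketch)."""
    ref = reference.split()
    hyp = hypothesis.split()
    n, m = len(ref), len(hyp)

    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (reference word dropped)
                dist[i][j - 1] + 1,         # insertion (extra hypothesis word)
            )

    # Walk back through the table to recover one optimal alignment.
    counts = {"insertion": 0, "substitution": 0, "deletion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                counts["substitution"] += cost
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts


def classify_ner_errors(gold_tags, predicted_tags, outside="O"):
    """Label per-token NER errors, assuming tags are aligned one-to-one (an assumption)."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted tag sequences must have equal length")
    errors = []
    for index, (gold, predicted) in enumerate(zip(gold_tags, predicted_tags)):
        if gold == predicted:
            continue
        if gold == outside:
            kind = "hallucination"  # entity predicted where gold has none
        elif predicted == outside:
            kind = "omission"       # gold entity left untagged
        else:
            kind = "replacement"    # gold entity tagged with another class
        errors.append((index, kind))
    return errors


if __name__ == "__main__":
    print(count_asr_errors("call john smith today", "call smith today"))
    # {'insertion': 0, 'substitution': 0, 'deletion': 1}
    print(classify_ner_errors(["O", "PERSON", "O"], ["O", "O", "DATE"]))
    # [(1, 'omission'), (2, 'hallucination')]
```

In practice the one-to-one alignment rarely holds, because ASR insertions and deletions change the token count. The transcript and the tags would first need to be aligned, for example with the same edit-distance walk used in `count_asr_errors`.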