Incorporating Named Entity Recognition into the Speech Transcription Process

Mohamed Hatmi; Christine Jacquin; Emmanuel Morin; Sylvain Meignier

Conference Papers

Year : 2013

Mohamed Hatmi

Function : Author
PersonId : 929600

Laboratoire d'Informatique de Nantes Atlantique

Christine Jacquin

Function : Author
PersonId : 4167
IdHAL : christine-jacquin

Laboratoire d'Informatique de Nantes Atlantique

Emmanuel Morin

Function : Author
PersonId : 3632
IdHAL : emmanuel-morin
ORCID : 0000-0001-8208-7039
IdRef : 14379373X

Laboratoire d'Informatique de Nantes Atlantique

Sylvain Meignier

Function : Author
PersonId : 11674
IdHAL : sylvain-meignier
ORCID : 0000-0001-7687-073X
IdRef : 182269086

Laboratoire d'Informatique de l'Université du Mans

Abstract

Named Entity Recognition (NER) from speech usually involves two sequential steps: transcribing the speech with Automatic Speech Recognition (ASR), then annotating the ASR output using NER techniques. Recognizing named entities in automatic transcripts is difficult because of transcription errors and the absence of important NER clues such as capitalization and punctuation. In this paper, we describe a methodology for speech NER that incorporates NER into the ASR process, so that the ASR system directly generates transcripts annotated with named entities. The combination is achieved by adapting the ASR language models and pre-annotating the pronunciation dictionary. We evaluate this method on the Ester 2 corpus and show significant improvements over traditional approaches.
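To make the idea of a pre-annotated pronunciation dictionary more concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that each vocabulary word observed with an entity tag in an annotated corpus is added to the dictionary as a tagged variant (for example `paris-loc`) that reuses the pronunciation of the untagged word. The function name `annotate_lexicon`, the `word-tag` naming scheme, the tag labels, and the toy phoneme strings are all hypothetical.

```python
def annotate_lexicon(lexicon, tagged_corpus):
    """Build a tagged pronunciation dictionary (illustrative assumption only).

    lexicon: dict mapping a word to its list of pronunciations.
    tagged_corpus: iterable of (word, tag) pairs; tag "O" means "not an entity".
    Returns a dict whose keys are either plain words or "word-tag" variants.
    """
    annotated = {}
    for word, tag in tagged_corpus:
        if word not in lexicon:
            continue  # out-of-vocabulary words cannot be given a pronunciation here
        entry = word if tag == "O" else f"{word}-{tag}"
        if entry not in annotated:
            # The tagged variant shares the pronunciations of the base word.
            annotated[entry] = list(lexicon[word])
    return annotated


# Toy data: the pronunciations and tags are made up for the example.
lexicon = {"paris": ["p a r i"], "hilton": ["i l t o n"], "le": ["l @"]}
tagged = [("le", "O"), ("paris", "loc"), ("paris", "pers"),
          ("hilton", "pers"), ("inconnu", "O")]
annotated = annotate_lexicon(lexicon, tagged)
```

Under this assumption, a language model trained on the same `word-tag` tokens would let the recognizer choose between `paris-loc` and `paris-pers` from context, so the entity label would appear directly in the transcript.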

Keywords

Named Entity Recognition; Automatic Speech Recognition; language modeling; ASR vocabulary

Domains

Computer science

Main file

hatmi.pdf (343.27 KB)

Origin : Publisher files allowed on an open archive

Contributor : Mohamed Hatmi

https://hal.science/hal-00843211

Submitted on : Friday, November 8, 2013, 3:57:16 PM

Last modification on : Friday, January 5, 2024, 3:23:40 AM

Long-term archiving on : Sunday, February 9, 2014, 2:45:26 AM

Dates and versions

hal-00843211 , version 1 (08-11-2013)

Identifiers

HAL Id : hal-00843211 , version 1

Cite

Mohamed Hatmi, Christine Jacquin, Emmanuel Morin, Sylvain Meignier. Incorporating Named Entity Recognition into the Speech Transcription Process. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech'13), Aug 2013, Lyon, France. pp.3732-3736. ⟨hal-00843211⟩

Collections

UNIV-NANTES CNRS UNIV-LEMANS LINA LINA-TALN LIUM LIUM-LST LS2N NANTES-UNIVERSITE
