Single-speaker/multi-speaker co-channel speech classification

Stéphane Rossignol; Olivier Pietquin

Conference paper, Year: 2010


Stéphane Rossignol

Role: Author
PersonId: 7798
IdHAL: stephane-rossignol
ORCID: 0000-0002-3077-6411
IdRef: 05931740X

SUPELEC-Campus Metz

Olivier Pietquin

Role: Author
PersonId: 4024
IdHAL: olivier-pietquin
ORCID: 0000-0002-5386-465X
IdRef: 142821861

SUPELEC-Campus Metz

Abstract

The demand for content-based management and real-time manipulation of audio data is constantly increasing. This paper presents a method for labelling temporal regions within a segment of co-channel speech as either single-speaker or multi-speaker speech. The state-of-the-art approach to this task relies on the kurtosis. Here, a set of complementary time-domain and frequency-domain features is studied, and classification is performed with a one-class SVM. A recognition rate of 94.75% is reached, and the feature set giving the best performance is identified.
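To illustrate why kurtosis can serve as a baseline cue for this task, the following minimal sketch compares the excess kurtosis of one synthetic source with that of a sum of two sources. It is not the authors' feature set or classifier. The Laplace-distributed samples are an assumption standing in for real speech, and the names `excess_kurtosis` and `laplace_source` are hypothetical helpers introduced only for this example.

```python
import random


def excess_kurtosis(samples):
    """Return the excess kurtosis (fourth standardized moment minus 3)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    m2 = sum(c * c for c in centered) / n   # second central moment
    m4 = sum(c ** 4 for c in centered) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0


def laplace_source(n, rng):
    """Draw n Laplace-distributed samples (assumed stand-in for speech)."""
    # The difference of two unit-rate exponentials is Laplace-distributed.
    return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200_000

speaker_a = laplace_source(n, rng)
speaker_b = laplace_source(n, rng)
# Co-channel signal: both synthetic sources summed sample by sample.
mixture = [a + b for a, b in zip(speaker_a, speaker_b)]

k_single = excess_kurtosis(speaker_a)
k_mixed = excess_kurtosis(mixture)
print(f"single source: {k_single:.2f}, two-source mixture: {k_mixed:.2f}")
```

For two independent, equal-power sources with the same distribution, the excess kurtosis of the sum is half that of either source, so the mixture should score lower here. In practice such a statistic would be computed per frame on real recordings and, as the abstract indicates, combined with other time-domain and frequency-domain features before classification.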

Domains

Sound [cs.SD]

Contributor: Sébastien Van Luchene

https://centralesupelec.hal.science/hal-00552948

Submitted on: Thursday, January 6, 2011, 11:29:16

Last modified on: Tuesday, February 14, 2023, 03:38:31

Dates and versions

hal-00552948, version 1 (06-01-2011)

Identifiers

HAL Id: hal-00552948, version 1

Cite

Stéphane Rossignol, Olivier Pietquin. Single-speaker/multi-speaker co-channel speech classification. Interspeech 2010, Sep 2010, Makuhari, Japan. pp. 2322-2325. ⟨hal-00552948⟩

Collections

SUPELEC CENTRALESUPELEC