A fully differentiable model for unsupervised singing voice separation

Gael Richard; Pierre Chouteau; Bernardo Torres

Conference paper, Year: 2024

Gael Richard

Role: Author
PersonId: 14146
IdHAL: gael-richard
IdRef: 094977208

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Pierre Chouteau

Role: Author
PersonId: 1328515

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Bernardo Torres

Role: Author
PersonId: 1278938
IdHAL: bernardo-torres
ORCID: 0009-0005-7051-6736

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Abstract

A novel model was recently proposed by Schulze-Forster et al. in [1] for unsupervised music source separation. This model addresses some of the major shortcomings of existing source separation frameworks. Specifically, it eliminates the need for isolated sources during training, performs efficiently with limited data, and can handle homogeneous sources (such as singing voice). However, this model relies on an external multipitch estimator and incorporates an ad hoc voice assignment procedure. In this paper, we propose to extend this framework and build a fully differentiable model by integrating a multipitch estimator and a novel differentiable assignment module within the core model. We demonstrate the merits of our approach through a set of experiments and highlight in particular its potential for processing diverse and unseen data.
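To illustrate what a differentiable assignment can look like in general, the sketch below replaces a hard argmax over pitch candidates with a temperature-controlled softmax, so that each voice receives a weighted average of the candidate fundamental frequencies. This is only a generic relaxation written for illustration and is not the assignment module of the paper; the function names, the score values, and the candidate frequencies are all hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_assign(candidate_f0s, voice_scores, temperature=1.0):
    """Return one f0 per voice as a softmax-weighted average of candidates.

    Assumption: voice_scores[v][k] is a (hypothetical) affinity between
    voice v and pitch candidate k. A hard argmax over k would block
    gradients; the weighted average keeps the output smooth in the scores.
    """
    assigned = []
    for scores in voice_scores:
        weights = softmax(scores, temperature)
        assigned.append(sum(w * f for w, f in zip(weights, candidate_f0s)))
    return assigned


# Hypothetical frame: three pitch candidates (Hz) and two voices.
candidate_f0s = [220.0, 330.0, 440.0]
voice_scores = [
    [4.0, 0.5, 0.1],  # voice 0 prefers the 220 Hz candidate
    [0.2, 0.3, 5.0],  # voice 1 prefers the 440 Hz candidate
]

soft = soft_assign(candidate_f0s, voice_scores, temperature=1.0)
sharp = soft_assign(candidate_f0s, voice_scores, temperature=0.01)
```

With a low temperature the weights concentrate on the best-scoring candidate and the result approaches a hard assignment, while a higher temperature blends candidates; choosing that trade-off is a design decision of this sketch, not something stated in the abstract.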

Keywords

Unsupervised source separation; multiple singing voices; differentiable models; deep learning

Domains

Signal and Image Processing [eess.SP]

Main file

main.pdf (1.65 MB)

Origin: Files produced by the author(s)

Contributor: Gaël RICHARD

https://telecom-paris.hal.science/hal-04356813

Submitted on: Monday, 29 January 2024, 14:55:17

Last modified on: Wednesday, 31 January 2024, 03:42:25

Dates and versions

hal-04356813, version 1 (20-12-2023)

hal-04356813, version 2 (29-01-2024)

Identifiers

HAL Id: hal-04356813, version 2
arXiv: 2401.16837

Cite

Gael Richard, Pierre Chouteau, Bernardo Torres. A fully differentiable model for unsupervised singing voice separation. IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr 2024, Seoul, South Korea. ⟨hal-04356813v2⟩

Collections

INSTITUT-TELECOM PARISTECH LTCI IDS S2A IP_PARIS

223 views

140 downloads
