CA-Stream: Attention-based pooling for interpretable image recognition

Felipe Torres; Hanwei Zhang; Ronan Sicre; Stéphane Ayache; Yannis Avrithis

Conference paper. Year: 2024


Felipe Torres

Role: Author
PersonId : 1376426

éQuipe d'AppRentissage de MArseille

Hanwei Zhang

Role: Author
PersonId : 1376427

éQuipe d'AppRentissage de MArseille

Ronan Sicre

Role: Author
PersonId : 1067988

éQuipe d'AppRentissage de MArseille

Stéphane Ayache

Role: Author
PersonId : 16733
IdHAL : stephane-ayache
ORCID : 0000-0003-2982-7127
IdRef : 129313254

éQuipe d'AppRentissage de MArseille

Yannis Avrithis

Role: Author
PersonId : 20705
IdHAL : yannis-avrithis
ORCID : 0000-0001-7476-4482
IdRef : 253126193

Institute of Advanced Research in Artificial Intelligence [Vienna]

Abstract

Explanations obtained from transformer-based architectures, in the form of raw attention, can be seen as class-agnostic saliency maps. Additionally, attention-based pooling serves as a form of masking in feature space. Motivated by this observation, we design an attention-based pooling mechanism intended to replace Global Average Pooling (GAP) at inference. This mechanism, called Cross-Attention Stream (CA-Stream), comprises a stream of cross-attention blocks interacting with features at different network depths. CA-Stream enhances the interpretability of models while preserving recognition performance.
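The idea of replacing GAP with attention-based pooling can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, no learned projections, and feature maps that share one dimension across depths, and the names `cross_attention_pool` and `ca_stream_sketch` are hypothetical. It shows only how a query token can weight spatial positions, giving both a pooled vector and an attention map that can be read as a class-agnostic saliency map.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_average_pool(features):
    """GAP baseline: mean of P feature vectors, each of length d."""
    p = len(features)
    d = len(features[0])
    return [sum(f[j] for f in features) / p for j in range(d)]


def cross_attention_pool(query, features):
    """Pool P feature vectors with one query vector (single head).

    Returns the pooled vector and the attention weights over positions.
    Assumption: keys and values are the raw features (no projections).
    """
    d = len(query)
    scale = 1.0 / math.sqrt(d)
    scores = [scale * sum(q * x for q, x in zip(query, f)) for f in features]
    attn = softmax(scores)
    pooled = [sum(a * f[j] for a, f in zip(attn, features)) for j in range(d)]
    return pooled, attn


def ca_stream_sketch(cls_token, stage_features):
    """Hypothetical stream: the token attends to features of each stage in
    turn and is updated with a residual connection. Assumes every stage
    has the same feature dimension as the token."""
    token = list(cls_token)
    attn = None
    for features in stage_features:
        pooled, attn = cross_attention_pool(token, features)
        token = [t + p for t, p in zip(token, pooled)]
    return token, attn
```

With an all-zero query the attention weights are uniform, so `cross_attention_pool` reduces to GAP; a non-zero query shifts weight toward the positions whose features align with it, which is the masking effect the abstract refers to.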

Keywords

eXplainable AI; XAI; interpretability; attention-based models; image classification

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

WACV_ICCVround2-4.pdf (2.29 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-04551613

Submitted on: Thursday, April 18, 2024, 16:20:42

Last modified on: Saturday, April 20, 2024, 03:32:48

Dates and versions

hal-04551613, version 1 (18-04-2024)

Identifiers

HAL Id: hal-04551613, version 1

Cite

Felipe Torres, Hanwei Zhang, Ronan Sicre, Stéphane Ayache, Yannis Avrithis. CA-Stream: Attention-based pooling for interpretable image recognition. XAI4CV workshop (CVPR), Jun 2024, Seattle, WA, United States. ⟨hal-04551613⟩

Collections

UNIV-TLN CNRS UNIV-AMU GENCI LIS-LAB AMIDEX ANR INCIAM
