Evaluation of feature-embedding methods for word spotting in historical arabic documents - IRT SystemX
Conference paper, Year: 2020

Evaluation of feature-embedding methods for word spotting in historical arabic documents

Abstract

Retrieving and indexing historical Arabic documents remains a significant challenge. This paper compares feature representation spaces for word spotting in historical Arabic documents. Our goal is to build embedding spaces with different machine learning methods: i) linear methods, such as principal component analysis and linear discriminant analysis, and ii) non-linear methods, including Siamese and triplet convolutional neural networks. Each word image is then represented by a dense vector, and feature representations are matched using the Euclidean distance. We evaluate the resulting embedding models on the VML-HD dataset, and the experiments show that the non-linear methods are more effective than the linear ones.
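The matching step described in the abstract (embed each word image as a dense vector, then compare vectors by Euclidean distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_pca`, `embed`, and `rank_by_euclidean` are hypothetical, the embedding is a plain PCA computed with NumPy's SVD, and random synthetic vectors stand in for word-image features (the VML-HD data is not used here).

```python
import numpy as np


def fit_pca(features, n_components):
    """Return the mean and the top principal axes of row-wise feature vectors."""
    mean = features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def embed(features, mean, axes):
    """Project feature vectors into the low-dimensional embedding space."""
    return (features - mean) @ axes.T


def rank_by_euclidean(query_vec, gallery_vecs):
    """Return gallery indices sorted by Euclidean distance to the query."""
    dists = np.linalg.norm(gallery_vecs - query_vec, axis=1)
    return np.argsort(dists), dists


# Synthetic stand-in data (an assumption for illustration only):
# 3 hypothetical word classes, 10 samples each, 64-dimensional features.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 64)) * 5.0
labels = np.repeat(np.arange(3), 10)
gallery = centers[labels] + rng.normal(size=(30, 64))
query = centers[1] + rng.normal(size=64)  # a query drawn from class 1

mean, axes = fit_pca(gallery, n_components=8)
gallery_emb = embed(gallery, mean, axes)
query_emb = embed(query[None, :], mean, axes)[0]

order, dists = rank_by_euclidean(query_emb, gallery_emb)
print("labels of the 5 nearest gallery words:", labels[order[:5]])
```

A learned non-linear embedding, such as a Siamese or triplet network, would replace `fit_pca` and `embed` in this sketch, while the Euclidean ranking step would stay the same.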
No file deposited

Dates and versions

hal-03094910, version 1 (04-01-2021)

Cite

Abir Fathallah, Mohamed Ibn Khedher, Mounim El Yacoubi, Najoua Essoukri Ben Amara. Evaluation of feature-embedding methods for word spotting in historical arabic documents. SSD 2020: 17th international multi-conference on Systems, Signals and Devices, Jul 2020, Monastir (online), Tunisia. pp.34-39, ⟨10.1109/SSD49366.2020.9364134⟩. ⟨hal-03094910⟩