Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data - Centre de mathématiques appliquées (CMAP)
Preprint, Working Paper. Year: 2019

Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data

Abstract

Missing Not At Random (MNAR) values are considered non-ignorable: handling them requires a model for the missing-values mechanism, which involves strong prior assumptions on the parametric form of the distribution and makes inference and imputation more complex. Existing methodologies for MNAR values also focus on simple settings, assuming that only one variable (such as the outcome variable) has missing entries. Recent work by Mohan and Pearl, based on graphical models and causality, shows that specific MNAR settings make it possible to recover some aspects of the distribution without specifying the MNAR mechanism. We pursue this line of research. Considering a data matrix generated from a probabilistic principal component analysis (PPCA) model containing several MNAR variables, not necessarily under the same self-masked missing mechanism, we propose estimators for the means, variances and covariances of the variables and study their consistency. The estimators have the great advantage of being computed using only observed data. In addition, we propose an imputation method for the data matrix and an estimate of the PPCA loading matrix. We compare our proposal with results obtained for ignorable missing values using the expectation-maximization algorithm.
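
To make the setting concrete, the following is a minimal simulation sketch and not the estimators proposed in the paper. It draws a data matrix from a low-rank-plus-noise (PPCA-style) model, hides entries of two columns with a self-masked mechanism, and shows that the naive mean of the observed entries is biased. The logistic form of the mechanism, the `slope` parameter, the function names and all numeric settings are assumptions chosen for illustration only.

```python
import numpy as np

def simulate_ppca(n, p, r, sigma, rng):
    """Draw n rows from a PPCA-style model: Y = W B + mean + noise (assumed notation)."""
    W = rng.standard_normal((n, r))           # latent factors
    B = rng.standard_normal((r, p))           # loading matrix
    mean = np.arange(p, dtype=float)          # arbitrary column means
    noise = sigma * rng.standard_normal((n, p))
    return W @ B + mean + noise

def self_masked_mnar(Y, cols, rng, slope=2.0):
    """Hide entries of the chosen columns with a probability that depends
    on the value itself (logistic self-masking, an assumed mechanism)."""
    Y_obs = Y.copy()
    for j in cols:
        z = (Y[:, j] - Y[:, j].mean()) / Y[:, j].std()
        prob_missing = 1.0 / (1.0 + np.exp(-slope * z))  # large values go missing more often
        Y_obs[rng.random(Y.shape[0]) < prob_missing, j] = np.nan
    return Y_obs

rng = np.random.default_rng(0)
Y = simulate_ppca(n=5000, p=6, r=2, sigma=0.5, rng=rng)
Y_obs = self_masked_mnar(Y, cols=[0, 1], rng=rng)

full_means = Y.mean(axis=0)                 # means of the complete data
naive_means = np.nanmean(Y_obs, axis=0)     # means of the observed entries only
bias = naive_means - full_means
print(np.round(bias, 3))
```

Because large values are more likely to be hidden, the observed-only means of the two masked columns fall below the complete-data means, while the fully observed columns are unaffected. This is the kind of bias that motivates estimators designed for MNAR data.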
Main file
sportisse_boyer_josse_ppca_mnar_2019.pdf (511.8 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02146983 , version 1 (04-06-2019)
hal-02146983 , version 2 (25-07-2019)
hal-02146983 , version 3 (04-06-2020)

Cite

Aude Sportisse, Claire Boyer, Julie Josse. Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data. 2019. ⟨hal-02146983v2⟩

Collections

UNIV-PARIS7
196 views
126 downloads