Gaussian Processes for Missing Value Imputation

Jafrasteh, Bahram; Hernández-Lobato, Daniel; Lubián-López, Simón Pedro; Benavente-Fernández, Isabel

arXiv:2204.04648 (stat)

[Submitted on 10 Apr 2022 (v1), last revised 6 May 2022 (this version, v2)]

Abstract: Missing values are common in many real-life datasets. However, most current machine learning methods cannot handle missing values, so these must be imputed beforehand. Gaussian Processes (GPs) are non-parametric models with accurate uncertainty estimates that, combined with sparse approximations and stochastic variational inference, scale to large datasets. Sparse GPs can be used to compute a predictive distribution for missing data. Here, we present a hierarchical composition of sparse GPs that predicts the missing values in each dimension using all the variables from the other dimensions. We call the approach missing GP (MGP). MGP can be trained to impute all the missing values simultaneously. Specifically, it outputs a predictive distribution for each missing value, which is then used in the imputation of other missing values. We evaluate MGP on one private clinical dataset and four UCI datasets with different percentages of missing values. We compare the performance of MGP with other state-of-the-art methods for imputing missing values, including variants based on sparse GPs and deep GPs. The results show that MGP performs significantly better.
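
The core idea in the abstract, predicting each dimension's missing entries from all the other dimensions with a GP, can be illustrated with a much simpler stand-in. The sketch below is not the authors' MGP: it assumes exact GP regression with a fixed RBF kernel instead of sparse GPs with stochastic variational inference, it sweeps over the columns one at a time instead of training everything jointly, and it passes only the predictive mean (not the full predictive distribution) on to later imputations. The function names `rbf_kernel`, `gp_predict`, and `chained_gp_impute`, the hyperparameter values, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    # Exact GP regression with a constant mean equal to the training average.
    y_mean = y_train.mean()
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = K_s @ alpha + y_mean
    v = np.linalg.solve(L, K_s.T)
    var = rbf_kernel(X_test, X_test).diagonal() - np.sum(v**2, axis=0) + noise
    return mean, np.maximum(var, 0.0)

def chained_gp_impute(X, n_sweeps=3):
    # Assumes every column has at least one observed value.
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means, then refine each column from the others.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    variances = np.zeros_like(filled)
    for _ in range(n_sweeps):
        for d in range(X.shape[1]):
            miss_d = missing[:, d]
            if not miss_d.any():
                continue
            # Inputs: all other columns, including their current imputations.
            others = np.delete(filled, d, axis=1)
            mean, var = gp_predict(others[~miss_d], filled[~miss_d, d], others[miss_d])
            filled[miss_d, d] = mean
            variances[miss_d, d] = var
    return filled, variances

# Toy data (illustrative only): three related columns, about 15% of entries removed.
rng = np.random.default_rng(0)
t = rng.uniform(-2.0, 2.0, size=40)
X_true = np.column_stack([t, np.sin(t), t**2])
mask = rng.random(X_true.shape) < 0.15
X_missing = X_true.copy()
X_missing[mask] = np.nan

X_filled, X_var = chained_gp_impute(X_missing)
```

Here `X_filled` keeps every observed entry unchanged and replaces each missing entry with a GP predictive mean, while `X_var` holds the matching predictive variance (zero for observed entries). A faithful implementation of the paper's method would additionally propagate that uncertainty through the hierarchy and learn the kernel hyperparameters.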

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP); Methodology (stat.ME)
Cite as:	arXiv:2204.04648 [stat.ML]
	(or arXiv:2204.04648v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2204.04648

Submission history

From: Bahram Jafrasteh
[v1] Sun, 10 Apr 2022 10:46:26 UTC (3,771 KB)
[v2] Fri, 6 May 2022 09:26:22 UTC (1,517 KB)
