Ultra-marginal Feature Importance: Learning from Data with Causal Guarantees

Janssen, Joseph; Guan, Vincent; Robeva, Elina

arXiv:2204.09938 (stat)

[Submitted on 21 Apr 2022 (v1), last revised 11 Nov 2024 (this version, v5)]

Abstract: Scientists frequently prioritize learning from data rather than training the best possible model; however, research in machine learning often prioritizes the latter. Marginal contribution feature importance (MCI) was developed to break this trend by providing a useful framework for quantifying the relationships in data. In this work, we aim to improve upon the theoretical properties, performance, and runtime of MCI by introducing ultra-marginal feature importance (UMFI), which uses dependence removal techniques from the AI fairness literature as its foundation. We first propose axioms for feature importance methods that seek to explain the causal and associative relationships in data, and we prove that UMFI satisfies these axioms under basic assumptions. We then show on real and simulated data that UMFI performs better than MCI, especially in the presence of correlated interactions and unrelated features, while partially learning the structure of the causal graph and reducing the exponential runtime of MCI to super-linear.
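The abstract does not spell out the computation, so the following is only a rough sketch of the general idea of scoring a feature after removing its dependence from the remaining features. It rests on several assumptions that are not taken from this page: dependence removal is approximated by simple linear-regression residuals, the evaluation function is the in-sample R^2 of an ordinary least-squares fit, and the score for feature i is the gain in that evaluation when the feature is added back to the cleaned remaining features. All function names (`residualize`, `r_squared`, `umfi_scores`) are hypothetical and chosen for illustration.

```python
import random


def center(v):
    """Return v with its mean subtracted."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def residualize(col, ref):
    """Assumed dependence removal: residuals of col after a simple
    least-squares regression on ref (removes linear dependence only)."""
    c, r = center(col), center(ref)
    var = sum(v * v for v in r)
    slope = sum(a * b for a, b in zip(c, r)) / var if var > 0 else 0.0
    return [a - slope * b for a, b in zip(c, r)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (m[k][n] - sum(m[k][c] * x[c] for c in range(k + 1, n))) / m[k][k]
    return x


def r_squared(cols, y):
    """Assumed evaluation function: in-sample R^2 of OLS of y on cols."""
    yc = center(y)
    sst = sum(v * v for v in yc)
    if not cols or sst == 0:
        return 0.0
    xc = [center(c) for c in cols]
    # Normal equations with a tiny ridge term to avoid a singular system.
    gram = [[sum(a * b for a, b in zip(ci, cj)) + (1e-9 if i == j else 0.0)
             for j, cj in enumerate(xc)] for i, ci in enumerate(xc)]
    rhs = [sum(a * b for a, b in zip(ci, yc)) for ci in xc]
    beta = solve(gram, rhs)
    ssr = sum((yv - sum(b * ci[t] for b, ci in zip(beta, xc))) ** 2
              for t, yv in enumerate(yc))
    return 1.0 - ssr / sst


def umfi_scores(cols, y):
    """Score each feature by the evaluation gain from adding it to the
    other features after they have been cleaned of dependence on it."""
    scores = []
    for i, xi in enumerate(cols):
        cleaned = [residualize(c, xi) for j, c in enumerate(cols) if j != i]
        scores.append(r_squared(cleaned + [xi], y) - r_squared(cleaned, y))
    return scores


# Toy data: y depends on x1 only; x2 is an unrelated feature.
random.seed(0)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [2 * a + 0.1 * random.gauss(0, 1) for a in x1]
scores = umfi_scores([x1, x2], y)
print(scores)
```

In this sketch each feature needs one dependence-removal pass and two model fits, so the number of fits grows linearly with the number of features instead of with the number of feature subsets. The linear residuals and the OLS evaluation are stand-ins; a nonlinear removal step or a different learner could be swapped in without changing the outer loop.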

Subjects:	Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Applications (stat.AP)
Cite as:	arXiv:2204.09938 [stat.ML]
	(or arXiv:2204.09938v5 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2204.09938
Journal reference:	In International Conference on Artificial Intelligence and Statistics (pp. 10782-10814). PMLR, 2023

Submission history

From: Joseph Janssen
[v1] Thu, 21 Apr 2022 07:54:58 UTC (6,026 KB)
[v2] Mon, 13 Jun 2022 17:33:12 UTC (14,009 KB)
[v3] Sun, 17 Jul 2022 05:10:49 UTC (14,337 KB)
[v4] Thu, 16 Feb 2023 23:25:26 UTC (14,084 KB)
[v5] Mon, 11 Nov 2024 08:34:20 UTC (5,163 KB)
