
Finding Optimal Diverse Feature Sets with Alternative Feature Selection

Bach, Jakob 1
1 Institut für Programmstrukturen und Datenorganisation (IPD), Karlsruher Institut für Technologie (KIT)

Abstract:

Feature selection is popular for obtaining small, interpretable, yet highly accurate prediction models. Conventional feature-selection methods typically yield only one feature set, which might not suffice in some scenarios. For example, users might be interested in finding alternative feature sets with similar prediction quality, offering different explanations of the data. In this article, we introduce alternative feature selection and formalize it as an optimization problem. In particular, we define alternatives via constraints and enable users to control the number and dissimilarity of alternatives. We consider sequential as well as simultaneous search for alternatives. Next, we discuss how to integrate conventional feature-selection methods as objectives and describe solver-based search methods to tackle the optimization problem. Further, we analyze the complexity of this optimization problem and prove NP-hardness. Additionally, we show that a constant-factor approximation exists under certain conditions and propose corresponding heuristic search methods. Finally, we evaluate alternative feature selection in comprehensive experiments with 30 binary-classification datasets. ...
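To make the idea of sequential search under a dissimilarity constraint concrete, here is a minimal sketch. It is not the method from the report, which relies on solver-based and heuristic search; it is a brute-force illustration under assumptions of our own. The function names, the made-up integer feature qualities, the additive objective, the fixed set size `k`, and the choice of the Dice dissimilarity with threshold `tau` are all hypothetical and do not come from the abstract.

```python
from itertools import combinations


def dice_dissimilarity(a, b):
    """Return 1 - 2|a ∩ b| / (|a| + |b|) for two non-empty feature sets."""
    return 1 - 2 * len(a & b) / (len(a) + len(b))


def sequential_alternatives(qualities, k, num_alternatives, tau):
    """Find a best feature set of size k, then alternatives one at a time.

    Assumptions (illustrative only): the objective is the sum of
    per-feature qualities, and each new set must have a Dice
    dissimilarity of at least tau to every previously found set.
    The search is exhaustive, so it only suits tiny examples.
    """
    found = []
    for _ in range(num_alternatives + 1):
        best, best_score = None, float("-inf")
        for candidate in combinations(range(len(qualities)), k):
            candidate = frozenset(candidate)
            # Constraint: sufficiently dissimilar to all earlier sets.
            if any(dice_dissimilarity(candidate, prev) < tau for prev in found):
                continue
            score = sum(qualities[j] for j in candidate)
            if score > best_score:
                best, best_score = candidate, score
        if best is None:  # No feasible alternative remains.
            break
        found.append(best)
    return found


# Made-up feature qualities for six features (indices 0 to 5).
qualities = [9, 8, 7, 5, 3, 1]
feature_sets = sequential_alternatives(qualities, k=2, num_alternatives=2, tau=0.5)
for i, s in enumerate(feature_sets):
    print(i, sorted(s), sum(qualities[j] for j in s))
```

In this sketch, `num_alternatives` controls how many alternatives are requested and `tau` controls how strongly they must differ. Raising `tau` to 1.0 would force the sets to be disjoint, typically at the cost of lower summed quality for later alternatives.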


Full text (Version 2)
DOI: 10.5445/IR/1000160813/v2
Published on 14.02.2024
Full text (Version 1)
DOI: 10.5445/IR/1000160813
Published on 24.07.2023
Associated institution(s) at KIT: Institut für Programmstrukturen und Datenorganisation (IPD)
Publication type: Research report/Preprint
Publication date: 21.07.2023
Language: English
Identifier: KITopen-ID: 1000160813
Publisher: arXiv
Extent: 82 pp.
External relations: Research data/Software
Keywords: feature selection, alternatives, constraints, mixed-integer programming, explainability, interpretability, XAI
Indexed in: arXiv, Dimensions