Alternative feature selection with user control

Bach, Jakob 1,2; Böhm, Klemens 1,2
1 Institut für Programmstrukturen und Datenorganisation (IPD), Karlsruher Institut für Technologie (KIT)
2 Fakultät für Informatik (INFORMATIK), Karlsruher Institut für Technologie (KIT)

Abstract:

Feature selection is popular for obtaining small, interpretable, yet highly accurate prediction models. Conventional feature-selection methods typically yield one feature set only, which does not suffice in certain scenarios. For example, users might be interested in finding alternative feature sets with similar prediction quality, offering different explanations of the data. In this article, we introduce alternative feature selection and formalize it as an optimization problem. In particular, we define alternatives via constraints and enable users to control the number and dissimilarity of alternatives. Next, we analyze the complexity of this optimization problem and show NP-hardness. Further, we discuss how to integrate conventional feature-selection methods as objectives. Finally, we evaluate alternative feature selection in comprehensive experiments with 30 datasets representing binary-classification problems. We observe that alternative feature sets may indeed have high prediction quality, and we analyze factors influencing this outcome.
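
As a rough illustration of the problem setup described in the abstract, the following Python sketch searches for alternative feature sets one after another, where each new set must be sufficiently dissimilar to all sets found before it. It is not the authors' implementation. The Dice-based dissimilarity, the threshold name `tau`, the objective (a sum of per-feature quality scores), the exhaustive search, and the toy scores are all assumptions made for illustration. The exhaustive search is exponential in the number of features and only suits tiny examples.

```python
from itertools import combinations


def dice_dissimilarity(a, b):
    """Return 1 minus the Dice coefficient of two feature sets (assumed measure)."""
    return 1 - 2 * len(a & b) / (len(a) + len(b))


def best_set(quality, k, existing, tau):
    """Exhaustively find the size-k feature set with the highest summed quality
    whose dissimilarity to every set in `existing` is at least `tau`."""
    best, best_score = None, float("-inf")
    for combo in combinations(sorted(quality), k):
        candidate = frozenset(combo)
        # Constraint: the candidate must differ enough from all earlier sets.
        if any(dice_dissimilarity(candidate, s) < tau for s in existing):
            continue
        score = sum(quality[f] for f in candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def sequential_alternatives(quality, k, num_alternatives, tau):
    """Find one original set plus up to `num_alternatives` alternatives.
    The user controls the number (num_alternatives) and dissimilarity (tau)."""
    results = []
    for _ in range(num_alternatives + 1):
        found, score = best_set(quality, k, [s for s, _ in results], tau)
        if found is None:  # no feasible set left under the constraints
            break
        results.append((found, score))
    return results


# Hypothetical per-feature quality scores (e.g., from a univariate filter).
quality = {"f1": 0.9, "f2": 0.8, "f3": 0.7, "f4": 0.6, "f5": 0.5, "f6": 0.4}

# tau=1.0 forces alternatives to share no feature with earlier sets.
sets = sequential_alternatives(quality, k=2, num_alternatives=2, tau=1.0)
for features, _ in sets:
    print(sorted(features))
# prints ['f1', 'f2'], then ['f3', 'f4'], then ['f5', 'f6']
```

Lowering `tau` allows overlap between sets: with `tau=0.5` and sets of size two, an alternative may share one feature with each earlier set, which typically raises its objective value.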


Publisher's version (Version 2)
DOI: 10.5445/IR/1000169613/v2
Published on 27 May 2024
Publisher's version (Version 1)
DOI: 10.5445/IR/1000169613
Published on 27 March 2024
Affiliated institution(s) at KIT: Institut für Programmstrukturen und Datenorganisation (IPD)
Publication type: Journal article
Publication year: 2024
Language: English
Identifier: ISSN: 2364-415X, 2364-4168
KITopen-ID: 1000169613
Published in: International Journal of Data Science and Analytics
Publisher: Springer
Published online in advance on 26 March 2024
External relations: Research data/software
Keywords: Feature selection, Alternatives, Constraints, Mixed-integer programming, Explainability, Interpretability, XAI
Indexed in: Dimensions, Scopus