
Subgroup Discovery with Small and Alternative Feature Sets

Bach, Jakob

Abstract:

Subgroup-discovery methods find interesting regions in a dataset. In this article, we analyze two constraint types to enhance the interpretability of subgroups: First, we make subgroup descriptions small by limiting the number of features used. Second, we propose the novel problem of finding alternative subgroup descriptions, which cover a similar set of data objects as a given subgroup but use different features. We describe how to integrate both constraint types into heuristic subgroup-discovery methods as well as a novel Satisfiability Modulo Theories (SMT) formulation, which enables a solver-based search for subgroups. Further, we prove NP-hardness of optimization with either constraint type. Finally, we evaluate unconstrained and constrained subgroup discovery with 27 binary-classification datasets. We observe that heuristic search methods often yield high-quality subgroups fast, even with constraints.
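The two constraint types from the abstract can be illustrated with a small brute-force sketch. This is not the article's SMT formulation or one of its heuristic search methods. It assumes that a subgroup description is a set of closed intervals over numeric features, that subgroup quality is measured by weighted relative accuracy (WRAcc), and that similarity between subgroups is measured by the Jaccard index over their members. All function names and the toy data are hypothetical.

```python
from itertools import combinations, product


def membership(X, bounds):
    """Return one boolean per data object: True if every bounded feature lies in its interval.

    `bounds` maps a feature index to a (lower, upper) pair; unbounded features are unrestricted.
    """
    return [all(lo <= row[j] <= hi for j, (lo, hi) in bounds.items()) for row in X]


def wracc(members, y):
    """Weighted relative accuracy: coverage * (positive rate in subgroup - overall positive rate)."""
    n, n_sub = len(y), sum(members)
    if n_sub == 0:
        return 0.0
    pos_sub = sum(1 for m, label in zip(members, y) if m and label == 1)
    return n_sub / n * (pos_sub / n_sub - sum(y) / n)


def jaccard(a, b):
    """Jaccard similarity between two boolean membership lists."""
    inter = sum(1 for p, q in zip(a, b) if p and q)
    union = sum(1 for p, q in zip(a, b) if p or q)
    return inter / union if union else 1.0


def candidate_descriptions(X, features, k):
    """Yield every description that bounds at most k of `features` (assumed enumeration:
    interval bounds are taken from the values observed in the data)."""
    for size in range(1, k + 1):
        for subset in combinations(features, size):
            intervals = []
            for j in subset:
                values = sorted({row[j] for row in X})
                intervals.append([(lo, hi) for lo in values for hi in values if lo <= hi])
            for choice in product(*intervals):
                yield dict(zip(subset, choice))


def best_subgroup(X, y, k):
    """Constraint type 1: exhaustive search for the best description using at most k features."""
    features = range(len(X[0]))
    return max(candidate_descriptions(X, features, k),
               key=lambda b: wracc(membership(X, b), y))


def alternative_description(X, original, k):
    """Constraint type 2: cover a similar set of objects while using none of the original features."""
    target = membership(X, original)
    unused = [j for j in range(len(X[0])) if j not in original]
    return max(candidate_descriptions(X, unused, k),
               key=lambda b: jaccard(membership(X, b), target))


# Toy data (made up): feature 1 is a rescaled copy of feature 0, feature 2 is noise.
X = [
    [1, 10, 5],
    [2, 20, 1],
    [3, 30, 4],
    [4, 40, 2],
    [5, 50, 6],
    [6, 60, 3],
]
y = [0, 0, 1, 1, 1, 0]

original = best_subgroup(X, y, k=1)
alternative = alternative_description(X, original, k=1)
print(original)     # {0: (3, 5)}
print(alternative)  # {1: (30, 50)}
```

The enumeration grows combinatorially with the number of features and distinct values, so it is only practical for tiny examples; the abstract notes that optimization with either constraint type is NP-hard, which motivates the heuristic and solver-based searches studied in the article.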


Publisher's version
DOI: 10.5445/IR/1000182481
Published on 20 June 2025
Affiliated institution(s) at KIT: Institut für Programmstrukturen und Datenorganisation (IPD)
Publication type: Journal article
Publication date: 18 June 2025
Language: English
Identifier: ISSN: 2836-6573
KITopen-ID: 1000182481
Published in: Proceedings of the ACM on Management of Data
Publisher: Association for Computing Machinery (ACM)
Volume: 3
Issue: 3
Pages: 221
External relations: See also
Research data/software
Keywords: Subgroup Discovery, Constraints, Feature Selection, Alternatives, Satisfiability Modulo Theories, Explainability, Interpretability, XAI
Indexed in: Dimensions
OpenAlex