
Selective Concept Bottleneck Models Without Predefined Concepts

Schrodi, Simon; Schur, Julian; Argus, Max; Brox, Thomas

Abstract:

Concept-based models like Concept Bottleneck Models (CBMs) have garnered significant interest for improving model interpretability by first predicting human-understandable concepts before mapping them to the output classes. Early approaches required costly concept annotations. To alleviate this, recent methods used large language models to automatically generate class-specific concept descriptions and learned mappings from a pretrained black-box model's raw features to these concepts via vision-language models. However, these approaches assume prior knowledge of which concepts the black-box model has learned. In this work, we instead discover the concepts encoded by the model through unsupervised concept discovery techniques. We further propose an input-dependent concept selection mechanism that dynamically retains a sparse set of relevant concepts for each input, enhancing both sparsity and interpretability. Our approach not only improves downstream performance but also needs significantly fewer concepts for accurate classification. Lastly, we show how large vision-language models can guide the editing of our models' weights to correct errors.
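To make the described mechanism concrete, the following is a minimal, hypothetical PyTorch sketch of an input-dependent sparse concept bottleneck: features from a frozen black-box backbone are projected onto a set of discovered concept directions, only the k strongest concepts are retained per input, and a linear head maps the sparse concept vector to classes. The class name, the top-k gating rule, and the toy dimensions are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class SparseConceptHead(nn.Module):
    """Hypothetical sketch: project backbone features onto discovered
    concept directions, keep only the top-k concepts per input, and
    classify from the resulting sparse concept vector."""

    def __init__(self, concept_basis: torch.Tensor, num_classes: int, k: int = 8):
        super().__init__()
        # concept_basis: (num_concepts, feature_dim), assumed to come from an
        # unsupervised decomposition of the frozen backbone's features.
        self.register_buffer("concept_basis", concept_basis)
        self.k = k
        self.classifier = nn.Linear(concept_basis.shape[0], num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, feature_dim) from a frozen black-box backbone
        concept_scores = features @ self.concept_basis.T           # (batch, num_concepts)
        # Input-dependent selection: retain the k strongest concepts per sample
        topk = torch.topk(concept_scores.abs(), self.k, dim=-1)
        mask = torch.zeros_like(concept_scores).scatter_(-1, topk.indices, 1.0)
        sparse_concepts = concept_scores * mask                    # sparse concept bottleneck
        return self.classifier(sparse_concepts)

# Toy usage with random stand-ins for backbone features and a concept basis.
if __name__ == "__main__":
    basis = torch.randn(64, 512)     # 64 discovered concepts in a 512-d feature space
    head = SparseConceptHead(basis, num_classes=10, k=8)
    logits = head(torch.randn(4, 512))
    print(logits.shape)              # torch.Size([4, 10])

The hard top-k gate above is only one way to realize per-input sparsity; the paper's selection mechanism may score and prune concepts differently.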


Preprint
DOI: 10.5445/IR/1000184076
Published on 21.08.2025
Affiliated institution(s) at KIT: Karlsruher Institut für Technologie (KIT)
Publication type: Journal article
Publication month/year: 05.2025
Language: English
Identifier: ISSN 2835-8856
KITopen-ID: 1000184076
Published in: Transactions on Machine Learning Research
Publisher: OpenReview.net
Volume: 2025
Pages: 1-36
Publication note: The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, 24th-28th April 2025
External relations: Abstract/Full text
Indexed in: Scopus