Finding the needle in the haystack—An interpretable sequential pattern mining method for classification problems

Grote, Alexander 1; Hariharan, Anuja 1; Weinhardt, Christof 1
1 Institut für Wirtschaftsinformatik (WIN), Karlsruher Institut für Technologie (KIT)

Abstract:

Introduction: The analysis of discrete sequential data, such as event logs and customer clickstreams, is often challenged by the vast number of possible sequential patterns. This complexity makes it difficult to identify meaningful sequences and derive actionable insights.

Methods: We propose a novel feature selection algorithm that integrates unsupervised sequential pattern mining with supervised machine learning. Unlike existing interpretable machine learning methods, our approach determines important sequential patterns during the mining process, eliminating the need for post-hoc classification to assess their relevance. In contrast to existing interestingness measures, the measure we introduce is local, class-specific, and inherently interpretable.

Results: We evaluated the algorithm on three diverse datasets (churn prediction, malware sequence analysis, and a synthetic dataset) covering different sizes, application domains, and feature complexities. Our method achieved classification performance comparable to established feature selection algorithms while maintaining interpretability and reducing computational costs.

Discussion: This study demonstrates a practical and efficient approach for uncovering important sequential patterns in classification tasks. ...


Publisher's version
DOI: 10.5445/IR/1000186246
Published on 29.10.2025
Affiliated institution(s) at KIT: Institut für Wirtschaftsinformatik (WIN)
Publication type: Journal article
Publication year: 2025
Language: English
Identifier: ISSN: 2624-909X
KITopen-ID: 1000186246
Published in: Frontiers in Big Data
Publisher: Frontiers Media SA
Volume: 8
Pages: Article no. 1604887
Published online in advance on: 24.10.2025
Indexed in: Scopus, Dimensions, OpenAlex