Estimating Dependency, Monitoring and Knowledge Discovery in High-Dimensional Data Streams

Fouché, Edouard

Abstract:
Data Mining, known as the process of extracting knowledge from massive data sets, has a phenomenal impact on our society and now affects nearly every aspect of our lives: from the layout of our local grocery store, to the ads and product recommendations we receive, the availability of treatments for common diseases, the prevention of crime, or the efficiency of industrial production processes.
However, Data Mining remains difficult when (1) data is high-dimensional, i.e., has many attributes, and when (2) data comes as a stream. Extracting knowledge from high-dimensional data streams is impractical because one must cope with two orthogonal sets of challenges. On the one hand, the effects of the so-called "curse of dimensionality" bog down the performance of statistical methods and lead to increasingly complex Data Mining problems. On the other hand, the statistical properties of data streams may evolve in unexpected ways, a phenomenon known in the community as "concept drift". Thus, one needs to update one's knowledge about the data over time, i.e., to monitor the stream.
While previous work addresses high-dimensional data sets and data streams to some extent, the intersection of both has received much less attention. ...

DOI: 10.5445/IR/1000127232
Published on: 08.12.2020
Affiliated institution(s) at KIT: Institut für Programmstrukturen und Datenorganisation (IPD)
Publication type: Thesis
Publication date: 08.12.2020
Language: English
Identifier: KITopen-ID: 1000127232
Publisher: Karlsruher Institut für Technologie (KIT)
Extent: X, 164 pp.
Type of thesis: Dissertation
Faculty: Fakultät für Informatik (INFORMATIK)
Institute: Institut für Programmstrukturen und Datenorganisation (IPD)
Examination date: 15.07.2020
Referee/Supervisor: Prof. K. Böhm
Keywords: Data Mining, Data Stream Monitoring, Multivariate Statistics, Online Learning Algorithms, Predictive Maintenance, Anomaly Detection