
Efficient Generation of Hidden Outliers for Improved Outlier Detection

Cribeiro-Ramallo, Jose; Arzamasov, Vadim 1; Böhm, Klemens 2
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)
2 Institut für Programmstrukturen und Datenorganisation (IPD), Karlsruher Institut für Technologie (KIT)

Abstract:

Outlier generation is a popular technique for solving important outlier detection tasks. Generating outliers with realistic behavior is challenging. Popular existing methods tend to disregard the 'multiple views' property of outliers in high-dimensional spaces, and the only existing method that accounts for this property falls short in efficiency and effectiveness. We propose BISECT, a new outlier generation method that creates realistic outliers mimicking this property. To do so, BISECT relies on a novel proposition, introduced in this article, that states how to generate such realistic outliers efficiently. Our method has better guarantees and complexity than the current methodology for recreating 'multiple views'. We use the synthetic outliers generated by BISECT to effectively enhance outlier detection in diverse datasets, for multiple use cases. For instance, oversampling with BISECT reduced the error by up to a factor of 3 compared with the baselines.


DOI: 10.5445/IR/1000176780
Published on: 2 December 2024
Affiliated institution(s) at KIT: Institut für Programmstrukturen und Datenorganisation (IPD); Institut für Theoretische Informatik (ITI)
Publication type: Research report/Preprint
Publication year: 2024
Language: English
Identifier: KITopen-ID: 1000176780
Publisher: arXiv
Keywords: Machine Learning (cs.LG)
Indexed in: Dimensions