
Accelerated Computation of a High Dimensional Kolmogorov-Smirnov Distance

Hagen, Alex; Jackson, Shane; Kahn, James; Strube, Jan; Haide, Isabel; Pazdernik, Karl; Hainje, Connor

Abstract:

Statistical testing is widespread and critical for a variety of scientific disciplines. The advent of machine learning and the increase in computing power have increased interest in the analysis and statistical testing of multidimensional data. We extend the powerful Kolmogorov-Smirnov two sample test to a high dimensional form in a similar manner to Fasano (Fasano, 1987). We call our result the d-dimensional Kolmogorov-Smirnov test (ddKS) and make three novel contributions: we develop an analytical equation for the significance of a given ddKS score; we provide an algorithm for computing ddKS on modern computing hardware that is of constant time complexity for small sample sizes and dimensions; and we provide two approximate calculations of ddKS, one that reduces the time complexity to linear at larger sample sizes and another that reduces the time complexity to linear with increasing dimension. We perform a power analysis of ddKS and its approximations on a corpus of datasets and compare them to other common high dimensional two sample tests and distances: Hotelling's T^2 test and the Kullback-Leibler divergence. Our ddKS test performs well for all datasets, dimensions, and sizes tested, whereas the other tests and distances fail to reject the null hypothesis on at least one dataset. ...
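
To make the idea of a d-dimensional Kolmogorov-Smirnov distance concrete, the following is a minimal brute-force sketch in Python. It is not the accelerated algorithm or either approximation described in the abstract. It assumes the orthant-counting construction commonly associated with Fasano's extension: every point of either sample is used as an origin, space is split into 2^d orthants around it, and the distance is the largest difference between the two samples' fractions in any orthant. The function names `orthant_fractions` and `naive_ddks` are hypothetical, and both samples are assumed to be non-empty sequences of equal-length tuples.

```python
from itertools import product


def orthant_fractions(origin, sample):
    """Return the fraction of `sample` in each of the 2^d orthants around `origin`.

    An orthant is keyed by a tuple of booleans, one per dimension,
    where True means the coordinate is >= the origin's coordinate.
    """
    d = len(origin)
    counts = {signs: 0 for signs in product((False, True), repeat=d)}
    for point in sample:
        signs = tuple(x >= o for x, o in zip(point, origin))
        counts[signs] += 1
    n = len(sample)
    return {signs: c / n for signs, c in counts.items()}


def naive_ddks(sample_a, sample_b):
    """Brute-force d-dimensional KS-style distance (illustrative assumption).

    Uses every point of both samples as an origin and returns the maximum
    absolute difference in orthant fractions between the two samples.
    """
    best = 0.0
    for origin in list(sample_a) + list(sample_b):
        frac_a = orthant_fractions(origin, sample_a)
        frac_b = orthant_fractions(origin, sample_b)
        diff = max(abs(frac_a[s] - frac_b[s]) for s in frac_a)
        best = max(best, diff)
    return best


if __name__ == "__main__":
    a = [(0, 0), (1, 1)]
    b = [(10, 10), (11, 11)]
    print(naive_ddks(a, a))  # identical samples
    print(naive_ddks(a, b))  # fully separated samples
```

This sketch visits every origin and every sample point, so its cost grows quadratically with sample size and exponentially with dimension through the 2^d orthants, which is the kind of scaling the accelerated and approximate methods in the abstract are meant to avoid.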


Affiliated institution(s) at KIT: Institut für Experimentelle Teilchenphysik (ETP); Scientific Computing Center (SCC); Universität Karlsruhe (TH) – Zentrale Einrichtungen (Zentrale Einrichtungen)
Publication type: Research report/Preprint
Publication month/year: 06.2021
Language: English
Identifier: KITopen-ID: 1000137223
HGF program: 46.21.04 (POF IV, LK 01) HAICU
Series: arXiv e-prints
Published online in advance on: 25.06.2021
Indexed in: arXiv