
Clustering through High Dimensional Data Scaling: Applications and Implementations

Murtagh, Fionn; Contreras, Pedro

Abstract: To analyse very high dimensional data, or large data volumes, we study random projection. Since hierarchically clustered data can be scaled in one dimension, seriation or unidimensional scaling is our primary objective. Having determined a unidimensional scaling of the multidimensional data cloud, this is followed by clustering. In many past case studies we carried out such clustering, using the Baire, or longest common prefix, metric and, simultaneously, ultrametric. In this paper, we examine properties of the seriation, and of the induction of the clustering on the data summarization, through seriation. Simulations are described as well as a small, illustrative example using Fisher’s iris data.

Affiliated institution(s) at KIT: Institut für Informationswirtschaft und Marketing (IISM)
Publication type: Journal article
Year: 2017
Language: English
Identifiers
DOI: 10.5445/KSP/1000058749/08
ISSN: 2363-9881
URN: urn:nbn:de:swb:90-669259
KITopen ID: 1000066925
Published in: Archives of Data Science Series A (Online First)
Volume: 2
Issue: 1
Pages: 16 pp., online
License: CC BY-SA 4.0 (Creative Commons Attribution-ShareAlike 4.0 International)