KIT | KIT-Bibliothek | Impressum | Datenschutz

Probabilistic Two-way Clustering Approaches with Emphasis on the Maximum Interaction Criterion

Bock, Hans-Hermann

Abstract:

We consider the problem of simultaneously and optimally clustering the rows and columns of a real-valued I x J data matrix X = (xi j) by corresponding row and columns partitions A = (A1; :::;Am) and B = (B1; :::;Bn), with given m and n. We emphasize the need to base the clustering method on a probabilistic model for the data and then to use standard methods from statistics (e.g., maximum likelihood, divergence) to characterize optimum two-way classifications. We survey some clustering criteria and algorithms proposed in the literature for various data types. Special emphasis is given to the maximum interaction clustering criterion proposed by the author in 1980. It can be shown that it results as the maximum likelihood clustering method under a two-way ANOVA model (with individual main effects, but cluster-specific interactions). After a simple data transformation (double-centering) well-known two-way SSQ clustering algorithms can directly be used for maximization.


Volltext §
DOI: 10.5445/KSP/1000058747/01
Cover der Publikation
Zugehörige Institution(en) am KIT Fakultät für Wirtschaftswissenschaften – Institut für Informationswirtschaft und Marketing (IISM)
Publikationstyp Zeitschriftenaufsatz
Publikationsjahr 2016
Sprache Englisch
Identifikator ISSN: 2363-9881
urn:nbn:de:swb:90-677594
KITopen-ID: 1000067759
Erschienen in Archives of Data Science, Series A
Band 1
Heft 1
Seiten 3-20
KIT – Die Forschungsuniversität in der Helmholtz-Gemeinschaft
KITopen Landing Page