Probabilistic Two-way Clustering Approaches with Emphasis on the Maximum Interaction Criterion

Bock, Hans-Hermann

doi:10.5445/KSP/1000058747/01

Abstract:

We consider the problem of simultaneously and optimally clustering the rows and columns of a real-valued I × J data matrix X = (x_ij) by corresponding row and column partitions A = (A_1, ..., A_m) and B = (B_1, ..., B_n), with given m and n. We emphasize the need to base the clustering method on a probabilistic model for the data and then to use standard methods from statistics (e.g., maximum likelihood, divergence) to characterize optimum two-way classifications. We survey some clustering criteria and algorithms proposed in the literature for various data types. Special emphasis is given to the maximum interaction clustering criterion proposed by the author in 1980. It can be shown to be the maximum likelihood clustering method under a two-way ANOVA model (with individual main effects, but cluster-specific interactions). After a simple data transformation (double-centering), well-known two-way SSQ clustering algorithms can be used directly for maximization.
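The last step of the abstract can be illustrated with a minimal sketch. It assumes that double-centering means subtracting row and column means and adding back the grand mean, and it uses a generic k-means-style alternating reassignment as a stand-in for a two-way SSQ algorithm; it is not taken from the paper. All function names (`double_center`, `block_means`, `two_way_ssq`, `interaction_criterion`) are hypothetical.

```python
import numpy as np

def double_center(x):
    """Subtract row and column means and add back the grand mean."""
    x = np.asarray(x, dtype=float)
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()

def block_means(y, rows, cols, m, n):
    """Mean of y within each of the m x n blocks (0 for empty blocks)."""
    r = np.eye(m)[rows]  # I x m row-cluster indicator matrix
    c = np.eye(n)[cols]  # J x n column-cluster indicator matrix
    sums = r.T @ y @ c
    counts = np.outer(r.sum(axis=0), c.sum(axis=0))
    return sums / np.maximum(counts, 1.0)

def two_way_ssq(y, m, n, n_iter=50, seed=0):
    """Heuristic alternating minimization of the within-block sum of squares.

    Assumption: a plain k-means-style reassignment loop, started from
    random labels; it may stop in a local optimum.
    """
    rng = np.random.default_rng(seed)
    rows = rng.integers(m, size=y.shape[0])
    cols = rng.integers(n, size=y.shape[1])
    for _ in range(n_iter):
        mu = block_means(y, rows, cols, m, n)
        # Squared error of every row against every row-cluster profile.
        d_rows = ((y[:, None, :] - mu[:, cols][None, :, :]) ** 2).sum(axis=2)
        new_rows = d_rows.argmin(axis=1)
        mu = block_means(y, new_rows, cols, m, n)
        # Same step for the columns, using the updated row labels.
        d_cols = ((y.T[:, None, :] - mu.T[:, new_rows][None, :, :]) ** 2).sum(axis=2)
        new_cols = d_cols.argmin(axis=1)
        if np.array_equal(new_rows, rows) and np.array_equal(new_cols, cols):
            break
        rows, cols = new_rows, new_cols
    return rows, cols

def interaction_criterion(x, rows, cols, m, n):
    """Sum over blocks of |A_i| |B_j| times the squared block mean of the
    double-centered data (our reading of the interaction criterion)."""
    y = double_center(x)
    mu = block_means(y, rows, cols, m, n)
    counts = np.outer(np.bincount(rows, minlength=m), np.bincount(cols, minlength=n))
    return float((counts * mu ** 2).sum())
```

For a fixed double-centered matrix, the total sum of squares splits into the within-block SSQ plus the value returned by `interaction_criterion`, so minimizing the former over partitions is the same as maximizing the latter. This is why, under the assumptions above, an SSQ routine can be reused after the transformation.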


Affiliated institution(s) at KIT	Fakultät für Wirtschaftswissenschaften – Institut für Informationswirtschaft und Marketing (IISM)
Publication type	Journal article
Publication year	2016
Language	English
Identifier	ISSN: 2363-9881 urn:nbn:de:swb:90-677594 KITopen-ID: 1000067759
Published in	Archives of Data Science, Series A
Volume	1
Issue	1
Pages	3-20

Repository KITopen
