Decision Trees for the Imputation of Categorical Data

Rockel, Tobias; Joenssen, Dieter William; Bankhofer, Udo

doi:10.5445/KSP/1000058749/14

Abstract:

In principle, any prediction model can be used to resolve the problem of missing data via imputation. In the field of machine learning, a well-known type of prediction model is the decision tree. However, the literature on how suitable decision trees are for imputation is still scant. Therefore, the aim of this paper is to analyze the imputation quality of decision trees. Furthermore, we present a way to conduct stochastic imputation using decision trees. We ran a simulation study to compare the deterministic and stochastic imputation approaches using decision trees with each other and with other imputation methods. The study uses real datasets and various missing data settings, and considers three different quality criteria. The results indicate that the choice of imputation method should be based on the intended analysis.

Affiliated institution(s) at KIT	Fakultät für Wirtschaftswissenschaften – Institut für Informationswirtschaft und Marketing (IISM)
Publication type	Journal article
Publication year	2017
Language	English
Identifier	ISSN: 2363-9881 urn:nbn:de:swb:90-687708 KITopen-ID: 1000068770
Published in	Archives of Data Science, Series A (Online First)
Volume	2
Issue	1
Pages	15 pp., online
Indexed in	OpenAlex