
Forests of Stumps

Alharthi, Amirah; Taylor, Charles C.; Voss, Jochen

Abstract:

Many numerical studies (Hansen and Salamon (1990), Schapire (1990)) indicate that bagged decision stumps are more accurate than a single stump. In this work, we investigate two approaches to creating a forest of stumps for classification. The first method is bagging with stumps, that is, growing each stump on a different bootstrap sample drawn from the training dataset. The second method is Gini-sampled stumps, where we sample split points with probability proportional to the Gini index. These two methods are combined with two aggregation methods: majority vote and weighted vote. We use simulation studies to compare the performance and computing time of the two methods. Generating split points with Gini-sampled stumps takes less than half the time needed to generate split points from bootstrap samples. In addition, weighted-vote aggregation is more accurate than majority-vote aggregation.
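The two constructions and the two aggregation rules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: binary 0/1 labels, sampling probabilities taken proportional to the decrease in Gini impurity of each candidate split, and vote weights set to each stump's training accuracy. All function names (`candidate_splits`, `bagged_forest`, `gini_sampled_forest`, `forest_predict`) are hypothetical.

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary 0/1 label vector."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 2.0 * p * (1.0 - p)

def candidate_splits(X, y):
    """List of (feature, threshold, gain); gain = decrease in Gini impurity."""
    n, out = len(y), []
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        for t in (values[:-1] + values[1:]) / 2.0:  # midpoints between values
            left = X[:, j] <= t
            children = (left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])) / n
            out.append((j, t, gini(y) - children))
    return out

def fit_leaves(X, y, j, t):
    """Stump (feature, threshold, left label, right label) by majority label per side."""
    left = X[:, j] <= t
    return (j, t, int(y[left].mean() >= 0.5), int(y[~left].mean() >= 0.5))

def stump_predict(stump, X):
    j, t, lab_left, lab_right = stump
    return np.where(X[:, j] <= t, lab_left, lab_right)

def bagged_forest(X, y, n_stumps, rng):
    """Bagging: the best-gain stump is grown on each bootstrap sample."""
    forest = []
    for _ in range(n_stumps):
        idx = rng.integers(0, len(y), size=len(y))  # sample rows with replacement
        j, t, _ = max(candidate_splits(X[idx], y[idx]), key=lambda s: s[2])
        forest.append(fit_leaves(X[idx], y[idx], j, t))
    return forest

def gini_sampled_forest(X, y, n_stumps, rng):
    """Assumed reading: draw splits with probability proportional to Gini gain."""
    splits = candidate_splits(X, y)  # scanned once on the full training set
    gains = np.clip([s[2] for s in splits], 0.0, None)  # guard against rounding
    chosen = rng.choice(len(splits), size=n_stumps, p=gains / gains.sum())
    return [fit_leaves(X, y, splits[i][0], splits[i][1]) for i in chosen]

def forest_predict(forest, X, weights=None):
    """Majority vote if weights is None, otherwise a weighted vote."""
    votes = np.array([stump_predict(s, X) for s in forest])  # (n_stumps, n_rows)
    w = np.ones(len(forest)) if weights is None else np.asarray(weights)
    return (w @ votes / w.sum() >= 0.5).astype(int)

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

bag = bagged_forest(X, y, 25, rng)
gs = gini_sampled_forest(X, y, 25, rng)
# Assumed weighting scheme: training accuracy of each stump.
w = np.array([np.mean(stump_predict(s, X) == y) for s in gs])
pred_bag = forest_predict(bag, X)
pred_majority = forest_predict(gs, X)
pred_weighted = forest_predict(gs, X, w)
```

In this sketch the candidate splits are scanned once for the Gini-sampled forest but once per bootstrap sample for the bagged forest; whether that matches the source of the timing difference reported in the paper is not stated in the abstract.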


Publisher's edition
DOI: 10.5445/KSP/1000087327/30
Affiliated institution(s) at KIT: Institut für Wirtschaftsinformatik und Marketing (IISM)
Publication type: Journal article
Publication year: 2018
Language: English
Identifier: ISSN: 2363-9881
KITopen-ID: 1000133322
Published in: Archives of Data Science, Series A
Volume: 5
Issue: 1
Pages: P30, 22 pp., online