
Evaluating probabilistic classifiers: The triptych

Dimitriadis, Timo; Gneiting, Tilmann 1; Jordan, Alexander I.; Vogel, Peter
1 Institut für Stochastik (STOCH), Karlsruher Institut für Technologie (KIT)

Abstract:

Probability forecasts for binary outcomes, often referred to as probabilistic classifiers or confidence scores, are ubiquitous in science and society, and methods for evaluating and comparing them are in great demand. We propose and study a triptych of diagnostic graphics focusing on distinct and complementary aspects of forecast performance: Reliability curves address calibration, receiver operating characteristic (ROC) curves diagnose discrimination ability, and Murphy curves visualize overall predictive performance and value. A Murphy curve shows a forecast’s mean elementary scores, including the widely used misclassification rate, and the area under a Murphy curve equals the mean Brier score. For a calibrated forecast, the reliability curve lies on the diagonal, and for competing calibrated forecasts, the ROC and Murphy curves share the same number of crossing points. We invoke the recently developed CORP (Consistent, Optimally binned, Reproducible, and Pool-Adjacent-Violators (PAV) algorithm-based) approach to craft reliability curves and decompose a mean score into miscalibration (MCB), discrimination (DSC), and uncertainty (UNC) components. ...
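The relationships stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the elementary score is normalized with a factor of 2 (so that the threshold 1/2 gives the misclassification rate and the area under the Murphy curve gives the mean Brier score), and it assumes the decomposition takes the form S = MCB - DSC + UNC with the Brier score as the scoring rule. The function names and the toy data are hypothetical.

```python
import numpy as np


def murphy_curve(x, y, thetas):
    """Mean elementary score per threshold (factor-2 normalization is an assumption)."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=int)[:, None]
    t = np.asarray(thetas, dtype=float)[None, :]
    false_alarm = (y == 0) & (x > t)   # forecast exceeds threshold, no event
    miss = (y == 1) & (x <= t)         # event occurs, forecast at or below threshold
    return (2 * t * false_alarm + 2 * (1 - t) * miss).mean(axis=0)


def brier(x, y):
    """Mean Brier score of forecasts x for binary outcomes y."""
    return float(np.mean((np.asarray(x, float) - np.asarray(y, float)) ** 2))


def pav_recalibrate(x, y):
    """Isotonic regression of y on x via pool-adjacent-violators (ties in x pooled first)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    _, inv = np.unique(x, return_inverse=True)
    weights = np.bincount(inv).astype(float)
    means = np.bincount(inv, weights=y) / weights
    blocks = []  # each block: [mean, weight, number of unique x values covered]
    for m, w in zip(means, weights):
        blocks.append([m, w, 1])
        # Merge neighbouring blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    fitted = np.concatenate([np.full(n, m) for m, _, n in blocks])
    return fitted[inv]


def corp_decomposition(x, y):
    """Brier score split into MCB, DSC, UNC, assuming S = MCB - DSC + UNC."""
    x_rc = pav_recalibrate(x, y)               # PAV-recalibrated forecast
    ref = np.full(len(y), np.mean(y))          # constant base-rate forecast
    s, s_rc, s_ref = brier(x, y), brier(x_rc, y), brier(ref, y)
    return {"MCB": s - s_rc, "DSC": s_ref - s_rc, "UNC": s_ref}


# Hypothetical toy data, for illustration only.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0, 0, 1, 0, 1, 0, 1, 1])

n_grid = 10_000
thetas = (np.arange(n_grid) + 0.5) / n_grid    # midpoint grid on (0, 1)
area = murphy_curve(x, y, thetas).mean()       # midpoint-rule integral over [0, 1]
misclassification = murphy_curve(x, y, [0.5])[0]
parts = corp_decomposition(x, y)

print("area under Murphy curve:", area)
print("mean Brier score:       ", brier(x, y))
print("misclassification rate: ", misclassification)
print("MCB, DSC, UNC:          ", parts)
```

Under these assumptions, the printed area and the mean Brier score should agree up to the numerical integration error, and MCB - DSC + UNC should reproduce the mean Brier score.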


Publisher's version
DOI: 10.5445/IR/1000164797
Published on 28.11.2023
Original publication
DOI: 10.1016/j.ijforecast.2023.09.007
Scopus
Citations: 2
Dimensions
Citations: 5
Affiliated institution(s) at KIT: Institut für Stochastik (STOCH)
Publication type: Journal article
Publication month/year: 07.2024
Language: English
Identifier: ISSN 0169-2070, 1872-8200
KITopen-ID: 1000164797
Published in: International Journal of Forecasting
Publisher: Elsevier
Volume: 40
Issue: 3
Pages: 1101–1122
Published online in advance on 04.11.2023
Keywords: Calibration error, Economic utility, Logarithmic score, MCB–DSC plot, Misclassification loss, Proper scoring rule, Score decomposition, Sharpness principle
Indexed in: Scopus, Dimensions