Publisher's version
DOI: 10.5445/IR/1000085954
Published on 24.09.2018

A Maximum-Entropy Method to Estimate Discrete Distributions from Samples Ensuring Nonzero Probabilities

Darscheid, Paul; Guthke, Anneli; Ehret, Uwe

Abstract:
When constructing discrete (binned) distributions from samples of a data set, there are applications in which it is desirable to ensure that all bins of the sample distribution have nonzero probability. This is the case, for example, if the sample distribution is part of a predictive model that must return a response for the entire codomain, or if Kullback–Leibler divergence is used to measure the (dis-)agreement between the sample distribution and the original distribution of the variable, since the divergence is infinite when the sample distribution assigns zero probability to a bin that the original distribution occupies. Several sample-based distribution estimators ensure nonzero bin probability, such as adding one count to each zero-probability bin of the sample histogram, adding a small probability to the sample pdf, smoothing methods such as kernel density smoothing, or Bayesian approaches based on the Dirichlet and multinomial distributions. Here, we suggest and test an approach based on the Clopper–Pearson method, which makes use of the binomial distribution. Based on the sample distribution, confidence intervals for the bin-occupation probability are calculated. The mean of each confidence interval is a str ...
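The interval step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 95% confidence level, the bisection solver, and the final renormalisation of the interval midpoints to sum to one are all assumptions made here, since the abstract shown above is truncated before it states how the midpoints are used.

```python
import math
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, iters=100):
    """Root on [0, 1] of a function that is positive below the root and negative above it."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion.

    k is the number of samples in the bin, n the total sample size.
    """
    # Lower bound: the p at which P(X >= k | p) equals alpha / 2 (0 if k == 0).
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k | p) equals alpha / 2 (1 if k == n).
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


def nonzero_bin_probabilities(counts, alpha=0.05):
    """Hypothetical estimator: midpoint of each bin's interval, renormalised.

    The renormalisation is an assumption of this sketch.
    """
    n = sum(counts)
    midpoints = [sum(clopper_pearson(k, n, alpha)) / 2 for k in counts]
    total = sum(midpoints)
    return [m / total for m in midpoints]


def kl_divergence(p, q):
    """D(p || q) in nats; infinite if q has a zero where p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    counts = [0, 3, 7]  # made-up histogram with one empty bin
    print(nonzero_bin_probabilities(counts))
```

Because the upper Clopper–Pearson bound is strictly positive even for an empty bin, every midpoint is strictly positive, so `kl_divergence` between a reference distribution and this estimate stays finite where it would be infinite for the raw relative frequencies.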


Affiliated institution(s) at KIT Institut für Wasser und Gewässerentwicklung (IWG)
Publication type Journal article
Year 2018
Language English
Identifier ISSN: 1099-4300
URN: urn:nbn:de:swb:90-859547
KITopen ID: 1000085954
Published in Entropy
Volume 20
Issue 8
Pages Article: 601
Publication note Funded by the KIT Publication Fund
Published online in advance on 13.08.2018
Keywords histogram; sample; discrete distribution; empty bin; zero probability; Clopper–Pearson; maximum entropy approach