Yet another distributional Bellman equation

Bäuerle, Nicole 1; Göll, Tamara 1; Jaśkiewicz, Anna
1 Institut für Stochastik (STOCH), Karlsruher Institut für Technologie (KIT)

Abstract:

We consider non-standard Markov Decision Processes (MDPs) in which the objective is not simply the expectation of the accumulated reward. Instead, rather general functionals of the joint distribution of terminal state and accumulated reward are to be optimized. For a finite state space and a compact action space, we show how to solve these problems by defining a lifted MDP whose state space is the space of distributions over the true states of the process. We derive a Bellman equation in this setting, which can be regarded as a distributional Bellman equation. Well-known cases such as the standard MDP and quantile MDPs are shown to be special cases of our framework. We also apply our model to a variant of an optimal transport problem.
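As a rough illustration of the idea of working with distributions rather than individual states, the following minimal Python sketch pushes a joint distribution over (state, accumulated reward) forward through a toy MDP and then evaluates two functionals of the result: the expectation and a quantile of the accumulated reward. The toy transition table `P`, the rewards `R`, the decision rule, and all function names are hypothetical and chosen only for illustration; this is not the construction or notation used in the paper.

```python
from collections import defaultdict

# Hypothetical toy MDP (not taken from the paper): 2 states, 2 actions.
# P[s][a] lists (next_state, probability); R[s][a] is the one-step reward.
P = {
    0: {0: [(0, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 2.0, 1: 0.0}}


def step(mu, rule):
    """Push a joint distribution mu over (state, accumulated reward)
    one step forward under a decision rule (state, reward) -> action."""
    nxt = defaultdict(float)
    for (s, acc), p in mu.items():
        a = rule(s, acc)
        for s2, q in P[s][a]:
            nxt[(s2, acc + R[s][a])] += p * q
    return dict(nxt)


def expectation(mu):
    """Expected accumulated reward: the standard MDP criterion."""
    return sum(p * acc for (_, acc), p in mu.items())


def quantile(mu, alpha):
    """Lower alpha-quantile of the accumulated reward."""
    marginal = defaultdict(float)
    for (_, acc), p in mu.items():
        marginal[acc] += p
    cumulative = 0.0
    for acc in sorted(marginal):
        cumulative += marginal[acc]
        if cumulative >= alpha - 1e-12:
            return acc
    return max(marginal)


def rule(s, acc):
    """Hypothetical decision rule: randomize in state 0, stay in state 1."""
    return 1 if s == 0 else 0


# Start in state 0 with zero accumulated reward and run two steps.
mu = {(0, 0.0): 1.0}
for _ in range(2):
    mu = step(mu, rule)

print(mu)
print(expectation(mu), quantile(mu, 0.5), quantile(mu, 0.75))
```

In this sketch the distribution `mu` plays the role of the state of a lifted process: the dynamics act deterministically on `mu`, and different objectives correspond to different functionals applied to the final distribution.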


Publisher's version
DOI: 10.5445/IR/1000191592
Published on 23.03.2026
Original publication
DOI: 10.1016/j.ejor.2026.03.010
Affiliated institution(s) at KIT: Institut für Stochastik (STOCH)
Publication type: Journal article
Publication month/year: 03.2026
Language: English
Identifier: ISSN: 0377-2217
KITopen-ID: 1000191592
Published in: European Journal of Operational Research
Publisher: Elsevier
Keywords: Dynamic programming, Markov decision process, Bellman equation, Non-Markovian process
Indexed in: OpenAlex