Markov Decision Processes

Bäuerle, N.; Rieder, U.

doi:10.1365/s13291-010-0007-2

Abstract:

The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950s. Over the following decades of the last century the theory grew dramatically. It has found applications in various areas such as computer science, engineering, operations research, biology and economics. In this article we give a short introduction to parts of this theory. We treat Markov Decision Processes with finite and infinite time horizon, restricting the presentation to the so-called (generalized) negative case. Solution algorithms such as Howard's policy improvement and linear programming are also explained. Various examples show the application of the theory: we treat stochastic linear-quadratic control problems, bandit problems and dividend pay-out problems.

Full text (KITopen)

DOI: 10.5445/IR/1000032907

Affiliated institution(s) at KIT	Institut für Stochastik (STOCH)
Publication type	Journal article
Publication year	2010
Language	English
Identifier	ISSN: 0012-0456 urn:nbn:de:swb:90-329075 KITopen-ID: 1000032907
Published in	Jahresbericht der Deutschen Mathematiker-Vereinigung (DMV)
Publisher	Springer
Volume	112
Issue	4
Pages	217-243
Keywords	Markov Decision Process, Markov Chain, Bellman Equation, Policy Improvement, Linear Programming
Indexed in	Dimensions, Scopus
