
Finite-Horizon Markov Decision Processes

Dan Zhang, Leeds School of Business, University of Colorado at Boulder

Dan Zhang, Spring 2012


Outline

Expected total reward criterion
Optimality equations and the principle of optimality
Optimality of deterministic Markov policies
Backward induction
Applications


Expected Total Reward Criterion


Let $\pi$ be a randomized history-dependent policy, i.e., $\pi \in \Pi^{HR}$;
$\pi = (d_1, \ldots, d_{N-1})$, where $d_t : H_t \to \mathcal{P}(A)$.

Starting at a state $s$, using policy $\pi$ leads to a sequence of state-action pairs $\{(X_t, Y_t)\}$. The sequence of rewards is given by $\{R_t = r_t(X_t, Y_t) : t = 1, \ldots, N-1\}$, with terminal reward $R_N = r_N(X_N)$. The expected total reward of policy $\pi$ starting in state $s$ is given by
\[
v_N^{\pi}(s) = E_s^{\pi}\left[ \sum_{t=1}^{N-1} r_t(X_t, Y_t) + r_N(X_N) \right].
\]
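
To make this definition concrete, here is a minimal Python sketch that estimates $v_N^{\pi}(s)$ by Monte Carlo simulation under a fixed deterministic Markov policy. The horizon, rewards, transition probabilities, and decision rule below are hypothetical placeholders, not data from these slides.

```python
import random

# Hypothetical two-state example with horizon N = 3; none of these numbers
# come from the slides -- they only illustrate the definition of v_N^pi(s).
N = 3
S = [0, 1]

def r(t, s, a):                 # reward r_t(s, a) for t < N (placeholder values)
    return 1.0 if (s, a) == (0, 1) else 0.5

def r_terminal(s):              # terminal reward r_N(s)
    return 0.0

def p(t, s, a):                 # transition distribution p_t(. | s, a)
    return {0: 0.7, 1: 0.3} if a == 0 else {0: 0.4, 1: 0.6}

def policy(t, s):               # a deterministic Markov decision rule d_t(s)
    return 0 if s == 0 else 1

def estimate_value(s0, n_samples=50_000):
    """Monte Carlo estimate of v_N^pi(s0) = E[sum_{t<N} r_t(X_t, Y_t) + r_N(X_N)]."""
    total = 0.0
    for _ in range(n_samples):
        s, reward = s0, 0.0
        for t in range(1, N):                       # decision epochs 1, ..., N-1
            a = policy(t, s)
            reward += r(t, s, a)
            dist = p(t, s, a)
            s = random.choices(list(dist), weights=list(dist.values()))[0]
        reward += r_terminal(s)                     # terminal reward at epoch N
        total += reward
    return total / n_samples

print(estimate_value(0))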


Optimal Policy

A policy $\pi^*$ is an optimal policy if
\[
v_N^{\pi^*}(s) \geq v_N^{\pi}(s), \quad \forall s \in S, \; \pi \in \Pi^{HR}.
\]

The value of a Markov decision problem is defined by
\[
v_N^*(s) = \sup_{\pi \in \Pi^{HR}} v_N^{\pi}(s), \quad s \in S.
\]
We have $v_N^{\pi^*}(s) = v_N^*(s)$ for all $s \in S$.


Finite-Horizon Policy Evaluation

Let $\pi \in \Pi^{HR}$ be a randomized history-dependent policy.

Let $u_t^{\pi} : H_t \to \mathbb{R}$ be the total expected reward obtained by using policy $\pi$ at decision epochs $t, t+1, \ldots, N-1$.

Given $h_t \in H_t$ for $t < N$, let
\[
u_t^{\pi}(h_t) = E_{h_t}^{\pi}\left[ \sum_{n=t}^{N-1} r_n(X_n, Y_n) + r_N(X_N) \right].
\]

Furthermore, let $u_N^{\pi}(h_N) = r_N(s_N)$ for $h_N = (h_{N-1}, a_{N-1}, s_N)$. For a given initial state $s$, we have $u_1^{\pi}(s) = v_N^{\pi}(s)$.


The Finite-Horizon Policy Evaluation Algorithm

Assume $\pi \in \Pi^{HD}$.

1. Set $t = N$ and $u_N^{\pi}(h_N) = r_N(s_N)$ for all $h_N = (h_{N-1}, a_{N-1}, s_N) \in H_N$.

2. If $t = 1$, stop; otherwise go to step 3.

3. Substitute $t - 1$ for $t$ and compute $u_t^{\pi}(h_t)$ for each $h_t = (h_{t-1}, a_{t-1}, s_t) \in H_t$ by
\[
u_t^{\pi}(h_t) = r_t(s_t, d_t(h_t)) + \sum_{j \in S} p_t(j \mid s_t, d_t(h_t)) \, u_{t+1}^{\pi}(h_t, d_t(h_t), j).
\]
Return to step 2.
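
Below is a minimal Python sketch of this evaluation, written as a recursion over histories rather than the explicit backward loop above, but computing the same quantities $u_t^{\pi}(h_t)$. The horizon, rewards, transitions, and decision rule are hypothetical placeholders.

```python
# Sketch of finite-horizon policy evaluation for a deterministic
# history-dependent policy pi.  All problem data are placeholders.
N = 4                           # decision epochs 1, ..., N-1; epoch N is terminal
S = [0, 1]

def r(t, s, a):                 # r_t(s, a), t < N
    return float(s == a)

def r_terminal(s):              # r_N(s)
    return 0.0

def p(t, j, s, a):              # p_t(j | s, a)
    return 0.8 if j == s else 0.2

def d(t, h):                    # deterministic rule d_t(h_t); here it only uses
    return h[-1]                # the current state, but any function of h works

def u(t, h):
    """Total expected reward u_t^pi(h_t) from epoch t onward; h ends in s_t."""
    s = h[-1]
    if t == N:
        return r_terminal(s)
    a = d(t, h)
    return r(t, s, a) + sum(p(t, j, s, a) * u(t + 1, h + (a, j)) for j in S)

# u_1^pi(s) equals v_N^pi(s) for each initial state s.
for s in S:
    print(s, u(1, (s,)))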


The Principle of Optimality

Let $u_t^*(h_t) = \sup_{\pi \in \Pi^{HR}} u_t^{\pi}(h_t)$.

Consider the following optimality equations:
\[
u_t(h_t) = \sup_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) \, u_{t+1}(h_t, a, j) \right\},
\]
for $t = 1, \ldots, N-1$ and $h_t = (h_{t-1}, a_{t-1}, s_t) \in H_t$, and
\[
u_N(h_N) = r_N(s_N), \quad h_N = (h_{N-1}, a_{N-1}, s_N) \in H_N.
\]


The Principle of Optimality

Theorem. Suppose $u_t$ is a solution to the optimality equations for all $t$. Then

(a) $u_t(h_t) = u_t^*(h_t)$ for all $h_t \in H_t$, $t = 1, \ldots, N$;
(b) $u_1(s_1) = v_N^*(s_1)$ for all $s_1 \in S$.


Optimality of Deterministic Markov Policies


Theorem. Let $u_t^*$ be a solution to the optimality equations for all $t$. Then

(a) For each $t = 1, \ldots, N$, $u_t^*(h_t)$ depends on $h_t$ only through $s_t$;

(b) If there exists an $a^* \in A_{s_t}$ such that
\[
r_t(s_t, a^*) + \sum_{j \in S} p_t(j \mid s_t, a^*) \, u_{t+1}^*(h_t, a^*, j)
= \sup_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) \, u_{t+1}^*(h_t, a, j) \right\}
\]
for each $s_t \in S$ and $t = 1, \ldots, N-1$, then there exists an optimal policy which is deterministic and Markovian.


Backward Induction
1. Set $t = N$ and $u_N^*(s_N) = r_N(s_N)$ for all $s_N \in S$.

2. Substitute $t - 1$ for $t$ and compute $u_t^*(s_t)$ for each $s_t \in S$ by
\[
u_t^*(s_t) = \max_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) \, u_{t+1}^*(j) \right\}.
\]
Set
\[
A_{s_t, t}^* = \arg\max_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) \, u_{t+1}^*(j) \right\}.
\]

3. If $t = 1$, stop; otherwise go to step 2.
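
A compact Python sketch of this backward-induction recursion on a generic finite-horizon MDP follows; it returns both $u_t^*$ and the optimal action sets $A_{s_t,t}^*$. The small instance at the bottom uses made-up placeholder numbers purely for illustration.

```python
# Backward induction for a finite-horizon MDP.
# Inputs: horizon N, states S, action sets A[s], rewards r[t][s][a],
# terminal rewards r_term[s], transition probabilities p[t][s][a][j].
# All numbers in the example instance are placeholders for illustration only.

def backward_induction(N, S, A, r, r_term, p):
    u = {N: {s: r_term[s] for s in S}}           # u_N^*(s) = r_N(s)
    best = {}                                    # optimal action sets A*_{s,t}
    for t in range(N - 1, 0, -1):                # t = N-1, ..., 1
        u[t], best[t] = {}, {}
        for s in S:
            q = {a: r[t][s][a] + sum(p[t][s][a][j] * u[t + 1][j] for j in S)
                 for a in A[s]}
            u[t][s] = max(q.values())
            best[t][s] = [a for a in A[s] if q[a] == u[t][s]]
    return u, best

# A tiny hypothetical instance: 2 states, 2 actions, horizon N = 3.
N = 3
S = [0, 1]
A = {0: [0, 1], 1: [0, 1]}
r = {t: {0: {0: 0.0, 1: 1.0}, 1: {0: 0.5, 1: 0.0}} for t in range(1, N)}
r_term = {0: 0.0, 1: 2.0}
p = {t: {s: {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}} for s in S}
     for t in range(1, N)}

u, best = backward_induction(N, S, A, r, r_term, p)
print(u[1], best[1])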


e-Rite-Way: An MDP Formulation

Decision epochs: $T = \{1, 2, 3, 4, 5\}$.
States: $S = \{1, 2\}$.
Actions: $A_s = \{0, 1, 2\}$:
  0: Do nothing
  1: Gift and minor price promotion
  2: Gift and major price promotion
Expected rewards: $r_t(s, a)$ (see handout).
Terminal rewards: $r_N(s) = 0$.
Transition probabilities: $p_t(i \mid s, a) = p_{si}^a$, $i = 1, 2$.
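
As an illustration of how this formulation plugs into backward induction, the sketch below lays out the e-Rite-Way data structures and solves them. Since the handout values for $r_t(s, a)$ and $p_{si}^a$ are not reproduced here, every number in the sketch is a hypothetical placeholder, not the actual problem data.

```python
# e-Rite-Way laid out as a finite-horizon MDP and solved by backward induction.
# Every reward and transition number below is a hypothetical placeholder,
# NOT the value from the handout.
N = 5                                          # T = {1, ..., 5}; decisions at t = 1, ..., 4
S = [1, 2]
A = {s: [0, 1, 2] for s in S}                  # 0: do nothing, 1: minor promotion, 2: major promotion

r = {t: {1: {0: 4.0, 1: 3.5, 2: 3.0},          # placeholder r_t(s, a)
         2: {0: 1.0, 1: 0.8, 2: 0.5}} for t in range(1, N)}
r_term = {1: 0.0, 2: 0.0}                      # r_N(s) = 0

p = {t: {1: {0: {1: 0.7, 2: 0.3}, 1: {1: 0.8, 2: 0.2}, 2: {1: 0.9, 2: 0.1}},
         2: {0: {1: 0.2, 2: 0.8}, 1: {1: 0.4, 2: 0.6}, 2: {1: 0.6, 2: 0.4}}}
     for t in range(1, N)}                     # placeholder p_t(i | s, a) = p^a_{si}

# Backward induction (same recursion as in the previous sketch, inlined here).
u = {N: dict(r_term)}
decision_rule = {}
for t in range(N - 1, 0, -1):
    u[t], decision_rule[t] = {}, {}
    for s in S:
        q = {a: r[t][s][a] + sum(p[t][s][a][j] * u[t + 1][j] for j in S)
             for a in A[s]}
        decision_rule[t][s] = max(q, key=q.get)
        u[t][s] = q[decision_rule[t][s]]

print("optimal values u_1^*:", u[1])
print("optimal decision rules:", decision_rule)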
