Robert Stengel
Optimal Control and Estimation, MAE 546
Princeton University, 2012

• Nonlinear systems with random inputs and perfect measurements
• Nonlinear systems with random inputs and imperfect measurements
• Certainty equivalence and separation
• Stochastic neighboring-optimal control
• Linear-quadratic-Gaussian (LQG) control
$$ z(t) = x(t) $$
$$ \dot{x}(t) = f[x(t), u(t), w(t), t] $$
Copyright 2012 by Robert Stengel. All rights reserved. For educational use only. http://www.princeton.edu/~stengel/MAE546.html http://www.princeton.edu/~stengel/OptConEst.html
$$ \dot{x}(t) = f[x(t), u(t), t] + L(t)\,w(t) $$
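$$ \text{2)}\quad E\left[\dot{\lambda}(t)\right] = -E\left\{ \left( \frac{\partial H[x(t), u(t), \lambda(t), t]}{\partial x} \right)^{T} \right\} $$
$$ \text{3)}\quad E\left\{ \frac{\partial H[x(t), u(t), \lambda(t), t]}{\partial u} \right\} = 0 $$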
$$ \dim\left[\mathrm{Tr}(\cdot)\right] = 1 \times 1 $$
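$$ \frac{dV^*}{dt} \approx E\left\{ \frac{\partial V^*}{\partial t} + \frac{\partial V^*}{\partial x}\left( f(\cdot) + Lw(\cdot) \right) + \frac{1}{2}\mathrm{Tr}\left[ \left( f(\cdot) + Lw(\cdot) \right)^T \frac{\partial^2 V^*}{\partial x^2} \left( f(\cdot) + Lw(\cdot) \right) \right] \Delta t \right\} $$
$$ = E\left\{ \frac{\partial V^*}{\partial t} + \frac{\partial V^*}{\partial x}\left( f(\cdot) + Lw(\cdot) \right) + \frac{1}{2}\mathrm{Tr}\left[ \frac{\partial^2 V^*}{\partial x^2} \left( f(\cdot) + Lw(\cdot) \right)\left( f(\cdot) + Lw(\cdot) \right)^T \right] \Delta t \right\} $$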
Uncertain disturbance input can only increase the value function rate of change
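$$ \frac{dV^*}{dt} = \frac{\partial V^*}{\partial t} + \frac{\partial V^*}{\partial x} f(\cdot) + \lim_{\Delta t \to 0} \frac{1}{2}\mathrm{Tr}\left\{ \frac{\partial^2 V^*}{\partial x^2} \left[ E\left( f(\cdot) f(\cdot)^T \right)\Delta t + L\,E\left( w(\cdot) w(\cdot)^T \right) L^T \Delta t \right] \right\} $$
$$ = \frac{\partial V^*}{\partial t}(t) + \frac{\partial V^*}{\partial x}(t)\, f(\cdot) + \frac{1}{2}\mathrm{Tr}\left[ \frac{\partial^2 V^*}{\partial x^2}(t)\, L(t) W(t) L(t)^T \right] $$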
• Control has no effect on the disturbance input
• Criterion for optimality is the same as for the deterministic case
• Disturbance uncertainty increases the magnitude of the total optimal value function, V*(0)
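The three points above can be read off a stochastic Hamilton-Jacobi-Bellman equation. The form below is a sketch under stated assumptions, not a quotation from the slides: measurements are perfect, V* is the expected cost-to-go, and the integrand of the cost is written as a calligraphic L (notation introduced here) to keep it distinct from the disturbance-input matrix L(t).

```latex
-\frac{\partial V^*}{\partial t}
  = \min_{u}\left\{ \mathcal{L}[x(t),u(t),t]
      + \frac{\partial V^*}{\partial x}\, f[x(t),u(t),t] \right\}
    + \frac{1}{2}\,\mathrm{Tr}\!\left[ \frac{\partial^2 V^*}{\partial x^2}\,
        L(t)\,W(t)\,L^T(t) \right]
```

The trace term does not contain u, so it sits outside the minimization: the optimizing control satisfies the same condition as in the deterministic problem, while the disturbance contributes an additive term to the value function.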
$$ I_{MD}(t_1) = \left\{ \hat{x}(t_1), P(t_1), u(t_1) \right\} $$
• Multiple model derived information set
• Parallel estimates of current mean, covariance, and hypothesis probability mass function
$$ I_{MM}(t_1) = \left\{ \left[ \hat{x}_A(t_1), P_A(t_1), u(t_1), \Pr(H_A) \right], \left[ \hat{x}_B(t_1), P_B(t_1), u(t_1), \Pr(H_B) \right], \ldots \right\} $$
$$ I[t_o, t_f] $$
• Separate information set into knowable and predictable parts
$$ E\left[ x(t) \mid I_D \right] = \hat{x}(t) $$
$$ E\left\{ \left[ x(t) - \hat{x}(t) \right]\left[ x(t) - \hat{x}(t) \right]^T \mid I_D \right\} = P(t) $$
... where the conditional expected values are obtained from a Kalman-Bucy filter
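As a rough illustration of how a Kalman-Bucy filter supplies P(t), the sketch below integrates the scalar covariance equation and reports the resulting filter gain. The measurement model z = Hx + n and the measurement-noise spectral density N are assumptions introduced here (they do not appear in the text above), as are the function name and the numerical values.

```python
def kalman_bucy_covariance(F, L, W, H, N, P0, t_end, steps=20000):
    """Euler-integrate the scalar Kalman-Bucy covariance equation
        dP/dt = 2 F P + L W L - (P H)^2 / N
    forward from P(0) = P0 and return (P, K) at t_end, with K = P H / N.
    H and N (measurement matrix and noise density) are assumed symbols."""
    dt = t_end / steps
    P = P0
    for _ in range(steps):
        P_dot = 2.0 * F * P + L * W * L - (P * H) ** 2 / N
        P += P_dot * dt
    K = P * H / N  # filter gain used in d(x_hat)/dt = F x_hat + K (z - H x_hat)
    return P, K


# Hypothetical stable scalar system, long enough horizon to reach steady state.
P_ss, K_ss = kalman_bucy_covariance(
    F=-1.0, L=1.0, W=1.0, H=1.0, N=1.0, P0=1.0, t_end=10.0
)
print(P_ss, K_ss)
```

For these values the steady-state covariance solves -2P + 1 - P^2 = 0, so the integration should settle near sqrt(2) - 1.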
$$ \dots \triangleq J_{CE} + J_{S} $$
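$$ P(t) = E\left\{ \left[ x(t)x^T(t) - \hat{x}(t)x^T(t) - x(t)\hat{x}^T(t) + \hat{x}(t)\hat{x}^T(t) \right] \mid I_D \right\} $$
$$ E\left\{ \left[ \hat{x}(t)x^T(t) \right] \mid I_D \right\} = E\left\{ \left[ x(t)\hat{x}^T(t) \right] \mid I_D \right\} = \hat{x}(t)\hat{x}^T(t) $$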
$$ J_{CE} = \dots, \qquad J_{S} = \dots $$
$$ P(t) = E\left\{ \left[ x(t)x^T(t) \right] \mid I_D \right\} - \hat{x}(t)\hat{x}^T(t) $$
or
$$ E\left\{ \left[ x(t)x^T(t) \right] \mid I_D \right\} = P(t) + \hat{x}(t)\hat{x}^T(t) $$
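The second-moment identity above can be checked numerically. The sketch below is only an illustration: it replaces the conditional expectation with sample averages over hypothetical Gaussian draws, and all names and numbers are invented for the example.

```python
import random


def outer(a, b):
    """Outer product a b^T of two vectors stored as lists."""
    return [[ai * bj for bj in b] for ai in a]


def mean_matrix(mats):
    """Element-wise average of a list of equally sized matrices."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)]
            for i in range(rows)]


random.seed(0)
# Hypothetical two-state samples standing in for x(t) given the information set.
samples = [[random.gauss(1.0, 0.5), random.gauss(-2.0, 1.5)] for _ in range(5000)]
n = len(samples)

x_hat = [sum(s[i] for s in samples) / n for i in range(2)]       # sample mean
second_moment = mean_matrix([outer(s, s) for s in samples])      # E[x x^T]
errors = [[s[0] - x_hat[0], s[1] - x_hat[1]] for s in samples]
P = mean_matrix([outer(e, e) for e in errors])                   # covariance

# Right-hand side of the identity: P + x_hat x_hat^T
rhs = [[P[i][j] + x_hat[i] * x_hat[j] for j in range(2)] for i in range(2)]
max_gap = max(abs(second_moment[i][j] - rhs[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

With sample averages normalized by n, the two sides agree to floating-point round-off, which is what the printed gap measures.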
$$ V^*(t_o) \triangleq J^*(t_f) $$
$$ E(J^*) = E\left( J^* \mid I[t_o, t_1] \right)[1] + E\left( J^* \mid I[t_1, t_f] \right)\Pr\left\{ I[t_1, t_f] \right\} $$
$$ = E\left( J^* \mid I[t_o, t_1] \right) + E\left( J^* \mid I[t_1, t_f] \right)\Pr\left\{ I[t_1, t_f] \right\} $$
Noisy measurements
Closed-loop therapy is robust ... but not robust enough: organ death occurs in one case
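$$ V(t_o) = E\left\{ \phi\left[ x(t_f) \right] - \int_{t_f}^{t_o} L\left[ x(\tau), u(\tau) \right] d\tau \right\} $$
$$ = \frac{1}{2} E\left\{ x^T(t_f) S(t_f) x(t_f) - \int_{t_f}^{t_o} \begin{bmatrix} x^T(t) & u^T(t) \end{bmatrix} \begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt \right\} $$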
$$ V(t) = \frac{1}{2} x^T(t) S(t) x(t) + v(t) $$
$$ V_{CE}(t) \triangleq \frac{1}{2} x^T(t) S(t) x(t) $$
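$$ v(t) = \frac{1}{2} \int_{t}^{t_f} \mathrm{Tr}\left[ S(\tau) L(\tau) W(\tau) L^T(\tau) \right] d\tau $$
$$ \frac{\partial V}{\partial x}(t) = x^T(t) S(t) $$
• Hessian with respect to the state
$$ \frac{\partial^2 V}{\partial x^2}(t) = S(t) $$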
$$ \frac{\partial\left( \partial V / \partial t \right)}{\partial u} = 0 = \left[ x^T M + u^T R + x^T S G \right] $$
$$ u(t) = -R^{-1}(t)\left[ G^T(t) S(t) + M^T(t) \right] x(t) \triangleq -C(t)\, x(t) $$
• Terminal condition
$$ V(t_f) = \frac{1}{2} x^T(t_f) S(t_f) x(t_f) $$
Zero-mean, white-noise disturbance has no effect on the structure and gains of the LQ feedback control law
$$ \dot{v} = -\frac{1}{2} \mathrm{Tr}\left( S L W L^T \right) $$
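The claim that the disturbance leaves the LQ gain untouched can be illustrated with a scalar example. The sketch below integrates the scalar Riccati equation and the v equation backward from the terminal time for two disturbance intensities. It assumes M = 0 and v(t_f) = 0, uses simple Euler steps, and every numerical value and name is hypothetical.

```python
def lq_backward(F, G, Q, R, L, W, S_f, t_f, steps=2000):
    """Integrate backward from t_f to 0 (scalar case, M = 0 assumed):
        dS/dt = -2 F S - Q + (S G)^2 / R,   S(t_f) = S_f
        dv/dt = -(1/2) S L W L,             v(t_f) = 0
    Returns S(0), v(0), and the feedback gain C(0) = G S(0) / R."""
    dt = t_f / steps
    S, v = S_f, 0.0
    for _ in range(steps):
        S_dot = -2.0 * F * S - Q + (S * G) ** 2 / R
        v_dot = -0.5 * S * L * W * L
        S -= S_dot * dt  # stepping backward in time
        v -= v_dot * dt
    C = G * S / R
    return S, v, C


params = dict(F=-1.0, G=1.0, Q=1.0, R=1.0, L=1.0, S_f=0.0, t_f=5.0)
S_quiet, v_quiet, C_quiet = lq_backward(W=0.1, **params)
S_noisy, v_noisy, C_noisy = lq_backward(W=1.0, **params)
print(C_quiet, C_noisy)   # gains are identical
print(v_quiet, v_noisy)   # cost increment scales with W
```

W enters only the v equation, so S(t) and C(t) are the same in both runs, while v(0) grows in proportion to W.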
$$ J = \frac{1}{2} \mathrm{Tr}\left( E\left[ x^T(t_f) S(t_f) x(t_f) \right] + E\left\{ \int_{t_o}^{t_f} \begin{bmatrix} x^T(t) & u^T(t) \end{bmatrix} \begin{bmatrix} Q(t) & 0 \\ 0 & R(t) \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt \right\} \right) $$
$$ = \frac{1}{2} \mathrm{Tr}\left\{ S(t_f) E\left[ x(t_f) x^T(t_f) \right] + \int_{t_o}^{t_f} \left( Q(t) E\left[ x(t) x^T(t) \right] + R(t) E\left[ u(t) u^T(t) \right] \right) dt \right\} $$
$$ u(t) = -C(t)\, x(t) $$
• Optimal control covariance
$$ U(t) = C(t) P(t) C^T(t) = R^{-1}(t) G^T(t) S(t) P(t) S(t) G(t) R^{-1}(t) $$
where
$$ \dot{P}(t) = F(t) P(t) + P(t) F^T(t) + L(t) W(t) L^T(t), \quad P(t_o) \text{ given} $$
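To make the covariance equations concrete, the sketch below propagates the state covariance for a hypothetical two-state system with Euler steps and then forms the control covariance from a given gain. The matrices F, L, W and the gain C are made-up values; F is taken to be whatever stable matrix governs the state (with feedback in the loop that would be the closed-loop matrix, which is an assumption of this example rather than something stated above).

```python
def mat_mul(A, B):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def propagate_covariance(F, L, W, P0, t_end, steps=10000):
    """Euler-integrate dP/dt = F P + P F^T + L W L^T from P(0) = P0."""
    dt = t_end / steps
    noise = mat_mul(mat_mul(L, W), transpose(L))
    Ft = transpose(F)
    P = [row[:] for row in P0]
    for _ in range(steps):
        P_dot = mat_add(mat_add(mat_mul(F, P), mat_mul(P, Ft)), noise)
        P = [[p + d * dt for p, d in zip(rp, rd)] for rp, rd in zip(P, P_dot)]
    return P


F = [[0.0, 1.0], [-2.0, -3.0]]   # hypothetical stable dynamics
L = [[0.0], [1.0]]               # disturbance enters the second state
W = [[1.0]]                      # disturbance spectral density
P = propagate_covariance(F, L, W, [[0.0, 0.0], [0.0, 0.0]], t_end=10.0)

C = [[1.0, 0.5]]                 # hypothetical feedback gain
U = mat_mul(mat_mul(C, P), transpose(C))   # control covariance C P C^T
print(P, U)
```

For this choice of F, L and W the steady-state covariance is diag(1/12, 1/6), so the control covariance for the assumed gain approaches 0.125.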
• Adjoint covariance response to terminal cost
$$ \dot{S}(t) = -F^T(t) S(t) - S(t) F(t) - Q(t), \quad S(t_f) \text{ given} $$
$$ J_{\text{no control}} = \frac{1}{2} \mathrm{Tr}\left[ S(t_o) P(t_o) + \int_{t_o}^{t_f} S(t) L(t) W(t) L^T(t)\, dt \right] $$
Dependent on S(t)
• With optimal control, the equation for the cost is the same
$$ J_{\text{optimal control}} = \frac{1}{2} \mathrm{Tr}\left[ S(t_o) P(t_o) + \int_{t_o}^{t_f} S(t) L(t) W(t) L^T(t)\, dt \right] $$
$$ \dot{S}(t) = -F^T(t) S(t) - S(t) F(t) - Q(t) + S(t) G(t) R^{-1}(t) G^T(t) S(t) $$
Independent of P(t)
• ... but evolutions of S(t) and S(t_o) are different in each case
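A scalar example shows how the same cost formula yields different numbers once S(t) follows a different equation. This is a sketch only: it uses Euler integration, assumes a zero-mean initial state so that E[x x^T] at t_o equals P(t_o), and all values and names are hypothetical.

```python
def expected_cost(F, G, Q, R, L, W, S_f, P_0, t_f, use_control, steps=4000):
    """Evaluate J = (1/2) [ S(0) P(0) + integral of S L W L dt ] for a scalar
    system. S(t) is integrated backward from S(t_f) = S_f using
        dS/dt = -2 F S - Q                  (no control)
        dS/dt = -2 F S - Q + (S G)^2 / R    (optimal control)"""
    dt = t_f / steps
    S = S_f
    integral = 0.0
    for _ in range(steps):
        integral += S * L * W * L * dt
        S_dot = -2.0 * F * S - Q
        if use_control:
            S_dot += (S * G) ** 2 / R
        S -= S_dot * dt  # stepping backward in time
    return 0.5 * (S * P_0 + integral)


params = dict(F=-1.0, G=1.0, Q=1.0, R=1.0, L=1.0, W=0.2,
              S_f=0.0, P_0=1.0, t_f=5.0)
J_no_control = expected_cost(use_control=False, **params)
J_optimal = expected_cost(use_control=True, **params)
print(J_no_control, J_optimal)
```

Only the S equation differs between the two calls; the quadratic term pulls S(t) down, so the optimal-control cost comes out lower for this example.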
Supplemental Material
Dual Control
(Fel'dbaum, 1965)
• Nonlinear system
• Uncertain system parameters to be estimated
• Parameter estimation can be aided by test inputs
$$ u(t) = c\left[ x(t), a, y^*(t) \right] $$
On-line adaptive critic controller
• Nonlinear control law (action network)
• Criticizes non-optimal performance via critic network
• Adapts control gains to improve performance
• Adapts cost model to improve estimate
Critic adapts neural network weights to improve performance using approximate dynamic programming
[Figure: two block diagrams with inputs x_a(t) and a(t). Blocks: NNA, Aircraft Model, Transition Matrices, State Prediction, NNC, NNC Target, Target Generation]
$$ \frac{\partial V\left[ x_a(t) \right]}{\partial x_a(t)} = NNC\left[ x_a(t), a(t) \right] $$