Lecture Notes
February 9, 2016
Preliminary Edition
Preamble

These notes have been prepared for a basic course in optimal control at the 8th term of studies in Electronics and Information Technology, Aalborg University. The students are expected to be acquainted with classical feedback control theory.
Key Words

Optimal control, Riccati equation, Linear Quadratic Gaussian, Kalman observer, stability.
Optimal Control
Contents

1 Introduction
2 Dynamic Programming
3 Time varying LQ-Control
4 Stationary LQ-Controllers
5 References and disturbances
6 Control of a System with a Reference Model
...
Summary of LQ method for stochastic, discrete time systems with complete state information
9.2 Separation theorem
9.3 Summary of LQ method for stochastic, discrete time systems with incomplete state information
9.5 Innovation model
Chapter 1
Introduction
In this course we will explore a method for design of controllers in which the goals of the control are formulated using a performance function which quantifies objectives. Typical goals can be to keep the plant state or output close to a reference by the use of as small control signals as possible. In the performance (or cost) function such goals are quantified by defining a cost for deviations from the setpoint and a cost for the use of the control signal. Having defined the performance function, the best control signal over time may be found by minimizing this performance function. In this way a control law, relating the control signal from the controller to signals measured in the plant, may be synthesized from the performance function.
The systems we will consider will primarily be linear systems described in state space. The theory will primarily be developed for discrete time control, although we will also give formulas for the continuous time case.
Most of you are familiar with methods for design of controllers based on specification of desired closed loop poles. You may even have experienced that it can be a difficult task to specify a sensible set of desired poles. For instance, if you specify a set of poles it may turn out that the resulting controller gives control signals which exceed the physical bounds even for reasonable disturbances and reference signals.
If the plant you are going to control has multiple inputs and outputs you will also discover that specification of closed loop poles is not sufficient to determine the controller coefficients. If, for instance, you want to control a plant described by

x(k + 1) = Φx(k) + Γu(k)    (1.1)
y(k) = Hx(k)    (1.2)
In case the plant has two control inputs, two outputs and four states:

x(k) = [x1(k); x2(k); x3(k); x4(k)],  u(k) = [u1(k); u2(k)],  y(k) = [y1(k); y2(k)]    (1.3)

a state feedback control law has the form

u(k) = −Lx(k)    (1.4)

with

L = [l11 l12 l13 l14; l21 l22 l23 l24]    (1.5)

This feedback law results in a system which in closed loop can be described by the equations

x(k + 1) = (Φ − ΓL)x(k)    (1.6)
y(k) = Hx(k)    (1.7)
The closed loop poles of this system are the eigenvalues of the matrix (Φ − ΓL). The closed loop poles can be placed at the desired locations if (Φ, Γ) is controllable. Controllability of a discrete time system can be investigated by studying where you can steer the state vector of the system using control signals at n successive sample instants. This results in

x(k + 1) = Φx(k) + Γu(k)    (1.8)
x(k + 2) = Φx(k + 1) + Γu(k + 1) = Φ²x(k) + ΦΓu(k) + Γu(k + 1)    (1.9)
⋮    (1.10)
x(k + n) = Φⁿx(k) + [Γ, ΦΓ, Φ²Γ, …, Φⁿ⁻¹Γ] [u(k + n − 1); u(k + n − 2); …; u(k)]    (1.11), (1.12)
The plant is controllable if you can reach the full state space. We define the controllability matrix

C = [Γ, ΦΓ, Φ²Γ, …, Φⁿ⁻¹Γ]    (1.13)

If C has full rank, i.e. the rank is equal to the order n of the system, the system is controllable.
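As a quick numerical check, the rank test can be carried out directly. The following is a minimal sketch; the 4-state, 2-input plant below is an illustrative assumption, not a system from the notes:

```python
import numpy as np

# Hypothetical 4-state, 2-input plant (illustrative values only).
Phi = np.array([[0.9, 0.1, 0.0, 0.0],
                [0.0, 0.8, 0.1, 0.0],
                [0.0, 0.0, 0.7, 0.1],
                [0.0, 0.0, 0.0, 0.6]])
Gam = np.array([[1.0, 0.0],
                [0.0, 0.0],
                [0.0, 0.0],
                [0.0, 1.0]])

n = Phi.shape[0]
# Controllability matrix C = [Gam, Phi Gam, Phi^2 Gam, ..., Phi^(n-1) Gam]
blocks = [np.linalg.matrix_power(Phi, i) @ Gam for i in range(n)]
C = np.hstack(blocks)

rank = np.linalg.matrix_rank(C)
print(rank == n)  # full rank -> the pair (Phi, Gam) is controllable
```

Here the second input drives x4, which couples through the chain x4 → x3 → x2 → x1, so the pair is controllable and the rank equals n.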
Specification of the n = 4 closed loop poles of the system above is, for instance, not sufficient to determine the p · n = 2 · 4 = 8 coefficients of the state feedback matrix L of a plant with p = 2 inputs and n = 4 state variables. The closed loop poles can be placed at the desired locations for a multitude of L values if (Φ, Γ) is controllable.
Optimal control offers design methods which, instead of focusing on pole locations, use optimization to minimize a performance function which ideally may be related directly to the goals of the control task. This setup can be used for MIMO as well as SISO systems. These methods can be used for general, also nonlinear, systems described by

x(k + 1) = G(x(k), u(k))    (1.14)

If the system has n state variables and p inputs, x(k) will be an n-dimensional vector and u(k) will be a p-dimensional vector; G is a vector valued function of dimension n.
In the first setting it is supposed that

I = Σ_{k=0}^{N} H(x(k), u(k))    (1.15)

where k is the sampling number. In these notes optimal control problems are formulated as minimization problems, such that I is a quantity we want to minimize (it is of course also possible to formulate performance functions which should be maximized). The problem is now to find a sequence of control signals u(k), k = 0, 1, 2, …, N, which minimizes I with x(k) determined by the state equation.
It is in the form and parameters of H that the weighting of large control signals versus large states is determined. For the problem to be tractable you should usually take care that H(x(k), u(k)) is a convex function of x(k) and u(k). Often the goal will be to bring the state quickly from an initial value to the origin with as small an amount of control effort as possible.

In this way the design of a control law can be reformulated as the choice of a suitable performance function followed by the solution of an optimization problem.
In the choice of performance we have several degrees of freedom:

1. The structure of H. In this course we will only consider H as a quadratic function in u(k) and x(k). In principle several others could be thought of; some of these are being explored in current research.

2. The weighting between x(k) and u(k). It must be determined how you wish to weight large states versus large control signals, which may be seen as good control with small states versus cheap control with small control effort. Through careful weighting it is also possible to allow the use of certain elements in u more than others, and to allow deviations in some state variables more than others.
3. The choice of N. The time horizon N will determine the weighting of the long term steady state performance (large N) versus the short term dynamic performance (small N).
The choice of performance function will be crucial to the behavior of the controlled system, and it is important to see the minimization of the performance function as a tool to obtain a good controller. There is no controller which is optimal in any absolute sense. A controller which is optimal for one performance function will not be optimal for another. So with optimization the designer will have to tune the controller by tuning the parameters of a performance function, instead of tuning the controller parameters directly or tuning the positions of the closed loop poles.
At first we will consider the problem of finding a control sequence u(k), k = 0, 1, 2, …, N, which minimizes a given performance function with an initial value of the state x(0). This can be seen as an open loop problem. After having solved this problem we will successively expand the method to be used in more practical closed loop problems, see figure 1.1.
Figure 1.1: Open loop optimal control. The performance function is used to generate the control input to the state space dynamics (with disturbances d1 and d2); the state x passes through a static output relation.
Process model (discrete time)

x(k + 1) = Φx(k) + Γu(k)    (1.16)
y(k) = Hx(k)    (1.17)

Performance function (discrete time)

I = Σ_{k=0}^{N} (xᵀ(k)Q1x(k) + uᵀ(k)Q2u(k))    (1.18)

Control law

u(k) = −L(k)x(k)    (1.19)
The linear controller which minimizes a quadratic performance function is often called Linear Quadratic control (LQ) or Linear Quadratic Regulator (LQR). If the process is stochastic and the states are not directly measurable, the states may be estimated using an observer designed to minimize the estimation error, that is, a Kalman filter. With this observer the controller is called Linear Quadratic Gaussian (LQG).
Figure 1.2: Closed loop optimal control with a state estimator. The estimated state from the state estimator replaces the measured state in the performance function based controller (disturbances d1 and d2; static output relation).
Chapter 2
Dynamic Programming
Dynamic programming is a principle which breaks complex decisions into a series of simpler decisions. We will use this idea to find a sequence of control inputs which minimize the performance function.
The optimization takes advantage of the fact that a control strategy which is optimal in the interval of samples [0; N] must also be optimal in any interval [k; N] with 0 ≤ k ≤ N. This is true since, if it was possible to improve the performance in the interval [k; N], this would also improve the performance in the entire interval [0; N].
Based on this we will split the optimization up. We will introduce the notation

I0N = Σ_{k=0}^{N} H(x(k), u(k)) = I0N(x(0), u(0), u(1), …, u(N))    (2.1)

indicating that the performance depends on the initial state vector and the sequence of control signals. This is sensible since x(1) is determined from x(0) and u(0), etc.
We will also use the notation

J0N(x(0)) = min_{u(0),…,u(N)} I0N    (2.2)

for the obtainable minimum of the performance function. We will also consider the contribution to the performance function from a part of the interval:

IkN = Σ_{i=k}^{N} H(x(i), u(i))    (2.3)
We will now determine the minimal performance contribution from the last part of the interval:

JkN(x(k)) = min_{u(k),…,u(N)} Σ_{i=k}^{N} H(x(i), u(i))
          = min_{u(k)} [H(x(k), u(k)) + min_{u(k+1),…,u(N)} Σ_{i=k+1}^{N} H(x(i), u(i))]
          = min_{u(k)} [H(x(k), u(k)) + Jk+1N(x(k + 1))]    (2.4)
Notice that in the last expression x(k + 1) will depend on u(k) through the plant equation. This may be summarized in the following algorithm:

STEP i: JN−iN(x(N − i)) = min_{u(N−i)} [H(x(N − i), u(N − i)) + JN−i+1N(x(N − i + 1))]    (2.5)

We will call the minimizing control signal u*(N − i).

Example: consider the first order system

x(k + 1) = ax(k) + bu(k)    (2.6)

with the performance function

I = Σ_{k=0}^{N} (x²(k) + qu²(k))    (2.7)

We chose N = 2:

I = Σ_{k=0}^{2} (x²(k) + qu²(k))    (2.8)
STEP 0:

u*(2) = 0    (2.9)

u*(2) will be set to zero since it does not influence any x in the performance; therefore any nonzero value will only increase the performance function. Thus

J22(x(2)) = x²(2)    (2.10)

STEP 1:

J12(x(1)) = min_{u(1)} [x²(1) + qu²(1) + J22(ax(1) + bu(1))]

dJ12(x(1))/du(1) = 2qu(1) + 2(ax(1) + bu(1))b = 0    (2.11)
u*(1) = −[ab/(q + b²)] x(1)    (2.12)

J12(x(1)) = (1 + q a²/(q + b²)) x²(1)    (2.13)
STEP 2:

J02(x(0)) = min_{u(0)} [x²(0) + qu²(0) + (1 + q a²/(q + b²))(ax(0) + bu(0))²]    (2.14)

u*(0) = −[ab(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))] x(0)    (2.15)

J02(x(0)) = (1 + q a²(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))) x²(0)    (2.16)
Summarizing, the optimal input sequence is:

u*(0) = −[ab(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))] x(0)    (2.17)

u*(1) = −[ab/(q + b²)] x(1)    (2.18)
      = −[ab/(q + b²)] (a − b · ab(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))) x(0)    (2.19), (2.20)

u*(2) = 0    (2.21)
The optimization results may be interpreted and applied in two ways:

1. With a known initial state we can calculate in advance the entire input sequence u*(0), u*(1), …, u*(N) and apply it in open loop to bring the plant from the initial state to zero.

2. Use the calculated gain L(k) (in general a matrix, in this simple example a scalar) and the measured state x to calculate

u*(k) = −L(k)x(k)    (2.22)

This is closed loop control with a time varying gain (dependent on the sample number k), still with the purpose of bringing the plant from an initial state to zero. This closed loop control is preferable because open loop control is vulnerable to disturbances and uncertainties in the model parameters.

The control is still very specialized because its only purpose is to bring the plant from an initial state to zero. Later we will see that the controller is extendable also to more realistic cases where reference and disturbance signals are present, and where not all states are measurable.
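The backward recursion of this example is easy to verify numerically. A minimal sketch (with assumed values of a, b and q, which are not values from the notes) compares a direct scalar recursion against the closed form gains derived above:

```python
# Backward dynamic programming for the scalar example
# x(k+1) = a x(k) + b u(k), I = sum_{k=0}^{2} (x(k)^2 + q u(k)^2).
a, b, q = 0.9, 0.5, 0.1

# Scalar recursion: S(2) = 1, since only x(2)^2 remains at the end.
S = 1.0
gains = {}
for k in (1, 0):
    L = a * b * S / (q + b**2 * S)   # gain in u*(k) = -L(k) x(k)
    S = 1.0 + a * S * (a - b * L)    # cost-to-go coefficient S(k)
    gains[k] = L

# Closed-form gains from the worked example above.
L1 = a * b / (q + b**2)
S1 = 1 + q * a**2 / (q + b**2)
L0 = a * b * S1 / (q + b**2 * S1)

print(abs(gains[1] - L1) < 1e-9 and abs(gains[0] - L0) < 1e-9)
```

The recursion reproduces u*(1) = −L1·x(1) and u*(0) = −L0·x(0) exactly, illustrating that the dynamic programming minimization and the gain formulas agree.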
Chapter 3
Time varying LQ-Control
We will suppose that the system is deterministic. We may not need to require that the system is fully controllable, but as a minimum the unstable modes of the system must be controllable using state feedback, that is, the system must be state feedback stabilizable. At this point we will also suppose that all states can be measured without error and that the input signal u(k) is unlimited.

The performance function is chosen to be a quadratic function of x(k) and u(k):

I = Σ_{k=0}^{N−1} (xᵀ(k)Q1x(k) + uᵀ(k)Q2u(k)) + xᵀ(N)QNx(N)
In the terms introduced in the general form of the performance function we have chosen:

H(k) = xᵀ(k)Q1x(k) + uᵀ(k)Q2u(k) for 0 ≤ k ≤ N − 1
H(N) = xᵀ(N)QNx(N) for k = N
The matrices QN, Q1 and Q2 are quadratic and have the dimensions (n × n), (n × n) and (p × p). The matrices may be interpreted as weight matrices punishing large values of the final state, the current state and the input.

We will suppose that QN and Q1 are positive semidefinite and that Q2 is positive definite. The matrix Q2 is positive definite if the scalar uᵀQ2u is positive for all values u ≠ 0. This implies that nonzero control signals will give a positive contribution to the performance function. Q1 is positive semidefinite if xᵀQ1x is positive or zero for all x. This implies that we will allow some states, or linear combinations of states, to give zero contribution to the performance function.
The three Q-matrices are further symmetric and will therefore be equal to their own transposes.

Note that with our choice of performance function we have the possibility to give special attention to the final state using the matrix QN. It is obvious that you will usually want the final state to be as close to the desired value (for now a vector of zeros) as possible. If we are not specially interested in the final state, we can put QN = Q1 and obtain

I = Σ_{k=0}^{N} (xᵀ(k)Q1x(k) + uᵀ(k)Q2u(k)),    u(N) = 0
For a simple first order system we have earlier calculated an optimal input sequence. This turned out to give inputs u(k) which for each step were proportional to the current state x(k): a linear but time varying feedback control law. Further, the contribution to the performance function from the samples k…N turned out to be a quadratic function of the current state x(k). We will now try to generalize this experience to an n-dimensional system, that is, we assume:

u(k) = −L(k)x(k)
JkN(x(k)) = xᵀ(k)S(k)x(k)
We will seek expressions for the matrices L(k) and S(k), where L(k) has the dimension (p × n) and S(k) has the dimension (n × n). We start with the general dynamic programming expression from earlier,

JkN(x(k)) = min_{u(k)} [H(x(k), u(k)) + Jk+1N(x(k + 1))]

for the control signal u*(k) which will give us the minimal performance JkN(x(k)):

JkN(x(k)) = min_{u(k)} [xᵀ(k)Q1x(k) + uᵀ(k)Q2u(k) + xᵀ(k + 1)S(k + 1)x(k + 1)]
To keep the expressions at a reasonable size we relax the notation and leave out the argument k for the vectors x and u (but of course not for the matrix S, which is in focus now!). To find the optimal control signal u at the time (sample number) k, the performance function is differentiated with respect to u:
∂JkN(x)/∂u = 2Q2u + 2ΓᵀS(k + 1)(Φx + Γu) = 0

If [Q2 + ΓᵀS(k + 1)Γ] is invertible this can be solved to give the optimal control signal (re-entering the argument k):

u*(k) = −[Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ x(k) = −L(k)x(k)

Inserting u*(k) back confirms that JkN has the assumed quadratic form,

JkN(x) = xᵀS(k)x
In summary, for the system x(k + 1) = Φx(k) + Γu(k) with the performance function

I = Σ_{k=0}^{N−1} (xᵀ(k)Q1x(k) + uᵀ(k)Q2u(k)) + xᵀ(N)QNx(N)

the optimal control is

u*(k) = −L(k)x(k)

with L(k) = [Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ and S(k) given by a backward recursion started in S(N) = QN. The value of the performance function using this input sequence will be J0N(x(0)) = xᵀ(0)S(0)x(0).
Note that we have assumed Q2 to be positive definite. This is not always strictly necessary, since it is sufficient that [Q2 + ΓᵀS(k + 1)Γ] is positive definite.

The first and the last expressions for S(k) below will be best suited for recursive calculations because they do not include matrix inversions.
The second expression is often suitable if you wish to let N → ∞ and find the value of S(0). This may be done on the condition that the recursive equation for S(k) converges to a stationary value, which can be found by letting S(k) = S(k + 1) = S; this leads to an Algebraic Riccati Equation.
It is important to note:

In some problems the interval [0; N] will represent the only interesting time interval, and you will not be interested in the behaviour of the plant after N samples. Here u(k) = −L(k)x(k) is a time varying feedback and the control task is finished after N samples.

In other problem areas, i.e. control systems, u(k) = −L(0)x(k) represents a constant feedback. This means that at each sample time you let the time horizon of the performance function be N samples ahead in time. In the performance function k = 0 represents the current time, and you push the time horizon N samples in front of you. This is called receding horizon control.

These notes will focus on the second control application.
This leads to the following procedure for calculation of the optimal controller:

1. k := N
2. S(k) = QN
3. REPEAT
   k := k − 1
   L(k) = [Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ
   S(k) = Q1 + Lᵀ(k)Q2L(k) + (Φ − ΓL(k))ᵀ S(k + 1) (Φ − ΓL(k))
        = Q1 + ΦᵀS(k + 1)Φ − ΦᵀS(k + 1)Γ[Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ
        = Q1 + ΦᵀS(k + 1)(Φ − ΓL(k))
4. UNTIL k = 0
5. Use the linear feedback u(k) = −L(0)x(k)
The obtained performance function will be

J0N(x(0)) = xᵀ(0)S(0)x(0)
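The five steps above translate almost line by line into code. A minimal sketch, using an assumed second order system (the matrices below are illustrative, not from the notes):

```python
import numpy as np

def lq_gain(Phi, Gam, Q1, Q2, QN, N):
    """Backward recursion of the procedure above: returns L(0) and S(0)."""
    S = QN.copy()
    for _ in range(N):
        G = Q2 + Gam.T @ S @ Gam
        L = np.linalg.solve(G, Gam.T @ S @ Phi)
        # S(k) = Q1 + L'Q2L + (Phi - Gam L)' S(k+1) (Phi - Gam L)
        Acl = Phi - Gam @ L
        S = Q1 + L.T @ Q2 @ L + Acl.T @ S @ Acl
    return L, S

# Illustrative 2nd order system (assumed values).
Phi = np.array([[1.0, 0.1], [0.0, 0.9]])
Gam = np.array([[0.0], [0.1]])
Q1 = np.diag([1.0, 0.0])
Q2 = np.array([[0.01]])
QN = Q1

L0, S0 = lq_gain(Phi, Gam, Q1, Q2, QN, N=200)
# With a long horizon the closed loop Phi - Gam L(0) should be stable.
eigs = np.linalg.eigvals(Phi - Gam @ L0)
print(np.all(np.abs(eigs) < 1.0))
```

Note that the form S(k) = Q1 + LᵀQ2L + (Φ − ΓL)ᵀS(Φ − ΓL) is used in the update, so the recursion itself involves no extra matrix inversion beyond the solve for L(k).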
We will illustrate the algorithm using our previously used 1st order system, where we calculated the optimal input sequence using a minimum search in each step. We repeat the system

x(k + 1) = ax(k) + bu(k)

with the performance function (i.e. Q1 = QN = 1, Q2 = q and N = 2)

I = Σ_{k=0}^{N−1} (x²(k) + qu²(k)) + x²(N)
STEP 0 (k = 2):

S(2) = QN = 1

STEP 1 (k = 1):

L(1) = [q + b²S(2)]⁻¹ abS(2) = ab/(q + b²)
S(1) = 1 + a(a − b · ab/(q + b²)) = 1 + q a²/(q + b²)

STEP 2 (k = 0):

L(0) = [q + b²(1 + q a²/(q + b²))]⁻¹ ab(1 + q a²/(q + b²)) = ab(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))
S(0) = 1 + a(1 + q a²/(q + b²))(a − bL(0)) = 1 + q a²(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))

J02(x(0)) = S(0)x²(0) = (1 + q a²(1 + q a²/(q + b²)) / (q + b²(1 + q a²/(q + b²)))) x²(0)

These gains and the performance value agree with the results found earlier by dynamic programming. Most often we will not need to know the value of the performance function.
Now let us continue the example and calculate the optimal controller when we let the time horizon go to infinity, N = ∞. The general recursions

L(k) = [Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ
S(k) = Q1 + ΦᵀS(k + 1)Φ − ΦᵀS(k + 1)Γ[Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ

become, for the scalar example,

L(k) = abS(k + 1) / (q + b²S(k + 1))
S(k) = 1 + a²S(k + 1) − a²b²S²(k + 1) / (q + b²S(k + 1))
In stationarity S(k) = S(k + 1) = S:

S = 1 + a²S − a²b²S²/(q + b²S)

which can be rearranged to the quadratic equation

0 = S² − S (b² + qa² − q)/b² − q/b²

with the positive solution

S = (1/(2b²)) [b² + qa² − q + √((b² + qa² − q)² + 4qb²)]

Inserting the stationary S in the expression for L(k) gives the stationary value of L:

L = (a/b) · (b² + qa² − q + √((b² + qa² − q)² + 4qb²)) / (b² + qa² + q + √((b² + qa² − q)² + 4qb²))
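The stationary solution can be cross-checked by simply iterating the scalar recursion until it settles. A sketch with assumed values of a, b and q:

```python
import math

a, b, q = 0.9, 0.5, 0.1

# Closed-form stationary solution derived above.
d = math.sqrt((b**2 + q*a**2 - q)**2 + 4*q*b**2)
S_closed = (b**2 + q*a**2 - q + d) / (2*b**2)
L_closed = a*(b**2 + q*a**2 - q + d) / (b*(b**2 + q*a**2 + q + d))

# Fixed-point iteration of the scalar Riccati recursion.
S = 1.0
for _ in range(500):
    S = 1.0 + a**2 * S - (a*b*S)**2 / (q + b**2 * S)
L_iter = a*b*S / (q + b**2*S)

print(abs(S - S_closed) < 1e-9, abs(L_iter - L_closed) < 1e-9)
```

The fixed point of the recursion and the closed-form root of the quadratic coincide, as they must.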
Often it will be necessary to change single elements relatively much, say by a factor of 10, to see substantial differences in the closed loop response.

If you know your open loop plant well enough to specify maximal values of each state and input (control) signal, be it from physical limitations or as an assessment of the desired maximal deviations, you can very straightforwardly choose the diagonal elements as

Q1(i, i) = 1/x²_{i,max}    Q2(i, i) = 1/u²_{i,max}

In addition you should use a scalar weighting factor ρ to balance tight control against economy.
The performance function now becomes

I = Σ_{k=0}^{N} ([x1²(k)/x²_{1,max} + x2²(k)/x²_{2,max} + … + xn²(k)/x²_{n,max}] + ρ[u1²(k)/u²_{1,max} + u2²(k)/u²_{2,max} + … + ur²(k)/u²_{r,max}])
The elements in the performance function are now scaled by their desired maximal values, so that each term lies in the interval [0; 1]. If you are able to assess these max-values, there is only one degree of freedom left, represented by the scalar ρ, whose task is to weight tight control against economy.
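This diagonal scaling rule can be sketched in a few lines. The maximal values and the scalar factor ρ (here called `rho`) below are illustrative assumptions:

```python
import numpy as np

# Assumed maximal deviations (illustrative numbers, not from the notes).
x_max = np.array([0.1, 2.0, 5.0])   # per-state maxima
u_max = np.array([10.0])            # per-input maxima
rho = 0.5                           # scalar trading tight control vs economy

Q1 = np.diag(1.0 / x_max**2)
Q2 = rho * np.diag(1.0 / u_max**2)

# A state sitting exactly at its maximum contributes 1 to x'Q1x per sample:
x = np.array([0.1, 0.0, 0.0])
print(abs(x @ Q1 @ x - 1.0) < 1e-12)
```

The design choice here is that every state and input term is normalized to the same scale before the single knob `rho` is turned, which is exactly the "one degree of freedom" described above.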
For a continuous time system

dx(t)/dt = Ax(t) + Bu(t)

and the quadratic performance function

I = ∫₀ᵀ (xᵀ(t)Q1x(t) + uᵀ(t)Q2u(t)) dt + xᵀ(T)QT x(T)

with Q1, Q2 and QT positive definite, the optimal input signal will be given by u(t) = −L(t)x(t), where

L(t) = Q2⁻¹BᵀS(t)

with S(t) the solution of the differential Riccati equation

−dS(t)/dt = Q1 + AᵀS(t) + S(t)A − S(t)BQ2⁻¹BᵀS(t),    S(T) = QT

The performance function using this input signal will be J(x(0)) = xᵀ(0)S(0)x(0).
Example (the limiting case of no weight on the input):

Discrete time: L(k) = 1, and u(k) = −L(k)x(k) is inserted in the state equation to obtain the closed loop equation:

x(k + 1) = x(k) + u(k)
x(k + 1) = x(k) + (−1 · x(k)) = 0

This is a dead-beat regulator, taking the state from the initial x(0) to origo in the state space in one step.

Continuous time: L(t) = ∞, i.e. u(t) = −∞ · x(t). It takes an infinite available input (control) signal to take the state from a finite initial state x(0) to origo in the state space in zero time. For this reason we require Q2 to be positive definite.
Chapter 4
Stationary LQ-Controllers

From Chapter 3 we have, in continuous time, the time varying solution L(t) = Q2⁻¹BᵀS(t) with S(t) given by the differential Riccati equation. If we seek an optimal controller for a problem where the performance index is a sum or integral over a time interval increasing towards infinity, the controller becomes independent of time: L(0) = L(k) = L(k + 1) = L. The steady state value of L and the corresponding steady state value of S(0) = S(k) = S(k + 1) = S are solutions of a set of steady state Riccati equations.

Discrete time:

L = [Q2 + ΓᵀSΓ]⁻¹ ΓᵀSΦ
S = Q1 + ΦᵀSΦ − ΦᵀSΓ[Q2 + ΓᵀSΓ]⁻¹ ΓᵀSΦ

Continuous time:

L = Q2⁻¹BᵀS
0 = Q1 + AᵀS + SA − SBQ2⁻¹BᵀS
In both cases the equations for S are nonlinear and difficult to solve. The equations are referred to as algebraic Riccati equations, AREs.

If the order of the system is three or larger, the equations are almost impossible to solve by hand. Fortunately it is possible to solve the equations iteratively, simply by using the same algorithm as in the case with a finite time horizon in the performance sum, with the one difference that you do not stop the iteration loop for S(k) and L(k) after a fixed number (N) of repetitions, but continue the iteration until no more significant (using a suitable criterion) changes of S(k) and L(k) appear.

Solutions of the stationary Riccati equations in continuous or discrete time may also be found using the MatLab Control Toolbox function lqr.
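In Python, a comparable computation to the MATLAB call is available through SciPy (this is an alternative tool, not one used by the notes). The sketch below also checks the direct ARE solution against the iterative procedure described above, for the scalar example:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Scalar example from earlier: x(k+1) = a x(k) + b u(k), weights Q1 = 1, Q2 = q.
a, b, q = 0.9, 0.5, 0.1
Phi = np.array([[a]]); Gam = np.array([[b]])
Q1 = np.array([[1.0]]); Q2 = np.array([[q]])

# Direct solution of the discrete algebraic Riccati equation.
S = solve_discrete_are(Phi, Gam, Q1, Q2)
L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)

# Iterative solution: run the finite-horizon recursion until S settles.
S_it = Q1.copy()
for _ in range(1000):
    Lk = np.linalg.solve(Q2 + Gam.T @ S_it @ Gam, Gam.T @ S_it @ Phi)
    S_it = Q1 + Phi.T @ S_it @ Phi - Phi.T @ S_it @ Gam @ Lk

print(np.allclose(S, S_it, atol=1e-8))
```

Both routes give the same stationary S, and hence the same stationary gain L in u(k) = −Lx(k).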
Having found S and L we will use the optimal controller u(k) = −Lx(k). The obtained value of the performance index will be

J0(x(0)) = xᵀ(0)Sx(0)

In practice there is seldom any use for the value of the performance function. However, the performance function can only obtain a stationary value if the state vector x(k) goes to origo in state space. An optimal controller using the steady state value of L to calculate the control signal will therefore always force the state from an initial value x(0) towards origo.
Example: consider the system

dx(t)/dt = [0 1; 0 −1.53] x(t) + [0; 76.7] u(t) = Ax(t) + Bu(t)
y(t) = [0.013 0] x(t)

Figure 4.1: Block diagram of the system: u(s) → 76.7/(s + 1.53) → x2(s) → 1/s → x1(s) → 0.013 → y(s).

Discretized with a sampling time of 10 ms:

x(k + 1) = [1 0.009924; 0 0.9848] x(k) + [0.003816; 0.7612] u(k) = Φx(k) + Γu(k)
y(k) = [0.013 0] x(k) = Hx(k)
Page 21 of 61
We will use the performan
e fun
tion
X
(xT (k)Q1 x(k) + u(k)Q2 u(k))
I=
0
Q1 =
1 0
0 0
As it may be seen only the state variable representing position is weighted, while there is no
weight on the state variable representing speed.
The weight matrix Q2 is in this
ase with only one input a s
alar q2 . We will now give q2
several values and
al
ulate the optimal
ontroller in ea
h
ase using the general re
ursive
equations until L(k) has a
hieved its steady state value within 1 %.
The result is
q2
q2
q2
q2
:
:
:
:
=0
= 0.00001
= 0.001
= 0.1
Using these four
ontrollers we have simulated
losed loop progress of the output y(t) and the
input u(t) with the initial state
x(0) =
x1 (0)
x2 (0)
1
0
The results of these simulations are shown in gures 4.2, 4.3, 4.4 and 4.5.
q2 = 0.00001: This gives a very fast progress of the output with a small overshoot, but the control input is very large, with a maximum of approximately 100.

q2 = 0.001: This gives a reasonably fast progress of the output with a small overshoot, and now the control input is more reasonable, with a maximum of 22.

q2 = 0.1: This gives a very slow progress of the output, and only a very small control signal, with a maximum of 2.7.
Figure 4.2: Simulated input u and output y for q2 = 0.
Figure 4.3: Simulated input u and output y for q2 = 0.00001.
Figure 4.4: Simulated input u and output y for q2 = 0.001.
Figure 4.5: Simulated input u and output y for q2 = 0.1.
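The effect of q2 on the size of the control signal can be reproduced in a short simulation. A sketch of the experiment described above, with the stationary gains obtained by Riccati iteration (the number of iterations and simulation length are assumptions):

```python
import numpy as np

# The discretized system from the example above.
Phi = np.array([[1.0, 0.009924], [0.0, 0.9848]])
Gam = np.array([[0.003816], [0.7612]])
Q1 = np.diag([1.0, 0.0])

def stationary_gain(q2, iters=5000):
    Q2 = np.array([[q2]])
    S = Q1.copy()
    for _ in range(iters):
        L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
        S = Q1 + Phi.T @ S @ Phi - Phi.T @ S @ Gam @ L
    return L

def peak_input(q2, steps=400):
    L = stationary_gain(q2)
    x = np.array([[1.0], [0.0]])     # x(0) = [1; 0] as in the simulations
    peak = 0.0
    for _ in range(steps):
        u = -(L @ x)
        peak = max(peak, abs(u.item()))
        x = Phi @ x + Gam @ u
    return peak

# Larger q2 punishes the input harder, so the peak input shrinks.
print(peak_input(0.00001) > peak_input(0.001) > peak_input(0.1))
```

This reproduces the qualitative trend of figures 4.2-4.5: the input weight is the knob that trades response speed against control effort.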
Example: consider the system

ÿ(t) + aẏ(t) = u(t)

A state space model of the system can be given with the state variables x1(t) = y(t) and x2(t) = ẏ(t):

ẋ1(t) = ẏ(t) = x2(t)
ẋ2(t) = ÿ(t) = −aẏ(t) + u(t) = −ax2(t) + u(t)

These equations combine directly to a full state space model:

ẋ(t) = [0 1; 0 −a] x(t) + [0; 1] u(t)
y(t) = [1 0] x(t)
The performance function is chosen as

I = ∫₀^∞ (xᵀ(t) [1 0; 0 0] x(t) + u(t)q u(t)) dt

so that

A = [0 1; 0 −a],  B = [0; 1],  Q1 = [1 0; 0 0],  Q2 = q
We have chosen only to "punish" the output y(t) = x1(t), and put no restrictions on the second state variable x2(t). We partition the steady state value of the matrix S(t) as below,

S = [S11 S12; S21 S22] = [S11 S12; S12 S22]

using the symmetry of S, so that

L = Q2⁻¹BᵀS = (1/q)[0 1][S11 S12; S12 S22] = (1/q)[S12 S22]
The stationary Riccati equation 0 = Q1 + AᵀS + SA − SBQ2⁻¹BᵀS written out is

[0 0; 0 0] = [1 0; 0 0] + [0 0; 1 −a][S11 S12; S12 S22] + [S11 S12; S12 S22][0 1; 0 −a] − [S11 S12; S12 S22][0; 1](1/q)[0 1][S11 S12; S12 S22]

= [1 0; 0 0] + [0 0; S11 − aS12, S12 − aS22] + [0, S11 − aS12; 0, S12 − aS22] − (1/q)[S12; S22][S12 S22]
This may be written as three scalar equations (the two off-diagonal elements give the same equation because S is symmetric):

1. 0 = 1 − (1/q)S12²
2. 0 = S11 − aS12 − (1/q)S12S22
3. 0 = 2S12 − 2aS22 − (1/q)S22²

From 1. we obtain: S12 = √q, and from 3.: S22 = −aq + √(a²q² + 2q√q)

The control signal becomes

u(t) = −(1/q)[S12 S22] x(t) = −(1/q)(S12x1(t) + S22x2(t))
The closed loop system is

ẋ(t) = (A − BL)x(t) = ([0 1; 0 −a] − [0; 1](1/q)[S12 S22]) x(t) = [0, 1; −S12/q, −a − S22/q] x(t)
The closed loop poles are determined by the characteristic equation of the matrix A − BL:

det(sI − (A − BL)) = 0
s² + (a + S22/q)s + S12/q = 0

Comparing this with the characteristic equation of a second order system in standard form, s² + 2ζωₙs + ωₙ² = 0, you will find:

characteristic frequency: ωₙ = 1/q^{1/4}
damping ratio: ζ = (1/√2)√(1 + a²√q/2)
For some extreme values of a and q we find:

q = 0 :  ζ = 1/√2
q = ∞ :  ωₙ = 0
a = 0 :  ωₙ = 1/q^{1/4},  ζ = 1/√2
These values found with the LQ criterion justify that classic controller design often uses values of the damping ratio in the proximity of 1/√2. For pure inertia systems, ÿ = kst·u(t) (i.e. a = 0), a damping ratio of 1/√2 will be optimal independent of the choice of the weighting q.
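The damping ratio result for a = 0 can be checked numerically. The sketch below uses SciPy's continuous time ARE solver (an assumption about tooling, not part of the notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def closed_loop_zeta(a, q):
    """Damping ratio of the LQ closed loop for ydotdot + a*ydot = u."""
    A = np.array([[0.0, 1.0], [0.0, -a]])
    B = np.array([[0.0], [1.0]])
    Q1 = np.diag([1.0, 0.0])
    Q2 = np.array([[q]])
    S = solve_continuous_are(A, B, Q1, Q2)
    L = np.linalg.solve(Q2, B.T @ S)
    poles = np.linalg.eigvals(A - B @ L)
    # s^2 + 2*zeta*wn*s + wn^2: read off the coefficients from the poles.
    c1 = -(poles[0] + poles[1]).real     # = 2*zeta*wn
    c0 = (poles[0] * poles[1]).real      # = wn^2
    return c1 / (2.0 * np.sqrt(c0))

# For a pure inertia system (a = 0) zeta should be 1/sqrt(2) for any q.
print(all(abs(closed_loop_zeta(0.0, q) - 1/np.sqrt(2)) < 1e-6
          for q in (0.01, 1.0, 100.0)))
```

Varying q over four orders of magnitude leaves the damping ratio fixed at 1/√2, while only the characteristic frequency moves.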
Chapter 5
References and disturbances
Until now we have considered optimal control of a system described in state space using a quadratic performance function. The purpose was to bring the system from an initial state to origo in state space. The balance between speed and reasonable use of the control signal was determined by the weighting in the performance index.

A practical control problem will naturally most often be to make the output y(t) track a reference signal r(t), and/or keep the output close to the reference, without using too large control signals u(t). Furthermore the system will most often have disturbances influencing the output.
It is obvious that it is not possible to optimise a controller for all possible reference signals and disturbances. In classical control it is common practice to adjust the parameters of, for example, a PID controller to give the "best possible" step response. However, this does not imply that the response to other reference forms is "best possible".

If you use optimal control, the same conditions are valid. It is possible to find a controller which is optimal for certain classes of reference and disturbance signals, modelled as

x(k + 1) = Φr x(k)
r(k) = Hr x(k)

A description like this, without any input and just initiated by the value of x(0), is called an autonomous state space description, see figure 5.1.
Figure 5.1: Autonomous reference model: the state is fed back through Φr and a unit delay z⁻¹, and the reference r(t) is formed from the state through Hr.
a) Constant, r(t) = K, where K is any constant.

The state space description of this is very simple:

x(k + 1) = x(k),    x(0) = K
r(k) = x(k)
b) Ramp, r(t) = K1t, where K1 is an arbitrary constant.

r(t) may be modelled as the solution of the difference equation (Ts: sampling time):

(r(k) − r(k − 1))/Ts = (r(k − 1) − r(k − 2))/Ts

or:

x(k + 1) = [2 −1; 1 0] x(k),    x(0) = [0; −K1Ts]
r(k) = [1 0] x(k)
c) Acceleration, r(t) = K2t²

Here the second difference of r(t) is constant:

[(r(k) − r(k − 1))/Ts − (r(k − 1) − r(k − 2))/Ts]/Ts = [(r(k − 1) − r(k − 2))/Ts − (r(k − 2) − r(k − 3))/Ts]/Ts

or:

x1(k + 1) = 3x1(k) − 3x2(k) + x3(k)
x2(k + 1) = x1(k)
x3(k + 1) = x2(k)

and we achieve

x(k + 1) = [3 −3 1; 1 0 0; 0 1 0] x(k),    x(0) = [0; K2Ts²; 4K2Ts²]
r(k) = [1 0 0] x(k)
Similar descriptions may be found for continuous time signals. As an example we show the model of a cosine.

d) Cosine, r(t) = K3 cos(at), where K3 is an arbitrary constant.

r(t) is the solution of the differential equation

r̈(t) = −a²r(t)

We introduce the state variables x1(t) = r(t) and x2(t) = ṙ(t), to achieve ẋ1(t) = x2(t) and ẋ2(t) = −a²x1(t), or as a full state space description

ẋ(t) = [0 1; −a² 0] x(t),    x(0) = [K3; 0]
r(t) = [1 0] x(t)
e) System with more than one reference

If a system has m outputs, the reference vector must also have m elements. We will show an example of this. A system with two outputs should also have two references. We want to model one reference as a constant of size K; the other reference is modelled as a ramp with slope K1. The constant and the ramp may be modelled as shown below in an autonomous state space model with two outputs:

xr(k + 1) = [1 0 0; 0 2 −1; 0 1 0] xr(k),    xr(0) = [K; 0; −K1Ts]

The first state variable represents the constant and the second state variable represents the ramp. The third state variable is not present in the reference vector, which is taken as output by a 2 × 3 output matrix with zeros in the third column:

r(k) = [1 0 0; 0 1 0] xr(k)

The full reference model is thus

xr(k + 1) = Φr xr(k),  where Φr = [1 0 0; 0 2 −1; 0 1 0]
r(k) = Hr xr(k),  where Hr = [1 0 0; 0 1 0]

and xr(0) = [K; 0; −K1Ts].
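The ramp model from b) can be simulated directly. A minimal sketch with assumed values of K1 and Ts:

```python
import numpy as np

# Ramp reference r(t) = K1*t generated by the autonomous model above.
# Assumed values: K1 = 2.0, Ts = 0.1 (illustrative only).
K1, Ts = 2.0, 0.1

Phi_r = np.array([[2.0, -1.0], [1.0, 0.0]])
Hr = np.array([[1.0, 0.0]])
x = np.array([[0.0], [-K1 * Ts]])   # x1(0) = r(0), x2(0) = r(-1)

r = []
for k in range(5):
    r.append((Hr @ x).item())
    x = Phi_r @ x

print([round(v, 6) for v in r])  # -> [0.0, 0.2, 0.4, 0.6, 0.8]
```

The recursion r(k + 1) = 2r(k) − r(k − 1) extrapolates the constant slope, so r(k) = K1·k·Ts exactly, driven only by the initial state.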
Chapter 6
Control of a System with a Reference Model

System model:

xs(k + 1) = Φs xs(k) + Γs u(k)
y(k) = Hs xs(k)

Reference model:

xr(k + 1) = Φr xr(k)
r(k) = Hr xr(k)

With the error e(k) = r(k) − y(k) we use the performance function

I = Σ_{k=0}^{N−1} (eᵀ(k)Q1e e(k) + uᵀ(k)Q2u(k)) + eᵀ(N)QNe e(N)

The system state xs(k) is augmented by the reference state xr(k), giving the augmented state vector x(k) = [xsᵀ(k) xrᵀ(k)]ᵀ. The augmented system is described by

[xs(k + 1); xr(k + 1)] = [Φs 0; 0 Φr] [xs(k); xr(k)] + [Γs; 0] u(k)
e(k) = [−Hs | Hr] [xs(k); xr(k)]
Defining

Φ = [Φs 0; 0 Φr],  Γ = [Γs; 0],  H = [−Hs | Hr]

the performance function may be written

I = Σ_{k=0}^{N−1} (eᵀ(k)Q1e e(k) + uᵀ(k)Q2u(k)) + eᵀ(N)QNe e(N)
  = Σ_{k=0}^{N−1} (xᵀ(k)Hᵀ Q1e H x(k) + uᵀ(k)Q2u(k)) + xᵀ(N)Hᵀ QNe H x(N)
Note that this performance function has the structure we have used earlier. Therefore we can use the recursive expressions derived earlier to find L(k) and S(k), using the weighting matrices

Q1 = Hᵀ Q1e H = [Hsᵀ Q1e Hs  −Hsᵀ Q1e Hr; −Hrᵀ Q1e Hs  Hrᵀ Q1e Hr]

QN = Hᵀ QNe H = [Hsᵀ QNe Hs  −Hsᵀ QNe Hr; −Hrᵀ QNe Hs  Hrᵀ QNe Hr]

The resulting controller is

u(k) = −L(0)x(k) = −[Ls(0) | Lr(0)] [xs(k); xr(k)] = −Ls(0)xs(k) − Lr(0)xr(k)
A block diagram of this control system is shown in figure 6.1.

Most often the reference can be modelled as a step with the autonomous model

xr(k + 1) = I xr(k)
r(k) = I xr(k)

This results in the structure shown in figure 6.2.
Figure 6.1: Control with a reference model: the control signal u(k) = −Ls(0)xs(k) − Lr(0)xr(k) combines feedback from the system state through Ls(0) with feedforward from the reference state through Lr(0).

Figure 6.2: The same structure for a step reference, where the reference state equals r(k) itself.
We will now show that the feedback from the states of the original system, −Ls(0)xs(k), will be equal to the feedback we would get if the reference had been zero, as it was earlier. For the augmented system we have the equations
(partitioning S(k) conformably with x = [xs; xr])

S(k) = [S11(k) S12(k); S21(k) S22(k)]

L(k) = [Q2 + ΓᵀS(k + 1)Γ]⁻¹ ΓᵀS(k + 1)Φ
     = [Q2 + [Γsᵀ | 0] [S11(k + 1) S12(k + 1); S21(k + 1) S22(k + 1)] [Γs; 0]]⁻¹ [Γsᵀ | 0] [S11(k + 1) S12(k + 1); S21(k + 1) S22(k + 1)] [Φs 0; 0 Φr]
     = [Q2 + Γsᵀ S11(k + 1)Γs]⁻¹ [Γsᵀ S11(k + 1) | Γsᵀ S12(k + 1)] [Φs 0; 0 Φr]
     = [Q2 + Γsᵀ S11(k + 1)Γs]⁻¹ [Γsᵀ S11(k + 1)Φs | Γsᵀ S12(k + 1)Φr]

From this we can pull out the sub matrix corresponding to the original system:

Ls(k) = [Q2 + Γsᵀ S11(k + 1)Γs]⁻¹ Γsᵀ S11(k + 1)Φs

We have now shown the interesting result that Ls(0), the feedback proportionality matrix from the system state xs(k), is independent of the presence of a reference vector and can be obtained using the weight matrix Q1 = Hsᵀ Q1e Hs.
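The independence of Ls(0) from the reference can be verified numerically for a scalar system with a constant reference. The values of a, b, c, q1e and q below are illustrative assumptions:

```python
import numpy as np

def stationary_gain(Phi, Gam, Q1, Q2, iters=3000):
    """Iterate the LQ Riccati recursion until the gain settles."""
    S = Q1.copy()
    for _ in range(iters):
        L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
        S = Q1 + Phi.T @ S @ Phi - Phi.T @ S @ Gam @ L
    return L

# Scalar system with a constant reference (assumed values).
a, b, c, q1e, q = 0.9, 0.5, 2.0, 1.0, 0.1

# Augmented model: x = [xs, xr], e(k) = -c*xs(k) + xr(k).
Phi = np.array([[a, 0.0], [0.0, 1.0]])
Gam = np.array([[b], [0.0]])
H = np.array([[-c, 1.0]])
Q1 = H.T * q1e @ H
Q2 = np.array([[q]])

L = stationary_gain(Phi, Gam, Q1, Q2)   # L = [Ls | Lr]

# Reference-free design weighting the output y = c*xs with the same q1e.
Ls_ref_free = stationary_gain(np.array([[a]]), np.array([[b]]),
                              np.array([[c**2 * q1e]]), Q2)

print(np.allclose(L[0, 0], Ls_ref_free[0, 0]))
```

The state-feedback part of the augmented gain coincides with the gain of the reference-free design with Q1 = c²q1e, as the partitioned derivation above predicts; only the feedforward part Lr is new.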
Page 35 of 61
System:
x_s(k+1) = a x_s(k) + b u(k)
y(k) = c x_s(k)

Reference:
x_r(k+1) = x_r(k),  x_r(0) = K
r(k) = x_r(k)

With e(k) = r(k) - y(k), we seek a minimum for the performance index
I = Σ_{k=0}^{N-1} [q_1e e^2(k) + q_2 u^2(k)]
  = Σ_{k=0}^{N-1} [q_1e (r(k) - y(k))^2 + q_2 u^2(k)]
Because the weighting matrices are scalars we can fix q_1e = 1 and consider q_2 = q in the interval [0; ∞). For the augmented system the two-dimensional state vector x^T(k) = [x_s(k) | x_r(k)] is introduced, giving the model:
x(k+1) = [a, 0; 0, 1] x(k) + [b; 0] u(k) = Φ x(k) + Γ u(k)
e(k) = [-c  1] x(k) = H x(k)
The performance index is:

I = Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + q_2 u^2(k)] + x^T(N) Q_N x(N)

with
Q_1 = H^T q_1e H = [-c; 1] 1 [-c  1] = [c^2, -c; -c, 1]
Q_2 = q_2 = q
Q_N = H^T q_Ne H = [-c; 1] q_Ne [-c  1] = q_Ne [c^2, -c; -c, 1]
Combining this with the augmented matrices Φ, Γ and H, you may find L(0) and S(0) using the usual recursive expressions.
This leads to the controller

u(k) = -L(0)x(k) = -[L_s(0) | L_r(0)] [x_s(k); x_r(k)] = -L_s(0)x_s(k) - L_r(0)x_r(k)
Figure 6.3: Block diagram of the scalar example (plant c/(z - a), feedback -L_s(0) on x_s(k), feedforward -L_r(0) from x_r(k)).
The gain on the system state is

L_s(k) = a b S_11(k+1) / (q + b^2 S_11(k+1)),
that is, precisely the expression you would achieve if the reference were zero and you chose to weight the output y(k) using the weighting matrix Q_1y = Q_1e.
x(k+1) = [1, 0.009924; 0, 0.9848] x(k) + [0.003816; 0.7612] u(k)
y(k) = [0.013  0] x(k)

We used

Q_1 = [1, 0; 0, 0]
That is, the state x_1(k) was penalized with factor 1, corresponding to a weight on the output y(k) of 1/(0.013)^2 = 5917, while the state x_2 was weighted with a factor 0.

Simulations showed that a suitable value of Q_2 = q_2 would be 0.001.
For this example we will augment the state description with a reference state as a third state variable:

x(k+1) = [1, 0.009924, 0; 0, 0.9848, 0; 0, 0, 1] x(k) + [0.003816; 0.7612; 0] u(k)
e(k) = [-0.013  0  1] x(k)
We will use the performance index

I = Σ_{k=0}^{N-1} [q_1e e^2(k) + q_2 u^2(k)]
Again we choose q_2 = 0.001 and q_1e = 5917, i.e. the same weight on the output as we used when the reference was zero. We achieve:

Q_1 = [-0.013; 0; 1] 5917 [-0.013  0  1] = [1, 0, -76.92; 0, 0, 0; -76.92, 0, 5917]
x^T(0) = [0  0  1]

The third state variable, which is the reference state, is initialized to 1, introducing a unity step on the reference with the system state initially being zero.
It can be seen that L(0) is unchanged from the earlier calculation, resulting in the same closed loop dynamics. This is confirmed by the simulated step response.

Note that the steady state error is zero only because the system itself has an integration. If this had not been the case, you would have seen a steady state error, because the controller in principle is a proportional controller.
Figure 6.4: Closed loop simulation of step response; upper plot u(k), lower plot y(k).
Chapter 7

Using a Disturbance Model in the Control Law
Model of system:
x_s(k+1) = Φ_s x_s(k) + Γ_s u(k) + Γ_d d(k)
y(k) = H_s x_s(k)

Model of disturbance:
x_d(k+1) = Φ_d x_d(k)
d(k) = H_d x_d(k)

Performance index:
I = Σ_{k=0}^{N-1} [y^T(k) Q_1y y(k) + u^T(k) Q_2 u(k)] + y^T(N) Q_Ny y(N)
The system state vector x_s(k) is augmented with the disturbance state vector x_d(k); the augmented state vector is thus x(k) = [x_s^T(k) x_d^T(k)]^T. The equations describing the augmented states are:
[x_s(k+1); x_d(k+1)] = [Φ_s, Γ_d H_d; 0, Φ_d] [x_s(k); x_d(k)] + [Γ_s; 0] u(k)
y(k) = [H_s | 0] [x_s(k); x_d(k)]

that is,

Φ = [Φ_s, Γ_d H_d; 0, Φ_d],  Γ = [Γ_s; 0],  H = [H_s | 0]
I = Σ_{k=0}^{N-1} [y^T(k) Q_1y y(k) + u^T(k) Q_2 u(k)]
  = Σ_{k=0}^{N-1} [x^T(k) H^T Q_1y H x(k) + u^T(k) Q_2 u(k)]
  = Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)]
We observe that the performance index has the same form as in the general derivations; therefore we can use the general recursive expressions derived earlier to determine L(k) and S(k), using
Q_1 = H^T Q_1y H = [H_s^T; 0] Q_1y [H_s | 0] = [H_s^T Q_1y H_s, 0; 0, 0]
Q_N = H^T Q_Ny H = [H_s^T; 0] Q_Ny [H_s | 0] = [H_s^T Q_Ny H_s, 0; 0, 0]
The control law becomes

u(k) = -L(0)x(k) = -[L_s(0) | L_d(0)] [x_s(k); x_d(k)] = -L_s(0)x_s(k) - L_d(0)x_d(k)
Also in this case it can be shown that L_s(0), the feedback from the system state vector, is precisely the same as if there were no disturbance. L_s(0) is, in other words, independent of the disturbance model. The structure of the controlled system is shown in figure 7.1.
Figure 7.1: Structure of the controlled system with disturbance model (disturbance state x_d(k) fed back through -L_d(0), system state x_s(k) through -L_s(0)).
Chapter 8

LQ-Control with integral action
In chapter 6 and chapter 7 we introduced state vectors representing references and disturbances acting on the systems. Controllers for systems including a disturbance model rely on measurements of the disturbance state vector. If a model and measurements of the disturbances are available, this method may be suitable.
If no detailed model and measurements of the disturbances are available, it may be a better approach to account for at least constant disturbances by introducing an extra state vector representing the integral of the control error:

x_i(k+1) = x_i(k) + e(k)

where e(k) = r(k) - y(k) is the control error.
The performance index is extended with a weight on the integral state:

I = Σ_{k=0}^{N-1} [e^T(k) Q_1e e(k) + x_i^T(k) Q_1i x_i(k) + u^T(k) Q_2 u(k)]
x_s(k+1) = Φ_s x_s(k) + Γ_s u(k)
y(k) = H_s x_s(k)
x_r(k+1) = Φ_r x_r(k)
r(k) = H_r x_r(k)
x_i(k+1) = x_i(k) + e(k)
e(k) = H_r x_r(k) - H_s x_s(k)
With the augmented state vector x(k) = [x_s^T(k) x_r^T(k) x_i^T(k)]^T the state space description is
x(k+1) = [Φ_s, 0, 0; 0, Φ_r, 0; -H_s, H_r, I] x(k) + [Γ_s; 0; 0] u(k) = Φ x(k) + Γ u(k)
y(k) = [H_s | 0 | 0] x(k) = H_y x(k)
e(k) = [-H_s | H_r | 0] x(k) = H_e x(k)
x_i(k) = [0 | 0 | I] x(k) = H_i x(k)
The performance index is rewritten

I = Σ_{k=0}^{N-1} [e^T(k) Q_1e e(k) + x_i^T(k) Q_1i x_i(k) + u^T(k) Q_2 u(k)]
  = Σ_{k=0}^{N-1} [x^T(k) (H_e^T Q_1e H_e + H_i^T Q_1i H_i) x(k) + u^T(k) Q_2 u(k)]
  = Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)]
with

Q_1 = H_e^T Q_1e H_e + H_i^T Q_1i H_i = [H_s^T Q_1e H_s, -H_s^T Q_1e H_r, 0; -H_r^T Q_1e H_s, H_r^T Q_1e H_r, 0; 0, 0, Q_1i]

Q_N = H_e^T Q_Ne H_e + H_i^T Q_Ni H_i = [H_s^T Q_Ne H_s, -H_s^T Q_Ne H_r, 0; -H_r^T Q_Ne H_s, H_r^T Q_Ne H_r, 0; 0, 0, Q_Ni]
The control law becomes

u(k) = -L(0)x(k) = -[L_s(0) | L_r(0) | L_i(0)] [x_s(k); x_r(k); x_i(k)] = -L_s(0)x_s(k) - L_r(0)x_r(k) - L_i(0)x_i(k)
corresponding to the structure shown in figure 8.1.

Figure 8.1: Structure of the optimal controller with reference state and integral state.
System:
x_s(k+1) = a x_s(k) + b u(k)
y(k) = c x_s(k)

Reference model:
x_r(k+1) = x_r(k)
r(k) = x_r(k)

Integral state:
x_i(k+1) = x_i(k) + e(k)
e(k) = r(k) - y(k)

Performance index:
I = Σ_{k=0}^{N-1} [q_1e e^2(k) + q_1i x_i^2(k) + q_2 u^2(k)]
With the augmented state vector x(k) = [x_s(k) x_r(k) x_i(k)]^T the state equation is

x(k+1) = [a, 0, 0; 0, 1, 0; -c, 1, 1] x(k) + [b; 0; 0] u(k) = Φ x(k) + Γ u(k)
y(k) = [c  0  0] x(k) = H_y x(k)
e(k) = [-c  1  0] x(k) = H_e x(k)
x_i(k) = [0  0  1] x(k) = H_i x(k)
The performance index is

I = Σ_{k=0}^{N-1} [q_1e e^2(k) + q_1i x_i^2(k) + q_2 u^2(k)]
  = Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + q_2 u^2(k)]
with

Q_1 = [-c; 1; 0] q_1e [-c  1  0] + [0; 0; 1] q_1i [0  0  1] = [q_1e c^2, -q_1e c, 0; -q_1e c, q_1e, 0; 0, 0, q_1i]

Q_2 = q_2

Q_N = [q_Ne c^2, -q_Ne c, 0; -q_Ne c, q_Ne, 0; 0, 0, q_Ni]
The control law becomes

u(k) = -L(0)x(k) = -[L_s(0) | L_r(0) | L_i(0)] [x_s(k); x_r(k); x_i(k)] = -L_s(0)x_s(k) - L_r(0)x_r(k) - L_i(0)x_i(k)
Figure 8.2: The structure of the optimal controller with reference and integral state.
Chapter 9

Stochastic LQ Control with Full State Information
In the case with incomplete state information, some states are not measured and, in addition, the available measurements may be noisy. It is then necessary to reconstruct the state vector from measured values of the outputs of the system, y(k), using an observer, also called a state estimator. In this chapter we first treat the simpler case where the full state vector x(k) can be measured exactly.
At each sampling time k we can measure x(k) and determine u(k). In the time between samples k and k+1 the control signal u(k) will be known, but the next state vector x(k+1) will be uncertain because of the non-measurable stochastic noise e_x(k).

The system has n state variables and p inputs. Φ has dimensions n × n and Γ has dimensions n × p. Furthermore we will assume that the system is stabilizable and that the input signal u(k) is unlimited.
The state noise e_x(k) is a vector of dimension n consisting of white, normally distributed sequences with

Expected value:    E{e_x(k)} = 0
Variance:          E{e_x(k) e_x^T(k)} = R_ex (symmetric, n × n)
Covariance matrix: E{e_x(k) e_x^T(k+l)} = R_ex δ(l)

The initial state is characterized by

E{x(0)} = x_m(0)
E{(x(0) - x_m(0))(x(0) - x_m(0))^T} = R_x(0)
With the performance index

I_0^N = Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)] + x^T(N) Q_N x(N)
we will not be able to tell in advance (at sample 0) which values of u(k) will minimize the performance function, or what the value of the minimum will be. These values depend on the inputs e_x(k), which are not known at the time we need to decide the value of u. If we know the stochastic properties of e_x, as we assume, we may however calculate stochastic properties of the performance resulting from input sequences u_1 and u_2, for instance whether E{I_1} < E{I_2}.
We will use the expected value of the squared sum directly as performance function:

I_0^N = E{ Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)] + x^T(N) Q_N x(N) }
      = E{ Σ_{k=0}^{N} H(x(k), u(k)) }

where H(x(k), u(k)) = x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k) for k < N, and H(x(N), u(N)) = x^T(N) Q_N x(N).
Defining the cost-to-go from sample k,

I_k^N = E{ Σ_{i=k}^{N} H(x(i), u(i)) }

we have

min_{u(k),...,u(N)} I_k^N = min_{u(k),...,u(N)} E{ Σ_{i=k}^{N} H(x(i), u(i)) }
                          = min_{u(k),...,u(N)} E{ E{ Σ_{i=k}^{N} H(x(i), u(i)) | x(k) } }

and for the conditional expectation

min_{u(k),...,u(N)} E{ Σ_{i=k}^{N} H(x(i), u(i)) | x(k) }
  = min_{u(k),...,u(N)} E{ H(x(k), u(k)) + Σ_{i=k+1}^{N} H(x(i), u(i)) | x(k) }
  = min_{u(k),...,u(N)} ( H(x(k), u(k)) + E{ Σ_{i=k+1}^{N} H(x(i), u(i)) | x(k) } )
  = min_{u(k)} ( H(x(k), u(k)) + min_{u(k+1),...,u(N)} E{ Σ_{i=k+1}^{N} H(x(i), u(i)) | x(k) } )
Here E{ · | x(k)} designates expectation conditioned on x(k). We will now introduce the term

J_k^N(x(k)) = min_{u(k),...,u(N)} E{ Σ_{i=k}^{N} H(x(i), u(i)) | x(k) }
so that

min_{u(k),...,u(N)} I_k^N = E{ J_k^N(x(k)) }

with the recursion

J_k^N(x(k)) = min_{u(k)} ( H(x(k), u(k)) + E{ J_{k+1}^N(x(k+1)) } )

For the quadratic performance index this reads

J_k^N(x(k)) = min_{u(k)} ( x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k) + E{ J_{k+1}^N(x(k+1)) } )
First an expression for E{v^T(k) A v(k)} will be derived for use in the following sections. v(k) is assumed to be a stochastic process with expectation v_m(k) and covariance matrix R_v; A is a constant square matrix. The result is

E{v^T(k) A v(k)} = v_m^T(k) A v_m(k) + tr[A R_v]

where tr[A] is termed the trace of A and is defined as the sum of the diagonal elements of A.
J_k^N(x(k)) = min_{u(k)} [ x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k) + E{ x^T(k+1) S(k+1) x(k+1) + w(k+1) } ]
            = min_{u(k)} [ x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k) + E{ (Φx(k) + Γu(k) + e_x(k))^T S(k+1) (Φx(k) + Γu(k) + e_x(k)) + w(k+1) } ]
In the last expression two terms have vanished because e_x(k) is uncorrelated with x(k) and u(k). The expectation of e_x(k) is zero and the covariance matrix is E{e_x(k) e_x^T(k)} = R_ex. Using the expression for the expectation of a quadratic form, we find that the minimum is obtained for

u*(k) = -L(k)x(k)

with L(k) = [Q_2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ, i.e. the same gain as in the deterministic case.
min_{u(0),...,u(N)} I_0^N = E{ J_0^N(x(0)) }
                          = E{ x^T(0) S(0) x(0) + w(0) }
                          = x_m^T(0) S(0) x_m(0) + tr[S(0) R_x(0)] + Σ_{k=0}^{N-1} tr[S(k+1) R_ex]
The first term of the minimal performance index corresponds to the deterministic case, while the last two terms are due to the noise affecting the states. The first two terms are constant, while the last term grows with N.

We can conclude that for the stochastic system with complete state information, the optimal controller is identical to the one derived for the deterministic case (with no state noise). The only difference is that the minimal performance index is larger in the stochastic case.
9.6 Summary of LQ Method for stochastic, discrete time systems with complete state information

For the linear, discrete time, stochastic system

x(k+1) = Φx(k) + Γu(k) + e_x(k)

and the performance index

I = E{ Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)] + x^T(N) Q_N x(N) }

where Q_1 and Q_N are positive semidefinite and Q_2 is positive definite, the optimal input sequence is determined by:

u(k) = -L(k)x(k)

with

L(k) = [Q_2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ
S(k) = Q_1 + L^T(k) Q_2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))

and

S(N) = Q_N

The minimal value of the performance index is

min I_0^N = x_m^T(0) S(0) x_m(0) + tr[S(0) R_x(0)] + Σ_{k=0}^{N-1} tr[S(k+1) R_ex]

where the first two terms are the same as in the deterministic case and the last term is due to the state noise.
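The summary translates directly into code. The sketch below (illustrative system and noise data) runs the backward recursion and checks the stated expression for min I_0^N against an exact computation of E{I} obtained by propagating the second moment M(k) = E{x(k)x^T(k)} under the feedback u(k) = -L(k)x(k):

```python
import numpy as np

# Illustrative data (not from the notes).
Phi = np.array([[1.0, 0.1], [0.0, 0.9]])
Gam = np.array([[0.0], [0.1]])
Q1 = np.diag([1.0, 0.0]); Q2 = np.array([[0.01]]); QN = Q1
Rex = np.diag([0.01, 0.02])
xm0 = np.array([[1.0], [0.0]]); Rx0 = 0.1 * np.eye(2)
N = 50

# Backward recursion for L(k) and S(k), k = N-1, ..., 0.
S, gains, Ss = QN, [], [QN]
for _ in range(N):
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + L.T @ Q2 @ L + (Phi - Gam @ L).T @ S @ (Phi - Gam @ L)
    gains.append(L); Ss.append(S)
gains.reverse(); Ss.reverse()    # gains[k] = L(k), Ss[k] = S(k)

# The expression from the summary.
I_formula = (xm0.T @ Ss[0] @ xm0).item() + np.trace(Ss[0] @ Rx0) \
            + sum(np.trace(Ss[k + 1] @ Rex) for k in range(N))

# Exact E{I} by propagating M(k) = E{x(k) x(k)^T} under u(k) = -L(k)x(k).
M = xm0 @ xm0.T + Rx0
I_exact = 0.0
for k in range(N):
    L = gains[k]
    I_exact += np.trace((Q1 + L.T @ Q2 @ L) @ M)
    Phic = Phi - Gam @ L
    M = Phic @ M @ Phic.T + Rex
I_exact += np.trace(QN @ M)

assert np.isclose(I_formula, I_exact, rtol=1e-8)
```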
Chapter 10

Stochastic LQ Control with Incomplete State Information

The performance index is again

I = E{ Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)] + x^T(N) Q_N x(N) }

This apparently complicated problem (to find the optimal controller for a system which has both state noise and output noise, implying that exact knowledge of the states is not available) can fortunately be solved relatively simply using the separation theorem, which we will state without proof.
10.3 Summary of LQ method for stochastic, discrete time systems with incomplete state information

For the linear, discrete time, stochastic system

x(k+1) = Φx(k) + Γu(k) + e_x(k)
y(k) = Hx(k) + e_y(k)

and the performance index

I = E{ Σ_{k=0}^{N-1} [x^T(k) Q_1 x(k) + u^T(k) Q_2 u(k)] + x^T(N) Q_N x(N) }

where Q_1 and Q_N are positive semidefinite and Q_2 is positive definite, the optimal input sequence is determined by:

u(k) = -L(k)x̂(k)

with

L(k) = [Q_2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ
S(k) = Q_1 + L^T(k) Q_2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))

and

S(N) = Q_N

The state estimate x̂(k) is delivered by the Kalman predictor

x̂(k+1) = Φx̂(k) + Γu(k) + K(k)[y(k) - Hx̂(k)]
K(k) = ΦP(k)H^T [R_ey + HP(k)H^T]^{-1}
where

P(k+1) = R_ex + K(k) R_ey K^T(k) + (Φ - K(k)H) P(k) (Φ - K(k)H)^T

and

P(0) = R_x(0)

The minimal performance index is

min I_0^N = x_m^T(0) S(0) x_m(0) + tr[S(0) R_x(0)] + Σ_{k=0}^{N-1} tr[S(k+1) R_ex] + Σ_{k=0}^{N-1} tr[L^T(k)(Q_2 + Γ^T S(k+1) Γ)L(k) P(k)]

where the first two terms are the same as in the deterministic case, the third is due to the state noise, and the last is due to the estimation error.
The structure of the controller is shown in the block diagram in figure 10.1.
References and Disturbances

If the reference is not zero and/or the disturbances e_x(k) and e_y(k) are not white sequences, you can, as in the deterministic case, model the reference and/or disturbance and include the states of this model in an extended state vector.
The system with extended state vector will then have a performance index in standard form, and you determine an optimal controller as

u(k) = -L(0)x̂(k)

where the extended state estimate is used:

x̂(k) = [x̂_s(k); x̂_r(k)]
Figure 10.1: Structure of the controlled system: the system (noise inputs e_x(k) and e_y(k)), the observer with gain K producing x̂(k), and the controller u(k) = -Lx̂(k).
Controller:
L = [Q_2 + Γ^T S Γ]^{-1} Γ^T S Φ
S = Q_1 + L^T Q_2 L + (Φ - ΓL)^T S (Φ - ΓL),  S(N) = Q_N

Observer:
K = ΦPH^T [R_ey + HPH^T]^{-1}
P = R_ex + K R_ey K^T + (Φ - KH) P (Φ - KH)^T,  P(0) = R_x(0)
Immediately you see the following duality:

Controller | Observer
Q_2        | R_ey
Γ          | H^T
S          | P
Q_1        | R_ex
L^T        | K
Q_N        | R_x(0)
This remarkable connection makes it possible, in principle, to use the same software for computation of the optimal controller and the optimal observer.

You also see that the recursive iteration moves backward in time for the controller part and forward for the observer. In the calculations this is of no importance.
Kalman predictor:

x̂(k+1) = Φx̂(k) + Γu(k) + K(k)[y(k) - ŷ(k)]
ŷ(k) = Hx̂(k)

Innovation model:

x̂(k+1) = Φx̂(k) + Γu(k) + Kε(k)
y(k) = ŷ(k) + ε(k)
ŷ(k) = Hx̂(k)
The essential difference between the Kalman predictor and the Innovation model is that in the Kalman predictor K(k) must be calculated using the often well known Φ, Γ and H together with the most often poorly known covariance matrices R_ex, R_ey and R_x(0). On the other hand, K in the Innovation model is directly parameterized and can be determined using system identification together with the Φ, Γ and H matrices.
Chapter 11

Deterministic Observer Design
For stochastically modelled systems we have just described how an optimal observer can be introduced, either in the form of the Kalman predictor or in the form of the Innovation model in connection with system identification.

For deterministically modelled systems one cannot speak of an optimal observer; here we must choose the K matrix ourselves, such that the observer obtains suitable dynamics relative to the dynamics of the system.

We will now describe how we can design different kinds of observers. The simplest is the open loop observer:

x̂(k+1) = Φx̂(k) + Γu(k)
ŷ(k) = Hx̂(k)

where x̂(k) is an estimate of x(k). Introducing the observer error x̃(k) = x(k) - x̂(k) gives:

x̃(k+1) = Φx̃(k)
Closed loop observer:

x̂(k+1) = Φx̂(k) + Γu(k) + K[y(k) - ŷ(k)]
        = Φx̂(k) + Γu(k) + K[y(k) - Hx̂(k)]

x̃(k+1) = (Φ - KH)x̃(k)

This is a homogeneous equation, and the dynamics of the observer error are determined by (Φ - KH). If this matrix has stable eigenvalues, x̃(k) will converge to the zero vector independently of the initial error x̃(0).
In other words, x̂(k) will converge to x(k) independently of the initial estimate x̂(0), and this can happen faster than the normal (open loop) motion of x(k). If Φ and Γ as well as H are not known exactly, or if they vary slightly with time, the observer error will of course not be given exactly by x̃(k+1) = (Φ - KH)x̃(k), but with a sensible choice of the K matrix the observer will still be stable and the observer error acceptably small.
The observer is adjusted according to the difference y(k) - ŷ(k) = y(k) - Hx̂(k), and a measurement error or noise will be wrongly interpreted by the observer as a poor estimate x̂(k), causing x̂(k) to be quickly adjusted to a new (erroneous) value.

The choice of observer poles thus becomes a compromise between observer speed and noise sensitivity.
Normally, however, one places the desired observer poles λ_1, λ_2, ..., λ_n somewhat closer to the origin than the system poles.
2. The feedback matrix K is now determined such that the eigenvalues of the matrix Φ - KH become equal to the desired observer poles, i.e.:

det(zI - Φ + KH) = z^n + α_1 z^{n-1} + α_2 z^{n-2} + ... + α_n = α(z)

Determining K is of course conditional on the system being observable, i.e. that the observability matrix O, where

O = [H; HΦ; HΦ^2; ...; HΦ^{n-1}]

is non-singular. K is then given by Ackermann's formula:

K = α(Φ) [H; HΦ; HΦ^2; ...; HΦ^{n-1}]^{-1} [0; 0; 0; ...; 1]
Current observer
The method separates the prediction and the correction. If at time k we have an estimate x̂(k), we predict the next state by

x̄(k+1) = Φx̂(k) + Γu(k)

At time k+1 the output y(k+1) is measured, after which x̄(k+1) is corrected:

x̂(k+1) = x̄(k+1) + K[y(k+1) - Hx̄(k+1)]

x̂(k) is then called a current observer for x.

In practice this observer can of course not be implemented exactly, since it is impossible to measure, perform calculations and actuate without some delay. However, this delay between input and output can be minimized by using the last part of each sampling period to carry out in advance the part of the calculations that is based on "old" values.
Computing again the observer error x̃(k) = x(k) - x̂(k), elimination of x̄(k) gives:

x̃(k+1) = [Φ - KHΦ]x̃(k)

The feedback matrix K is therefore found exactly as described for the prediction observer, H simply being replaced by HΦ, i.e. K is determined by

K = α(Φ) [HΦ; HΦ^2; HΦ^3; ...; HΦ^n]^{-1} [0; 0; 0; ...; 1]
If the measurements are not particularly noisy, one may instead settle for estimating only the non-measurable state elements (reduced order observer).

In this case the state vector x(k) is divided into two parts, where x_a(k) is the part containing the measurable state elements and x_b(k) contains the non-measurable state elements. The latter must therefore be estimated as x̂_b(k).

The state description of the system is thus rewritten in the form:

[x_a(k+1); x_b(k+1)] = [Φ_aa, Φ_ab; Φ_ba, Φ_bb] [x_a(k); x_b(k)] + [Γ_a; Γ_b] u(k)
By comparison with the prediction observer

x̂(k+1) = Φx̂(k) + Γu(k) + K[y(k) - Hx̂(k)]

it is seen that the reduced order observer is found by introducing the substitutions x(k) → x_b(k) and Φ → Φ_bb (and correspondingly for the input and measurement terms):

x̂_b(k+1) = Φ_bb x̂_b(k) + Φ_ba x_a(k) + Γ_b u(k) + K[x_a(k+1) - Φ_aa x_a(k) - Γ_a u(k) - Φ_ab x̂_b(k)]
Forming as before the observer error x̃_b(k) = x_b(k) - x̂_b(k) gives:

x̃_b(k+1) = [Φ_bb - KΦ_ab] x̃_b(k)
The feedback matrix K is therefore found, as described earlier, by

K = α(Φ_bb) [Φ_ab; Φ_ab Φ_bb; Φ_ab Φ_bb^2; ...; Φ_ab Φ_bb^{n_b - 1}]^{-1} [0; 0; 0; ...; 1]
Chapter 12

Stability of Multivariable Systems

System:    ẋ(t) = Ax(t) + Bu(t)
           y(t) = Cx(t)
Regulator: u(t) = -Lx(t), i.e. u(s) = -Lx(s)

Opening the loop at the input side (I) gives the return difference matrix

R_I(s) = I_r + LF(s)B
Opening the loop at the output side (II) gives us the loop gain T(s) = F(s)BL, and the return difference matrix R_II(s) = I_n + F(s)BL is of dimension n, equal to the number of state variables.
(Block diagram: the loop opened at the output side II, with u(s) → F(s)B → x(s) → C → y(s).)
det(R(s)) = det(sI - A + BL) / det(sI - A) = P_c(s) / P_o(s)

that is, the ratio between the characteristic polynomial for the closed loop and the characteristic polynomial for the open loop.
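This determinant relation follows from the identity det(I + MN) = det(I + NM) and can be checked numerically at an arbitrary complex point s (random illustrative matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, r))
L = rng.standard_normal((r, n))

s = 0.7 + 1.3j
F = np.linalg.inv(s * np.eye(n) - A)           # F(s) = (sI - A)^{-1}

lhs = np.linalg.det(np.eye(r) + L @ F @ B)     # det(R_I(s))
rhs = np.linalg.det(s * np.eye(n) - A + B @ L) / np.linalg.det(s * np.eye(n) - A)
assert np.isclose(lhs, rhs)
```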
According to the Nyquist criterion, the closed loop system is stable if the map of det(R(s)) encircles the origin one time counterclockwise for each open loop pole in the right half plane when s goes one time round the Nyquist path, which encircles all the unstable poles.

In figure 12.2 this is illustrated for an open loop stable system, for which the map must not encircle the origin.
Figure 12.2: Plot of det(R(jω)) in the complex plane for an open loop stable system; the critical point is the origin (0, 0).
Since the determinant of a matrix is the product of the eigenvalues of the matrix, which in this case are functions l_i(jω),

det(R(jω)) = Π_{i=1}^{r} l_i(jω)

the closed loop stability for an open loop stable system can also be formulated such that none of the eigenvalue functions l_i(jω) of the return difference matrix R(jω) may encircle the origin of the complex plane when s traverses the Nyquist path.
This may also be illustrated for a stable system:

Figure 12.3: Plots of the eigenvalue functions l_i(jω); none of them may encircle the critical point (0, 0).
The eigenvalue functions can be written

l_i(s) = 1 + k_i(s)

where k_i(s) are the eigenvalues of the loop transfer matrix T(s) = LF(s)B.
Figure 12.4: Closed loop stability using plots of the eigenvalues k_i(jω) of the open loop transfer matrix; the critical point is (-1, 0).
For the continuous time system

ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t)

with performance index

I = ∫_0^∞ (y^T(t) Q_1y y(t) + u^T(t) Q_2 u(t)) dt

the stationary LQ-controller is

u(t) = -Lx(t)
L = Q_2^{-1} B^T S, where
0 = C^T Q_1y C + A^T S + SA - S B Q_2^{-1} B^T S
We use the stationary LQ-controller. With the abbreviation F(s) = (sI_n - A)^{-1} we have:

Open loop input-output matrix: W(s) = CF(s)B
Loop gain matrix:              T(s) = LF(s)B
Return difference matrix:      R(s) = I_r + T(s) = I_r + LF(s)B
From the stationary Riccati equation,

C^T Q_1y C = -SA - A^T S + L^T Q_2 L

On the right hand side we add and subtract the Laplace operator s multiplied by S:

C^T Q_1y C = S(sI - A) + (-sI - A^T)S + L^T Q_2 L

Multiplying with B^T F^T(-s) from the left and F(s)B from the right, and using B^T S = Q_2 L, gives

W^T(-s) Q_1y W(s) = B^T F^T(-s) L^T Q_2 + Q_2 L F(s)B + B^T F^T(-s) L^T Q_2 L F(s)B
Q_2 is added on both sides of the equation:

W^T(-s) Q_1y W(s) + Q_2 = Q_2 (I_r + LF(s)B) + B^T F^T(-s) L^T Q_2 (I_r + LF(s)B)
                        = (I_r + B^T F^T(-s) L^T) Q_2 (I_r + LF(s)B)
                        = R^T(-s) Q_2 R(s)
We have now achieved the following important relation between the return difference matrix R(s) and the open loop gain matrix W(s) for LQ-controlled systems:

R^T(-s) Q_2 R(s) = Q_2 + W^T(-s) Q_1y W(s)        (12.1)
This relation is rewritten a bit further by introducing the eigenvalue functions l_i(s) and the matching eigenvector functions v_i(s) for the return difference matrix R(s), defined by

R(s) v_i(s) = l_i(s) v_i(s)
v_i^T(s) R^T(s) = l_i(s) v_i^T(s)

Now multiply the previous relation with v_i^T(s) from the left and with v_i(s) from the right.
The last term on the right side is a positive number because Q_1y is positive definite. Since also Q_2 is positive definite, we see that

|l_i(jω)|^2 ≥ 1, or |l_i(jω)| ≥ 1

This shows that all the eigenvalue functions l_i(s) of R(s) are numerically larger than one, which implies that

|det(R(s))| = |Π_{i=1}^{r} l_i(s)| > 1
If we use this result in Nyquist considerations, it simply implies that the plot of det(R(jω)) in the complex plane, for ω running from 0 to ∞, will be strictly outside a circle with centre in (0, 0) and with radius 1.
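The circle condition can be verified numerically for a concrete LQ design. The sketch below (illustrative system and weights) solves the stationary Riccati equation with SciPy and samples |det(R(jω))| over a frequency grid:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative system (assumed numbers).
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q1y = np.array([[1.0]])
Q2 = np.array([[0.1]])

# Stationary LQ-controller: L = Q2^{-1} B^T S.
S = solve_continuous_are(A, B, C.T @ Q1y @ C, Q2)
L = np.linalg.solve(Q2, B.T @ S)

# |det(R(jw))| = |det(I + L F(jw) B)| must stay outside the unit circle.
for w in np.logspace(-2, 3, 200):
    F = np.linalg.inv(1j * w * np.eye(2) - A)
    detR = np.linalg.det(np.eye(1) + L @ F @ B)
    assert abs(detR) >= 1.0 - 1e-6
```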
(Figure: the plot of det(R(jω)) for ω from 0 to ∞ stays outside the unit circle centred at the origin; the ±60° marks indicate the resulting phase margin.)
With the use of continuous time LQ-control, nice stability margins are thus ensured. Unfortunately these results do not extend to discrete time systems, for which Åström and Wittenmark have shown that the gain margin is finite.
When the controller is used with an observer, the stability margins can not be guaranteed, as has been shown by Doyle and Stein, although the same authors have also given a recipe (LTR: Loop Transfer Recovery) for recovering the properties.
Bibliography

[AM89] Digital Control of
[ÅW84] K.J. Åström and B. Wittenmark. Computer Controlled Systems: Theory and Design. Prentice-Hall Information and System Sciences Series. Prentice-Hall Inc., Englewood Cliffs, NJ, USA, 1984.