
PART I: Approximation of static systems


Matrix factorizations
Matrix decompositions are fundamental tools in linear algebra. The most important ones are the big six matrix factorizations. Given A ∈ R^{n×m}:

1. EVD (n = m):                   A = XΛX^{−1},   det X ≠ 0, Λ diagonal.
2. SVD:                           A = UΣV^*,      U, V orthogonal, Σ diagonal.
3. Schur decomposition (n = m):   A = UTU^*,      U orthogonal, T upper triangular.
4. QR factorization:              A = QR,         Q orthogonal, R upper triangular.
5. LU factorization (n = m):      A = LU,         L lower triangular, U upper triangular.
6. Cholesky factorization (n = m): A = LL^*,      L lower (upper) triangular.

The eigenvalue decomposition (EVD). Given a square matrix A ∈ R^{n×n}, its EVD is given by:

    A = XΛX^{−1}  ⟺  AX = XΛ,

where det X ≠ 0 and Λ = diag(λ_1, ..., λ_n), λ_i ∈ C, is the diagonal matrix of eigenvalues. The columns of X = [x_1 ··· x_n] are the eigenvectors corresponding to the λ_i:

    Ax_i = λ_i x_i,  i = 1, ..., n.

Issues concerning the above decomposition:
It is not always possible to diagonalize a matrix in this form. For instance the Jordan block

    A = [0 1; 0 0]

is not diagonalizable because its eigenvectors do not form a basis for R^2.

The eigenvalues may be complex. For instance, the eigenvalues of

    A = [0 −1; 1 0],

which are the roots of the characteristic polynomial, are purely imaginary:

    det(λI − A) = 0  ⟺  λ^2 + 1 = 0  ⟹  λ_{1,2} = ±j ∈ C.

The EVD of symmetric matrices A, i.e. A = A^T, or Hermitian matrices A ∈ C^{n×n}, A = A^*, has two important properties.
(i) The eigenvalues are real.
(ii) Eigenvectors corresponding to distinct eigenvalues are orthogonal.
Consequence: there always exists an orthonormal basis of eigenvectors for R^n, i.e. X can be chosen orthogonal, XX^T = I_n (or unitary: XX^* = I_n). Hence

    A = XΛX^T  or  A = XΛX^*.

The singular value decomposition (SVD). Given a matrix A ∈ C^{n×m}, n ≤ m, there exist unitary matrices U ∈ C^{n×n}, UU^* = I_n, and V ∈ C^{m×m}, VV^* = I_m, such that

    A = UΣV^*,                                                      (1)

where Σ is an n × m matrix with Σ_{ii} = σ_i, σ_1 ≥ σ_2 ≥ ··· ≥ σ_n ≥ 0, i = 1, ..., n, and zero elsewhere.
This is the singular value decomposition (SVD) of the matrix A; the σ_i are the singular values of A, while the columns of U and V,

    U = (u_1 u_2 ··· u_n),  V = (v_1 v_2 ··· v_m),

are called the left and right singular vectors of A, respectively. Since

    AA^* = U diag(σ_1^2, ..., σ_n^2) U^*  and  A^*A = V diag(σ_1^2, ..., σ_n^2, 0, ..., 0) V^*,

these singular vectors are the eigenvectors of AA^* and A^*A, respectively. From (1) it follows that:

    Av_i = σ_i u_i,  i = 1, ..., n.

Example 1. Consider the matrix

    A = [1 3; 3 1].

It readily follows that the eigenvalue decompositions of the matrices

    AA^* = [10 6; 6 10]  and  A^*A = [10 6; 6 10]

are:

    AA^* = UΣ^2 U^*  and  A^*A = VΣ^2 V^*,

where

    U = [u_1, u_2] = (1/√2) [1 −1; 1 1],   Σ = [σ_1 0; 0 σ_2] = [4 0; 0 2],   σ_1 = 4, σ_2 = 2,

    V = [v_1, v_2] = (1/√2) [1 1; 1 −1].

Remarks:
σ_i^2 are the eigenvalues of AA^* and u_i are the corresponding eigenvectors, i = 1, 2.
σ_i^2 are also the eigenvalues of A^*A and v_i are the corresponding eigenvectors, i = 1, 2.
A = σ_1 u_1 v_1^* + σ_2 u_2 v_2^*. Furthermore, A maps v_1 ↦ σ_1 u_1 and v_2 ↦ σ_2 u_2 (see figure 1).
This shows that the SVD maps the unit circle into an ellipsoid, where Av_1 = σ_1 u_1 and Av_2 = σ_2 u_2 give the major and minor axes of the ellipsoid, respectively. The maximum amplification factor is given by σ_1, the largest singular value. (In MATLAB use the command eigshow.)

Figure: Quantities describing the singular value decomposition in R^2
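As a quick numerical sanity check of Example 1, the following minimal sketch (any recent MATLAB will do) compares svd with the dyadic decomposition:

    % Sketch: verify the SVD and the dyadic decomposition of Example 1.
    A = [1 3; 3 1];
    [U,S,V] = svd(A);            % A = U*S*V'
    sigma = diag(S)              % singular values: 4 and 2
    % rank-one (dyadic) reconstruction: A = s1*u1*v1' + s2*u2*v2'
    A1 = S(1,1)*U(:,1)*V(:,1)' + S(2,2)*U(:,2)*V(:,2)';
    norm(A - A1)                 % zero up to machine precision

The image of the unit circle under A is an ellipse with semi-axes 4 and 2, as stated above.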

Example 2. Let

    A = [1; 1; 0].

Clearly A does not have an EVD, since it is not square, but it has an SVD A = UΣV^T, with one valid choice being:

    U = [u_1, u_2, u_3] = [1/√2 1/√2 0; 1/√2 −1/√2 0; 0 0 1],   Σ = [√2; 0; 0],   V = [v_1] = 1.

Example 3. Consider A = [3 4; 0 0]. Again, A is not invertible, so an EVD is not possible.
Let us proceed with its SVD. Since

    AA^* = [25 0; 0 0]  and  A^*A = [9 12; 12 16],

we have AA^* = UΣ^2 U^* and A^*A = VΣ^2 V^*, where

    U = [u_1, u_2] = [1 0; 0 1],   Σ = [σ_1 0; 0 σ_2] = [5 0; 0 0],   σ_1 = 5, σ_2 = 0,   V = [v_1, v_2] = (1/5) [3 −4; 4 3].

Notice how A maps v_1 ↦ 5u_1 and v_2 ↦ 0, so that the ellipsoid in figure 1 is reduced to the interval [−5, 5] on the x-axis. This is because σ_2 = 0.

Properties of the SVD

Assume that in (1) σ_r > 0 while σ_{r+1} = 0; the matrices U, Σ, V are partitioned compatibly in two blocks, the first having r columns:

    U = [U_1  U_2],   Σ = [Σ_1 0; 0 Σ_2],   V = [V_1  V_2],

where Σ_1 = diag(σ_1, ..., σ_r) > 0, Σ_2 = 0 ∈ R^{(n−r)×(m−r)}, U_1, U_2 have r, n − r columns, and V_1, V_2 have r, m − r columns, respectively.
Given (1) and this partition, the following statements hold.

rank A = r.

    im A   = span col [u_1, ..., u_r],      ker A   = span col [v_{r+1}, ..., v_m],
    im A^T = span col [v_1, ..., v_r],      ker A^T = span col [u_{r+1}, ..., u_n].

Dyadic decomposition. Decomposition as a sum of r outer products of rank one:

    A = σ_1 u_1 v_1^* + σ_2 u_2 v_2^* + ··· + σ_r u_r v_r^*.

The largest singular value of a matrix A is equal to its induced 2-norm: σ_1 = ‖A‖_2, where the induced 2-norm of A is defined as:

    ‖A‖_2 = sup_{x≠0} ‖Ax‖_2 / ‖x‖_2.

Further properties. The short form of the SVD is A = U_1 Σ_1 V_1^*, where U_1 ∈ R^{n×r}, Σ_1 ∈ R^{r×r}, and V_1 ∈ R^{m×r}, where r is the rank of A.

Moore-Penrose pseudoinverse: A^# = V_1 Σ_1^{−1} U_1^*.

Connection with LS (Least Squares).
Problem: find x such that ‖Ax − b‖_2 is minimized. Solution:

    x_LS = A^# b = Σ_{i=1}^{r} (u_i^* b / σ_i) v_i.

Uniqueness: the outer products σ_i u_i v_i^* are unique, and thus, given a pair of left, right singular vectors (u_i, v_i), i = 1, ..., r, the only other option for this pair is (−u_i, −v_i). On the other hand, the columns of U_2 are arbitrary, subject to the constraint that they be linearly independent, normalized, and orthogonal to the columns of U_1. Similarly the columns of V_2 are arbitrary, subject to linear independence, normalization, and orthogonality with the columns of V_1. Thus U_2, V_2 are not necessary for the computation of the SVD of A.
In MATLAB the command svd(A) computes the full SVD of A, while the command svds(A,k) computes a short SVD containing k terms, that is, the first k singular values and singular vectors. The use of the short SVD is recommended for min(n, m) ≫ 1.
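The least-squares formula above is easy to check numerically; the following sketch (with illustrative random data) compares the SVD-based pseudoinverse solution with MATLAB's backslash and pinv:

    % Sketch: least-squares solution via the SVD-based pseudoinverse.
    A = randn(8,3); b = randn(8,1);      % illustrative data
    [U1,S1,V1] = svd(A,'econ');          % short (economy) SVD: A = U1*S1*V1'
    x_ls = V1*(S1\(U1'*b));              % x_LS = A# b = sum_i (ui'*b/si) vi
    norm(x_ls - A\b)                     % agrees with backslash (full-rank case)
    norm(x_ls - pinv(A)*b)               % and with the built-in pseudoinverse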

Example illustrating LS and TLS

>> % DATA
>> x=[0 1 2 4 5]'; y=[1 1 3 3 2]'; A=[x ones(5,1)];
>> % LS solution
>> sol=inv(A'*A)*A'*y
sol =
   2.9070e-01
   1.3023e+00
>> % TLS solution
>> [u,s,v]=svd([A y])
u =
   9.4604e-02  -5.0562e-01  -5.5333e-01   3.1206e-02   6.5441e-01
   1.9007e-01  -2.2358e-01  -6.0493e-01   1.7219e-01  -7.1993e-01
   4.1879e-01  -6.0290e-01   3.9970e-01  -5.2409e-01  -1.6340e-01
   6.0973e-01  -3.8812e-02   2.9650e-01   7.2748e-01   9.7884e-02
   6.3857e-01   5.7391e-01  -2.8321e-01  -4.0679e-01   1.3104e-01
s =
   8.3520e+00            0            0
            0   2.1344e+00            0
            0            0   8.3014e-01
            0            0            0
            0            0            0
v =
   7.9735e-01   6.0200e-01  -4.2835e-02
   2.3369e-01  -3.7340e-01  -8.9775e-01
   5.5644e-01  -7.0581e-01   4.3841e-01
>> v(:,3)/v(3,3)
ans =
  -9.7705e-02
  -2.0477e+00
   1.0000e+00
>> ezplot('.29*t+1.30',[0,5]); hold; ezplot('.097*t+2.04',[0,5]); plot(x,y,'*');
>> errors = [norm(sol) s(3,3)]
errors =
   1.3344e+00   8.3014e-01


Optimal approximation in the 2-norm

The problem of approximating a matrix by one of lower rank is as follows.
Problem. Given A ∈ C^{n×m}, rank A = r ≤ n ≤ m, find X ∈ C^{n×m}, rank X = k < r, such that the 2-norm of the error matrix E = A − X is minimized.

Remark.
Given A of rank r, for all X of rank less than or equal to k, there holds

    ‖A − X‖_2 ≥ σ_{k+1}(A).

In other words, take any matrix X of rank k; the approximation error can never be smaller than σ_{k+1}(A). Finding the best rank k approximant of a matrix is a non-convex (DIFFICULT) optimization problem, but surprisingly, the SVD provides an explicit solution!

Solution. Schmidt, Eckart, Young, Mirsky. With the notation introduced above,

    min_{X, rank X = k} ‖A − X‖_2 = σ_{k+1}(A),

provided that σ_k > σ_{k+1}. A (non-unique) minimizer X is obtained by truncating the dyadic decomposition above to contain the first k terms:

    X = σ_1 u_1 v_1^* + σ_2 u_2 v_2^* + ··· + σ_k u_k v_k^*.            (2)

Using the approximation (2), A ∈ C^{n×m} of rank r ≤ n is approximated by a matrix of lower rank k < r, by eliminating the r − k smallest singular values σ_{k+1}, ..., σ_r:

    Â = U_k Σ_k V_k^*,

where:

    U_k = [u_1, ..., u_k] ∈ C^{n×k},   V_k = [v_1, ..., v_k] ∈ C^{m×k},   Σ_k = diag(σ_1, ..., σ_k) ∈ R^{k×k}.

The storage required is thus reduced from nm to n·k + k + k·m = k(n + m + 1).
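The following sketch (with an illustrative random matrix) demonstrates the Schmidt-Eckart-Young-Mirsky result numerically: the 2-norm error of the truncated SVD is exactly σ_{k+1}.

    % Sketch: best rank-k approximation via the truncated SVD.
    A = randn(50,40);                    % illustrative matrix
    k = 5;
    [U,S,V] = svd(A);
    Ak = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';  % truncated dyadic decomposition (2)
    norm(A - Ak, 2) - S(k+1,k+1)         % zero: the error equals sigma_{k+1}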

Example. Application of the theory to the approximation of static systems, in particular image approximation. Any greyscale image is stored as a matrix, whose entries are the levels of grey corresponding to each pixel. Figure 2 shows a 250 × 250 image of the earth together with its lower rank approximants. Notice how a rank 50 approximation is indistinguishable from the original rank 250 image.

Figure: Earth image approximation by images of lower rank (rank 10, rank 50, and the exact image).

Compression of the rank 50 approximant: 62500 → 25050, a ratio of about 2.5 : 1.

A lower rank k image approximation is obtained by retaining the k most significant singular values, as given by (2) and shown in figure 3.

Figure: Left pane: singular values, original and rank 50 approximation. Right pane: normalized singular values (log scale), giving the relative approximation error versus k (complexity).

Furthermore, the singular values provide the trade-off between accuracy and complexity.
This is shown in the normalized singular value plot in figure 3, where specifying a desired relative error on the y-axis gives the required complexity (rank) of the approximation on the x-axis.

PART II: Linear dynamical systems in state-space form


Outline

PART I: Approximation of static systems
    Matrix factorizations
    The Singular Value Decomposition
    Properties of the SVD
    Optimal approximation in the 2-induced norm

PART II: Linear dynamical systems in state-space form
    Dynamical systems and the state
    Solving for the response of a dynamical system
    Linear dynamical systems: two descriptions
    Time domain solution and the matrix exponential
    Stability

PART III: Structural Properties of linear systems
    Controllability
    System controllability: Summary
    Infinite controllability gramian
    Observability

Definition. A dynamical system is a device which has memory, that is, its current response depends both on the current input (excitation) as well as on past inputs (past behavior). In the sequel we will consider dynamical systems described by (ordinary or partial) differential equations.
Definition. The state is the least amount of information required at time t = t_0 so that, together with the excitation for t > t_0, one can compute the future behavior of the system.
Example. An RLC circuit is a dynamical system:

u is the input or excitation (voltage) applied. We choose y, the current through the circuit, as the observation or output. Since the system needs two initial conditions, it can be described using two state variables. The first state variable x_1 is the current through the inductor; the second state variable x_2 is the voltage across the capacitor. From Kirchhoff's Voltage Law (KVL) we obtain u = Rx_1 + v_L + x_2, which together with v_L = Lẋ_1 yields u = Rx_1 + Lẋ_1 + x_2. In addition, from Kirchhoff's Current Law (KCL) it follows that x_1 = Cẋ_2. Finally, y = x_1. We thus have:

    ẋ_1 = −(R/L) x_1 − (1/L) x_2 + (1/L) u,
    ẋ_2 = (1/C) x_1,
    y   = x_1.

By defining the state of the system x = [x_1; x_2], we can write the equations compactly in state-space form:

    [ẋ_1; ẋ_2] = [−R/L  −1/L; 1/C  0] [x_1; x_2] + [1/L; 0] u,
        ẋ                A                 x           B

    y = [1  0] [x_1; x_2] + [0] u.
          C                  D

In general, the state equations are a set of coupled first order differential equations of the form

    ẋ(t) = Ax(t) + Bu(t),

while the output equations are algebraic relationships (no derivatives) relating the observations y with the state x and the input u:

    y(t) = Cx(t) + Du(t).

The short-hand notation is

    (A, B, C, D) ∈ R^{n×n} × R^{n×m} × R^{p×n} × R^{p×m}.
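This model is easy to explore numerically; a minimal sketch, assuming the Control System Toolbox, with purely illustrative component values (not from the slides):

    % Sketch: the RLC state-space model in MATLAB; R, L, C are placeholders.
    R = 1; L = 0.5; C = 0.1;
    A = [-R/L -1/L; 1/C 0]; B = [1/L; 0]; Cmat = [1 0]; D = 0;
    sys = ss(A, B, Cmat, D);   % I/S/O description
    step(sys);                 % response y (circuit current) to a unit step in u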

The first set of equations describes the dynamics of the system while the second one describes the observation. Notice that one can choose different observations. For example, if the voltage across the capacitor is chosen:

    y_2 = x_2  ⟹  y_2 = [0  1] x + [0] u,
                          C_2        D_2

or if the voltage across the inductor is chosen:

    y_3 = v_L = Lẋ_1 = −Rx_1 − x_2 + u  ⟹  y_3 = [−R  −1] x + [1] u.
                                                     C_3        D_3

From the above we get a second order differential equation

    ÿ + (R/L) ẏ + (1/LC) y = (1/L) u̇.

In order to solve this, the initial conditions y(t_0) = x_1(t_0) and ẏ(t_0) = ẋ_1(t_0) are needed.
Example. The equation of motion of a pointwise particle is mẍ(t) = f(t). This is not a state representation because the equation is not first-order. By defining the vector of states p_1 = x, p_2 = ẋ, the resulting state equations are ṗ_1(t) = p_2(t), ṗ_2(t) = (1/m) f(t).

Example: Orbiting satellite. Using Newton's inverse square law, the equations describing the motion of a pointwise satellite orbiting earth are (in the polar coordinates r, θ, φ):

    r̈ = r θ̇^2 cos^2 φ + r φ̇^2 − k/r^2 + u_r/m,
    θ̈ = −2 ṙ θ̇/r + 2 θ̇ φ̇ sin φ / cos φ + u_θ/(m r cos φ),
    φ̈ = −θ̇^2 cos φ sin φ − 2 ṙ φ̇/r + u_φ/(m r).

This is a dynamical system of dimension 6. If we define the state x as below, the state equations become:

    x := [r; ṙ; θ; θ̇; φ; φ̇],

    ẋ = f(x, u) = [ x_2;
                    x_1 x_4^2 cos^2(x_5) + x_1 x_6^2 − k/x_1^2 + u_r/m;
                    x_4;
                    −2 x_2 x_4/x_1 + 2 x_4 x_6 sin(x_5)/cos(x_5) + u_θ/(m x_1 cos(x_5));
                    x_6;
                    −x_4^2 cos(x_5) sin(x_5) − 2 x_2 x_6/x_1 + u_φ/(m x_1) ].

If we choose to observe the radial distance of the satellite from the earth, i.e. y = x_1, we obtain a 3-input (u_r, u_θ, u_φ), 6-state, 1-output (nonlinear) dynamical system in state-space form.

Example.

For the above circuit we have the following equations:

    x_1 = Lẋ_2,   u = Ry + x_1,   u = RCẋ_1 + Rx_2 + x_1,   y = Cẋ_1 + x_2.

The differential equations are

    ẋ_1 = −1/(RC) x_1 − (1/C) x_2 + 1/(RC) u,
    ẋ_2 = (1/L) x_1,
    y   = −(1/R) x_1 + (1/R) u,

and the A, B, C, D matrices are:

    A = [−1/(RC)  −1/C; 1/L  0],   B = [1/(RC); 0],   C = [−1/R  0],   D = 1/R.

A mechanical example

The equations are as follows (with inputs u_1, u_2, u_3 applied to the three masses):

    m_1 q̈_1 + d_1 q̇_1 + k_1 q_1 = k_12 (q_2 − q_1) + u_1,
    m_2 q̈_2                      = k_12 (q_1 − q_2) + k_23 (q_3 − q_2) + u_2,
    m_3 q̈_3 + d_3 q̇_3 + k_3 q_3 = k_23 (q_2 − q_3) + u_3.

The state variables are chosen as:

    x_1 = q_1, x_2 = q̇_1, x_3 = q_2, x_4 = q̇_2, x_5 = q_3, x_6 = q̇_3.

Thus

    Eẋ = Ax + Bu,

where x ∈ R^6, u = [u_1, u_2, u_3]^T ∈ R^3, E = diag [1, m_1, 1, m_2, 1, m_3] ∈ R^{6×6}, A ∈ R^{6×6}, B ∈ R^{6×3}:

    A = [ 0            1    0            0    0            0;
         −(k_1+k_12)  −d_1  k_12         0    0            0;
          0            0    0            1    0            0;
          k_12         0   −(k_12+k_23)  0    k_23         0;
          0            0    0            0    0            1;
          0            0    k_23         0   −(k_3+k_23)  −d_3 ],

    B = [0 0 0; 1 0 0; 0 0 0; 0 1 0; 0 0 0; 0 0 1].

As observations or outputs, one choice is to measure the positions of the 3 masses, namely y_1 = x_1, y_2 = x_3, y_3 = x_5; this implies y = Cx, where

    C = [1 0 0 0 0 0; 0 0 1 0 0 0; 0 0 0 0 1 0] ∈ R^{3×6}.
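A sketch of how this descriptor form Eẋ = Ax + Bu can be entered in MATLAB (Control System Toolbox assumed); all the numerical values below are placeholders, not from the slides:

    % Sketch: descriptor (E,A,B,C) model of the three-mass system.
    m1=1; m2=1; m3=1; d1=0.1; d3=0.1; k1=1; k3=1; k12=1; k23=1;
    E = diag([1 m1 1 m2 1 m3]);
    A = [ 0          1    0           0   0           0;
         -(k1+k12) -d1    k12         0   0           0;
          0          0    0           1   0           0;
          k12        0   -(k12+k23)   0   k23         0;
          0          0    0           0   0           1;
          0          0    k23         0  -(k3+k23)  -d3];
    B = [0 0 0; 1 0 0; 0 0 0; 0 1 0; 0 0 0; 0 0 1];
    C = [1 0 0 0 0 0; 0 0 1 0 0 0; 0 0 0 0 1 0];
    sys = dss(A, B, C, zeros(3), E);   % descriptor form: E*xdot = A*x + B*u
    % equivalently, since E is invertible: sys2 = ss(E\A, E\B, C, 0);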

Solution of state equations. Next we will determine the response of linear dynamical systems.
Simple example. Suppose we would like to solve the following differential equation

    ẏ(t) + y(t) = u(t),

given the initial condition y_0 = y(0⁻). We can use the (unilateral) Laplace transform, which transforms a differential equation into an algebraic equation:

    sY(s) − y_0 + Y(s) = U(s)  ⟹  Y(s) = y_0/(s+1) + U(s)/(s+1).

Taking the inverse Laplace transform, we get the time-domain solution

    y(t) = e^{−t} y_0 + L^{−1} [U(s)/(s+1)],  t ≥ 0,

where U(s) is the Laplace transform of the input u(t); the first term is the zero-input response y_zi(t) and the second is the zero-state response y_zs(t) of the system. For instance, if we are interested in the step response, i.e. u(t) = I(t), we have

    U(s) = 1/s  ⟹  y(t) = e^{−t} y_0 + (1 − e^{−t}) I(t).

We can now generalize this result. Applying the Laplace transform to the state equation we get

    L(ẋ(t)) = L(Ax(t) + Bu(t))
    sX(s) − x(0⁻) = AX(s) + BU(s)
    (sI − A) X(s) = x(0⁻) + BU(s)
    X(s) = (sI − A)^{−1} x(0⁻) + (sI − A)^{−1} B U(s),

where the first term is the zero-input component X_zi(s) and the second the zero-state component X_zs(s). Similarly,

    Y(s) = C(sI − A)^{−1} x(0⁻) + [C(sI − A)^{−1} B + D] U(s) = Y_zi(s) + Y_zs(s).

Example. Let A = [−1 1; 0 −2]. Then

    (sI − A)^{−1} = [s+1  −1; 0  s+2]^{−1} = [1/(s+1)  1/((s+1)(s+2)); 0  1/(s+2)].


Furthermore, suppose that the input u = 0 and the initial condition is x0 =

0
1

, so the

state-vector in the frequency domain is


X(s) = (sI A)1 x0 =

1
(s+1)(s+2)
1
s+2

e t e 2t

x(t) =

e 2t

I(t).

Example: Oscillator. Consider:

x 1
x 2

=
=
=

x2
x1 + u
x1

together with the initial conditions x1 (0 ) = x2 (0 ) = 0. The state-space matrices are







0 1
0
1 0 , D = 0 .
A =
, B =
, C =
1 0
1
To solve for the impulse response, we have u(t) = (t) U(s) = 1.

sI A =
Hence

s
1

1
s


(sI A)

X(s) = Xzs (s) = (sI A)

1
s 2 +1

BU(s) =

s
1

1
s 2 +1
s
s 2 +1

Y(s) = Yzs (s) = C (sI A)1 BU(s) =

1
s


=

1
s 2 +1

s
s 2 +1
s 21+1


x(t) =

sin(t)
cos(t)

1
s 2 +1
s
s 2 +1

!
.

I(t)

y(t) = sin(t) I(t).

If we eliminate x2 , we get a second order differential equation


y(t) + y = u. The characteristic
polynomial of this differential equation has roots j, which indicates an oscillatory behavior.
28/123

Recall the state space description of a linear dynamical system, characterized by the state equations (1) and output equations (2):

    ẋ(t) = Ax(t) + Bu(t),                                           (1)
    y(t) = Cx(t) + Du(t),                                           (2)

where

    (A, B, C, D) ∈ R^{n×n} × R^{n×m} × R^{p×n} × R^{p×m},           (3)

and

    x ∈ R^n is the state,   n = number of state variables,
    u ∈ R^m is the input,   m = number of input variables,
    y ∈ R^p is the output,  p = number of output variables.

This is also called the internal description of a system, or an I/S/O description.

Example. In the satellite case, u_1, u_2, u_3 (the tangential, radial and azimuth thrusts, respectively) are the 3 inputs (excitations) to the system, and there are n = 6 states. If we choose to observe r, the distance from the earth, and the radial velocity ṙ, then the number of outputs is p = 2.

Recall now the I/O description, characterized only by the input-output relationship:

    y(t) = (h ∗ u)(t),  t ∈ R,

where h(t) is the impulse response of the system.
Next, we will explore the relationship between the I/S/O and I/O descriptions via the power of the Laplace transform and a new linear algebra tool, the matrix exponential.
Frequency domain solution. The solution for the state and output equations of the system in the Laplace domain is:

    X(s) = (sI − A)^{−1} x(0⁻) + (sI − A)^{−1} B U(s)                        (4)
    Y(s) = C(sI − A)^{−1} x(0⁻) + (C(sI − A)^{−1} B + D) U(s)                (5)
         = Y_zi(s) + H(s) U(s).

Suppose the system has no initial conditions, i.e. x(0⁻) = 0. Then Y(s) = H(s)U(s), where the transfer function of the system is:

    H(s) = C(sI − A)^{−1} B + D.

It therefore follows that the Laplace transform of the impulse response is L(h(t)) = H(s). If m = p = 1, i.e. we are dealing with a single input - single output system, then H(s) = Y(s)/U(s) = p(s)/q(s) is a rational function. Furthermore it is proper rational, as deg p ≤ deg q.

Example. Consider the oscillator example:

    ẋ_1 = x_2
    ẋ_2 = −x_1 + u    ⟹    A = [0 1; −1 0],  B = [0; 1],  C = [1 0],  D = 0,
    y   = x_1

    H(s) = C(sI − A)^{−1} B = [1 0] [s −1; 1 s]^{−1} [0; 1] = 1/(s^2 + 1).

The impulse response (the response to δ(t)) is therefore L^{−1}[H(s)] = h(t) = sin(t), t ≥ 0.
In general, given a proper transfer function H(s) = p(s)/q(s), where

    p(s) = β_{n−1} s^{n−1} + ··· + β_1 s + β_0,   β_i ∈ R,
    q(s) = s^n + α_{n−1} s^{n−1} + ··· + α_1 s + α_0,   α_i ∈ R,

the relationship between input and output is given by the differential equation:

    y^{(n)}(t) + α_{n−1} y^{(n−1)}(t) + ··· + α_1 y^{(1)}(t) + α_0 y(t) = β_{n−1} u^{(n−1)}(t) + ··· + β_1 u^{(1)}(t) + β_0 u(t).

By formally replacing the complex variable s by the differential operator d/dt we obtain:

    q(s) Y(s) = p(s) U(s)  ⟹  q(d/dt) y(t) = p(d/dt) u(t).

Example. Given the system:

    ẋ_1 = x_2
    ẋ_2 = −2x_1 − 3x_2 + u    ⟹    A = [0 1; −2 −3],  B = [0; 1],  C = [−1 1],  D = 0,
    y   = −x_1 + x_2

its transfer function is:

    H(s) = C(sI − A)^{−1} B = [−1 1] [s −1; 2 s+3]^{−1} [0; 1] = (s − 1)/(s^2 + 3s + 2).

Hence p(s) = s − 1, q(s) = s^2 + 3s + 2, which implies that the I/O differential equation describing the system is:

    d^2/dt^2 y(t) + 3 d/dt y(t) + 2 y(t) = d/dt u(t) − u(t).

The Matrix Exponential. Recall the definition of the (scalar) exponential:

    e^x = 1 + x/1! + x^2/2! + ··· + x^k/k! + ···

Given a matrix A ∈ R^{n×n}, we define its exponential making use of the same formula:

    e^{At} = I + A t/1! + A^2 t^2/2! + ··· + A^k t^k/k! + ···

Important property: its derivative is a multiple of the function itself:

    d/dt (e^{At}) = A e^{At} = e^{At} A.

However, because matrices do not commute in general, e^{A+B} ≠ e^A e^B.

Example. Take A = [0 1; 0 0] and B = [0 0; 1 0]. Since A^2 = 0 and B^2 = 0, it follows that e^{At} = [1 t; 0 1] and e^{Bt} = [1 0; t 1]. Therefore

    e^{At} e^{Bt} = [1 + t^2   t; t   1].

Also A + B = [0 1; 1 0]. One can easily see that the eigenvalues of A + B are ±1, so we would expect the matrix exponential e^{(A+B)t} to be a function of e^t and e^{−t}, and thus e^{A+B} ≠ e^A e^B.

It can be shown that equality holds if and only if the exponents commute: AB = BA.

Computation of the matrix exponential.

Power Series Method. Use the definition: e^{At} = I + A t/1! + A^2 t^2/2! + ··· + A^k t^k/k! + ···

Using the Laplace Transform. L(e^{At}) = (sI − A)^{−1}. Hence

    e^{At} = L^{−1} [(sI − A)^{−1}].

This method is good for small matrices.

Eigenvalue Decomposition. Let A = VΛV^{−1}, where Λ = diag(λ_1, ..., λ_n). Then

    e^{At} = V e^{Λt} V^{−1},  where  e^{Λt} = diag(e^{λ_1 t}, ..., e^{λ_n t}).

The matrix V is made up of the right eigenvectors while V^{−1} is made up of the left eigenvectors:

    V = [v_1  v_2  ...  v_n],   V^{−1} = [w_1^T; w_2^T; ...; w_n^T].

The exponential can thus be expressed as:

    e^{At} = e^{λ_1 t} v_1 w_1^T + ··· + e^{λ_n t} v_n w_n^T.

This decomposition can be used if one needs to solve a problem like: find x_0 such that e^{At} x_0 behaves like e^{λ_k t}. The answer is to choose x_0 a multiple of v_k. In general, if x_0 is a linear combination of v_i, v_j, v_k, then e^{At} x_0 is a linear combination of e^{λ_i t} v_i, e^{λ_j t} v_j, e^{λ_k t} v_k.
This method is better for larger matrices.
Example. Take

    A = [0 1; 0 0]  ⟹  A^2 = [0 0; 0 0]  ⟹  e^{At} = I + At = [1 t; 0 1].

Let us check this result using the inverse Laplace transform:

    L^{−1} [(sI − A)^{−1}] = L^{−1} [ [s −1; 0 s]^{−1} ] = L^{−1} [ (1/s^2) [s 1; 0 s] ] = [1 t; 0 1].

Note: The MATLAB command for computing the matrix exponential e^A is expm(A) and NOT exp(A), which only computes the entrywise exponential of A.
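A quick numerical comparison of these computation methods (a minimal sketch):

    % Sketch: compare expm with the EVD-based formula for e^{At}.
    A = [0 1; -2 -3]; t = 0.7;                 % diagonalizable example
    [V,Lam] = eig(A);
    E1 = expm(A*t);                            % built-in matrix exponential
    E2 = V*diag(exp(diag(Lam)*t))/V;           % V * e^{Lambda t} * V^{-1}
    norm(E1 - E2)                              % zero up to round-off
    norm(E1 - exp(A*t))                        % NOT zero: exp is entrywise!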


Example. Let A =

0
2

1
3

L1 (sI A)1

L1

s
2

. We compute the matrix exponential using two methods.

1
s +3

1
s 2 + 3s + 2

1

s +3
2

1
s

!
= L

s+3
(s+1)(s+2)

1
(s+1)(s+2)

2
(s+1)(s+2)

s
(s+1)(s+2)

We now take the inverse Laplace transform of each entry using partial fraction:
1
]
(s + 1)(s + 2)
s
(2, 2) : L1 [
]
(s + 1)(s + 2)

(1, 2) : L1 [

2
]
(s + 1)(s + 2)
s +3
(1, 2) : L1 [
]
(s + 1)(s + 2)
(2, 1) : L1 [

1
1

] = (e t e 2t )I(t),
s +1
s +2

L1 [

d
(1, 2) = (e t + 2e 2t )I(t) + (e t e 2t )(t)
dt

(e t + 2e 2t )I(t) + 0 = (e t + 2e 2t )I(t),

(2e t + 2e 2t )I(t),

L1 [

(2e t e 2t )I(t).

s
3
+
]
(s + 1)(s + 2)
(s + 1)(s + 2)

36/123

So that
2e t e 2t

L1 [(sI A)1 ] =

2e t + 2e 2t

e t e 2t

e t + 2e 2t

I(t).

The eigenvalues of A are the roots of the characteristic polynomial of A:


A (s) = det(sI A) = (s + 1)(s + 2) = 0

The right and left eigenvectors of A are V =
the matrix exponential is indeed:
e At


=

1
1

1
2



e t
0

1
1

1
2

0
e 2t

1 = 1, 2 = 2.
V1 =

I(t)

2e t e 2t

e t e 2t

2e t + 2e 2t

e t + 2e 2t

2
1

2
1
1
1

1
1


, so that

I(t).

37/123

Time domain solution of state and output equations. Equation (4) gives the solution of the state equations in the frequency domain. The time domain solution is obtained by inverse transformation; as already mentioned, L^{−1}[(sI − A)^{−1}] is the matrix exponential. Therefore the time domain equivalent of (4) and (5) is:

    x(t) = e^{At} x_0 + ∫_0^t e^{A(t−τ)} B u(τ) dτ,  t ≥ 0,                       (6)
    y(t) = C e^{At} x_0 + ∫_0^t C e^{A(t−τ)} B u(τ) dτ + D u(t),  t ≥ 0.          (7)

The fractional form of the transfer function H(s) = p(s)/q(s) allows us to identify the poles (modes) of the system as the roots of the denominator polynomial q(s). These quantities are the same as the roots of the characteristic polynomial of A, χ_A(s) = det(sI − A) = 0, i.e. the eigenvalues of A. This holds because (sI − A)^{−1} = (1/det(sI − A)) adj(sI − A). The poles are at the heart of system behavior; in particular they determine system stability.
In the case of the oscillator, the characteristic polynomial of A is χ_A(s) = det(sI − A) = s^2 + 1 = q(s), and its roots are λ_{1,2} = ±j. These are the eigenvalues of A, the poles of the system and the roots of q(s). They indicate that the system has an oscillatory behavior.
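The time-domain formulas (6)-(7) can be cross-checked in simulation; a sketch for the oscillator, assuming the Control System Toolbox:

    % Sketch: the impulse response of the oscillator equals sin(t).
    A = [0 1; -1 0]; B = [0; 1]; C = [1 0]; D = 0;
    t = linspace(0, 10, 500)';
    h = impulse(ss(A,B,C,D), t);       % numerical impulse response
    norm(h - sin(t), inf)              % small: h(t) = sin(t), as derived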

Stability. If the input u = 0, the state is just x(t) = e^{At} x(0⁻). In this case stability means that x(t) → 0 for t → ∞, for all x(0⁻). Given the expression of the matrix exponential in terms of the right and left eigenvectors of A, we get

    x(t) = e^{At} x(0⁻) = e^{λ_1 t} v_1 w_1^T x(0⁻) + ··· + e^{λ_n t} v_n w_n^T x(0⁻).

For this expression to decay to 0 for all initial conditions, we must have ℜ(λ_i) < 0. We conclude that in order for a system to be stable, all the poles (or equivalently, all the eigenvalues λ_i of the A matrix) should be in the left half of the complex plane.
The above expression also implies that if we want x(t) ~ e^{λ_i t}, then x(0⁻) must be a multiple of v_i. More generally, if we wish x(t) ~ e^{λ_i t} + e^{λ_j t}, then x(0⁻) must be a linear combination of v_i and v_j.

Example. Consider

    A = [0 1 0; 1 0 0; 0 1 0],   B = [1; 0; 0],   C = [0 0 1],   D = 0.

We wish to compute the matrix exponential, the impulse response of the system, the transfer function, and determine stability.
For computing the matrix exponential, we can use the EVD (Eigenvalue Decomposition) method. The characteristic polynomial is

    det(λI − A) = λ (λ^2 − 1),

so the poles of the transfer function are λ_1 = 0, λ_2 = 1, λ_3 = −1. The eigenvectors of A are

    v_1 = [0; 0; 1],   v_2 = [1; 1; 1],   v_3 = [1; −1; 1].

Therefore, the matrices V and Λ are

    V = [0 1 1; 0 1 −1; 1 1 1],   Λ = diag(0, 1, −1).

The inverse of V gives the left eigenvectors:

    V^{−1} = [−1 0 1; 1/2 1/2 0; 1/2 −1/2 0] = [w_1^T; w_2^T; w_3^T].

Therefore the decomposition of A as a sum of rank 1 matrices is

    A = 0·v_1 w_1^T + 1·v_2 w_2^T + (−1)·v_3 w_3^T
      = 0·[0 0 0; 0 0 0; −1 0 1] + 1·[1/2 1/2 0; 1/2 1/2 0; 1/2 1/2 0] + (−1)·[1/2 −1/2 0; −1/2 1/2 0; 1/2 −1/2 0].

From this decomposition, the matrix exponential e^{At} is obtained by taking the exponential of each eigenvalue, without touching the rank 1 matrices:

    e^{At} = e^{0t} [0 0 0; 0 0 0; −1 0 1] + e^{t} [1/2 1/2 0; 1/2 1/2 0; 1/2 1/2 0] + e^{−t} [1/2 −1/2 0; −1/2 1/2 0; 1/2 −1/2 0].

From the above, we see that a matrix and its exponential have the same eigenvectors, while the eigenvalues of e^{At} are the exponentials of the eigenvalues of A.
Once we have computed the matrix exponential e^{At}, we can compute the impulse response of the system:

    h(t) = C e^{At} B = (−1 + (1/2) e^{t} + (1/2) e^{−t}) I(t).

The transfer function is:

    H(s) = C(sI − A)^{−1} B = 1/(s(s^2 − 1)) = −1/s + (1/2)/(s − 1) + (1/2)/(s + 1).

This system is not stable because the exponential e^t tends to ∞ as t → ∞. Nevertheless, does there exist an initial condition such that the state behaves nicely?
If we choose x(0⁻) = k v_3, namely a multiple of the eigenvector corresponding to the stable eigenvalue, the zero-input response would be x(t) = e^{At} x(0⁻) = k e^{−t} v_3. This is because by choosing the initial condition in the span of the eigenvectors corresponding to "good" modes, the response will be a linear combination of these good modes.

u = 0 (zero-input part).
The system is stable provided x(t) → 0 as t → ∞.
But x(t) = e^{At} x(0⁻) = Σ_{i=1}^n e^{λ_i t} v_i w_i^T x(0⁻), so e^{λ_i t} → 0 as t → ∞  ⟺  ℜ(λ_i) < 0.

u ≠ 0, x(0⁻) = 0 (zero-state part).
In this case stability is called BIBO (bounded input, bounded output) stability.

    BIBO stable system  ⟺  ‖h‖_1 < ∞  ⟸  ℜ(λ_i) < 0.

Recall that the 1-norm is ‖h‖_1 = ∫_0^∞ |h(t)| dt.

Proof that the system is BIBO stable ⟺ ‖h‖_1 < ∞.

⟸ : The output is the convolution of the input with the impulse response of the system:

    y(t) = ∫_0^t h(t − τ) u(τ) dτ.

By the triangle inequality, we have that

    |y(t)| ≤ ∫_0^t |h(t − τ)| |u(τ)| dτ.

So

    max_t |y(t)| ≤ ‖u‖_∞ ∫_0^∞ |h(τ)| dτ  ⟹  ‖y‖_∞ / ‖u‖_∞ ≤ ‖h‖_1 < ∞,

so the system is BIBO stable.

⟹ : Suppose that ‖h‖_1 = ∞ and take the (bounded) input u(t) to be

    u(t) = |h(t)| / h(t)  if h(t) ≠ 0,   u(t) = 0  otherwise.

Compute the output at time 0 (for this input applied over the past):

    y(0) = ∫ h(τ) u(τ) dτ = ∫ h(τ) (|h(τ)|/h(τ)) dτ = ∫_0^∞ |h(τ)| dτ = ‖h‖_1 = ∞.

The above is a contradiction: we assumed that the system was BIBO stable, but for a bounded input we obtained an unbounded output. Therefore ‖h‖_1 < ∞.

The construction of state. Consider a SISO system and its transfer function

    H(s) = C(sI − A)^{−1} B = p(s)/q(s).

As already discussed, the differential equation relating the input and the output readily follows:

    q(d/dt) y(t) = p(d/dt) u(t),

where

    q(s) = s^n + α_{n−1} s^{n−1} + ··· + α_1 s + α_0,
    p(s) = β_{n−1} s^{n−1} + ··· + β_1 s + β_0.

Our concern now is to re-construct the state, i.e., determine the state-space matrices A, B, C, with this differential equation as starting point. We will consider two cases.

deg p = 0: q(d/dt) y(t) = β_0 u(t), i.e. y^{(n)}(t) + α_{n−1} y^{(n−1)}(t) + ··· + α_1 y^{(1)}(t) + α_0 y(t) = β_0 u(t).
In this case we can define the states to be the derivatives of the output:

    x_1 = y, x_2 = y^{(1)}, ..., x_n = y^{(n−1)}
    ⟹  ẋ_1 = x_2, ẋ_2 = x_3, ..., ẋ_{n−1} = x_n,
        ẋ_n = −α_0 x_1 − α_1 x_2 − ··· − α_{n−1} x_n + u,
        y = β_0 x_1.

This leads to the following state-space representation:

    A = [0 1 0 ... 0; 0 0 1 ... 0; ...; 0 0 0 ... 1; −α_0 −α_1 −α_2 ... −α_{n−1}],
    B = [0; 0; ...; 0; 1],   C = [β_0 0 ... 0].

Example. Consider the differential equation characterizing the oscillator: ÿ + y = u. Define the state variables x_1 = y and x_2 = ẏ. Therefore we have

    ẋ_1 = ẏ = x_2,
    ẋ_2 = ÿ = u − y = −x_1 + u,
    y = x_1,

and the state-space matrices are

    A = [0 1; −1 0],   B = [0; 1],   C = [1 0].

0 < deg p ≤ n − 1. In this case we introduce the new variable w such that q(d/dt) w = u. This implies y = p(d/dt) w. By defining the state variables

    x_1 = w, x_2 = w^{(1)}, ..., x_n = w^{(n−1)},

we obtain the following state-space matrices A, B, C:

    A = [0 1 0 ... 0; 0 0 1 ... 0; ...; 0 0 0 ... 1; −α_0 −α_1 −α_2 ... −α_{n−1}],
    B = [0; 0; ...; 0; 1],   C = [β_0 β_1 ... β_{n−1}].

Example. Consider an alternative differential equation describing an oscillator: ÿ + y = u̇. Define the auxiliary variable w so that ẅ + w = u and y = ẇ. With x_1 = w, x_2 = ẇ, this yields

    ẋ_1 = x_2
    ẋ_2 = −x_1 + u    ⟹    A = [0 1; −1 0],   B = [0; 1],   C = [0 1].
    y   = x_2
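MATLAB automates this construction; a sketch (assuming the Control System or Signal Processing Toolbox). Note that tf2ss returns the controller canonical form, whose state ordering differs from the companion form above by a permutation, while the transfer function is the same:

    % Sketch: realization of H(s) = s/(s^2 + 1), i.e. y'' + y = u'.
    [A,B,C,D] = tf2ss([1 0], [1 0 1])
    % sanity check: convert back to the transfer function
    [num, den] = ss2tf(A, B, C, D)     % recovers p(s) = s, q(s) = s^2 + 1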

PART III: Structural properties of linear systems


Outline

PART I: Approximation of static systems
    Matrix factorizations
    The Singular Value Decomposition
    Properties of the SVD
    Optimal approximation in the 2-induced norm

PART II: Linear dynamical systems in state-space form
    Dynamical systems and the state
    Solving for the response of a dynamical system
    Linear dynamical systems: two descriptions
    Time domain solution and the matrix exponential
    Stability

PART III: Structural Properties of linear systems
    Controllability
    System controllability: Summary
    Infinite controllability gramian
    Observability

Structural properties

Controllability
Controllability deals only with the input u and the state x (i.e., the output y does not enter into consideration) and explores the issue of influencing x by manipulating u.

Observability
This is the dual concept of controllability. Observability deals only with the state x and the output y (i.e., the input u is not relevant) and it investigates the issue of deducing x by observing (measuring) y.

In the earlier RLC example, suppose we would like the system to be driven to the state in which x_1 = 1 kV and x_2 = 0.1 mA. The question of interest is whether there exists an input u which will steer the state from [0; 0], at time t = 0, to [1 kV; 0.1 mA], at some time t = T.

Concept of controllability
A state x̄ is controllable from 0 if there exist a time T̄ and an input ū such that the transition from state 0 to state x̄ is possible under the influence of the input ū:

    (0, 0) → (x̄, T̄).

Recall that the solution of the state equation is x(t) = e^{At} x_0 + ∫_0^t e^{A(t−τ)} B u(τ) dτ. If x_0 = 0, we have

    x(t) = ∫_0^t e^{A(t−τ)} B u(τ) dτ.

At time T̄:

    x(T̄) = ∫_0^{T̄} e^{A(T̄−τ)} B ū(τ) dτ = x̄.

We can thus define the controllability space:

    X^contr = {all controllable states} = { x̄ : ∃ T̄, ū such that x̄ = ∫_0^{T̄} e^{A(T̄−τ)} B ū(τ) dτ }.

In order to characterize X^contr, we need the controllability matrix:

    R_n(A, B) = [B, AB, ..., A^{n−1} B] ∈ R^{n×nm}.

Main result.

    X^contr = span col R_n(A, B).

Corollary. The system is completely controllable provided that

    X^contr = R^n  ⟺  rank R_n(A, B) = n.

If m = 1, this is equivalent to det R_n ≠ 0.
Consequences

X^contr is a linear subspace of R^n. This means that if x_1 and x_2 are controllable states, then any linear combination α_1 x_1 + α_2 x_2, α_1, α_2 ∈ R, is also controllable.

X^contr is generated by means of linear combinations of the columns of R_n(A, B).

Controllability depends only on A and B, not on the input ū or the final time T̄. The latter quantities are not unique: there are several inputs which drive the system to state x̄ in different times T̄.
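In MATLAB the controllability matrix is built by ctrb (Control System Toolbox); a minimal sketch:

    % Sketch: controllability matrix and its rank.
    A = [0 1; 0 0]; B = [0; 1];        % double integrator
    Rn = ctrb(A, B)                    % [B, A*B] = [0 1; 1 0]
    rank(Rn)                           % 2 = n, so completely controllable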

Example. In the earlier RLC example:

    n = 2, m = 1,   A = [−1/(RC)  −1/C; 1/L  0],   B = [1/(RC); 0],   AB = [−1/(RC)^2; 1/(RLC)].

Hence the controllability matrix is

    R_2(A, B) = [B, AB] = [1/(RC)  −1/(RC)^2; 0  1/(RLC)].

The two columns of R_2 are linearly independent, and therefore

    span col R_2(A, B) = R^2,

and the system is completely controllable.

Example. Consider the circuit consisting of an RC branch (resistor R_C in series with a capacitor C) and an RL branch (resistor R_L in series with an inductor L), both connected across the source u; here x_1 is the capacitor voltage and x_2 the inductor current.

KCL and KVL give

    u = x_1 + R_C C ẋ_1,
    u = R_L x_2 + L ẋ_2,
    ⟹  ẋ_1 = −1/(R_C C) x_1 + 1/(R_C C) u,
        ẋ_2 = −(R_L/L) x_2 + (1/L) u,

and the state-space matrices are

    A = [−1/(R_C C)  0; 0  −R_L/L],   B = [1/(R_C C); 1/L].

Therefore, the controllability matrix is

    R_2(A, B) = [1/(R_C C)  −1/(R_C C)^2; 1/L  −R_L/L^2]
    ⟹  det R_2(A, B) = (1/(R_C C L)) (1/(R_C C) − R_L/L).

Hence complete controllability is lost when this determinant is zero, i.e. when 1/(R_C C) = R_L/L; in other words, if the time constant R_C C of the RC branch is equal to the time constant L/R_L of the RL branch (i.e., if the circuit resonates). In this case

    X^contr = span col [1/(R_C C); 1/L] ⊊ R^2.

Figure: The controllable space X^contr is a line through the origin in the (x_1, x_2) plane.

The controllable space is thus a line through the origin with slope R_C C/L. The figure shows that the freedom in how to choose the state has been lost, since the state has to move on the line X^contr: once the first component x_1 is chosen, the second component x_2 is determined by x_1.

Proof: X^contr ⊆ span col R_n(A, B).

    x̄ = x(T̄) = ∫_0^{T̄} e^{A(T̄−τ)} B ū(τ) dτ
       = ∫_0^{T̄} [I + A(T̄−τ) + A^2 (T̄−τ)^2/2 + ···] B ū(τ) dτ
       = B ∫_0^{T̄} ū(τ) dτ + AB ∫_0^{T̄} (T̄−τ) ū(τ) dτ + A^2 B ∫_0^{T̄} ((T̄−τ)^2/2) ū(τ) dτ + ···

    ⟹  x̄ = [B  AB  A^2 B  ···] [φ_0; φ_1; φ_2; ...],  where φ_i ∈ R^m.

This shows that x̄ is a linear combination of the columns of A^k B, k ≥ 0. The desired result follows from the Cayley-Hamilton theorem.

The Cayley-Hamilton Theorem: Every matrix satisfies its own characteristic polynomial.
The characteristic polynomial of a (square) matrix A is

    χ_A(s) = det(sI − A) = s^n + α_{n−1} s^{n−1} + ··· + α_1 s + α_0.

The CH theorem says that

    χ_A(A) = 0  ⟺  A^n + α_{n−1} A^{n−1} + ··· + α_1 A + α_0 I = 0.

Therefore, the nth power of A can be expressed as a linear combination of lower powers of A:

    A^n = −α_{n−1} A^{n−1} − ··· − α_1 A − α_0 I.

Consequently all powers A^k, k ≥ n, can be expressed as linear combinations of the powers A^k, for k = 0, 1, ..., n − 1.

Example. Given A = [0 1; −2 −3], its characteristic polynomial is χ_A(s) = s^2 + 3s + 2. We will check that A^2 + 3A + 2I = 0. Since A^2 = [−2 −3; 6 7],

    [−2 −3; 6 7] + 3 [0 1; −2 −3] + 2 [1 0; 0 1] = [0 0; 0 0].
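A one-line numerical check (sketch): polyvalm evaluates a polynomial at a matrix argument.

    % Sketch: Cayley-Hamilton check for A = [0 1; -2 -3].
    A = [0 1; -2 -3];
    c = poly(A)                        % characteristic polynomial [1 3 2]
    polyvalm(c, A)                     % chi_A(A) = zeros(2), as the theorem says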

Example: Orbiting satellite.

The satellite moves in a plane; polar coordinates are chosen, i.e. the radial distance r, and the angle θ with respect to the x-axis.
The state is x = [r; ṙ; θ; θ̇], while the input vector is u = [u_r; u_θ], where u_r is the radial thrust and u_θ is the tangential thrust. The A and B matrices (obtained by linearizing the equations presented earlier) are

    A = [0 1 0 0; 3 0 0 2; 0 0 0 1; 0 −2 0 0],   B = [0 0; 1 0; 0 0; 0 1].

Since

    AB = [1 0; 0 2; 0 1; −2 0],   A^2 B = [0 2; −1 0; −2 0; 0 −4],   A^3 B = [−1 0; 0 −2; 0 −4; 2 0],

the controllability matrix is

    R_4(A, B) = [B  AB  A^2B  A^3B]
              = [0 0   1 0   0 2   −1 0;
                 1 0   0 2  −1 0    0 −2;
                 0 0   0 1  −2 0    0 −4;
                 0 1  −2 0   0 −4   2 0] ∈ R^{4×8}.

The first 4 columns of R_4 are linearly independent and hence its rank is 4 (full). Thus X^contr = R^4.
Conclusion: If both jets are functioning, the system is completely controllable.

Malfunction scenario. We now assume that one of the inputs fails.

If u_r = 0, the B-matrix becomes B_2 = [0; 0; 0; 1] = B(:,2) (in MATLAB notation). In this case, the controllability matrix is

    R_4(A, B_2) = [B_2, AB_2, A^2 B_2, A^3 B_2] = [0 0 2 0; 0 2 0 −2; 0 1 0 −4; 1 0 −4 0] ∈ R^{4×4}.

The system is completely controllable since the rank of R_4(A, B_2) is 4.

If u_θ = 0, the B-matrix becomes B_1 = [0; 1; 0; 0] = B(:,1) (in MATLAB notation). In this case, the controllability matrix is

    R_4(A, B_1) = [B_1, AB_1, A^2 B_1, A^3 B_1] = [0 1 0 −1; 1 0 −1 0; 0 0 −2 0; 0 −2 0 2] ∈ R^{4×4}.

The rank of R_4(A, B_1) is 3, and hence complete controllability is lost. The controllable space is

    X^contr = span col R_4(A, B_1) = span col [0 1 0; 1 0 −1; 0 0 −2; 0 −2 0].

This gives a basis for X^contr. An equivalent expression for the controllable space is

    X^contr = span col { [0; 1; 0; 0], [0; 0; 1; 0], [1; 0; 0; −2] }.

This shows that we lost the freedom of choosing the last component of the state, namely the angular velocity θ̇, since it is a multiple of the first component, namely the radius r. Therefore, we only have two options in choosing the state: we can either choose r, ṙ and θ (in which case θ̇ = −2r), or we can choose ṙ, θ and θ̇ (in which case r = −θ̇/2).

To obtain a pictorial representation of X^contr, we set ṙ = 0; then r, θ and θ̇ lie on a plane in R^3. This shows that we lost the ability to move everywhere; instead we are confined to a plane.

Figure: The plane θ̇ = −2r in the (r, θ, θ̇) space.

Trajectory through states x_1 and x_2

Recall that the state of a system can be expressed in terms of its zero-input and zero-state components:

    x(t) = e^{At} x(0⁻) + ∫_0^t e^{A(t−τ)} B u(τ) dτ.

We say that a system trajectory passes through states x_1 and x_2 if there exist a time T and an input u such that:

    x_2 = e^{AT} x_1 + ∫_0^T e^{A(T−τ)} B u(τ) dτ.

If we now define the state x_21 = x_2 − e^{AT} x_1, we obtain:

    x_21 = ∫_0^T e^{A(T−τ)} B u(τ) dτ.

We conclude that a trajectory passes through x_1 and x_2 provided that the composite state x_21 is controllable from the zero state.
Consequence: this always holds if x_1 and x_2 are controllable states, i.e. x_1, x_2 ∈ X^contr. Since we know x_2 ∈ X^contr and X^contr is a linear subspace, it only remains to show that e^{AT} x_1 ∈ X^contr.

Recall that:

    X^contr = span col [B, AB, ..., A^{n−1}B]  ⟹  A X^contr = span col [AB, A^2B, ..., A^nB] ⊆ X^contr.

Since e^{AT} = I + AT + A^2 T^2/2 + ···, we have

    e^{AT} x_1 = x_1 + A x_1 T + A^2 x_1 T^2/2 + ···  ∈ X^contr.

We thus conclude that x_2 − e^{AT} x_1 ∈ X^contr, so for any controllable x_1 and x_2, there always exists a trajectory passing through them. Therefore, one can always go from one controllable state to another controllable state.
Controllability - alternative perspective involving time. Until now, we have described the controllability subspace as:

    X^contr = span col [B, AB, A^2 B, ..., A^{n−1} B].

This description is completely algebraic and does not involve inputs or time (i.e. it only states that a state x̄ is controllable if it can be expressed as a linear combination of the columns of the controllability matrix). To bring the input and time back into the picture, we need the concept of the controllability gramian.

Gramians. In general, a gramian G is defined as follows: given x_1, x_2, ..., x_k ∈ R^n,

    G = x_1 x_1^* + x_2 x_2^* + ··· + x_k x_k^*.

Notice that above, each term x_i x_i^* is an outer product, i.e. a matrix of rank 1. Furthermore, if we define X = [x_1 x_2 ··· x_k], then

    G = XX^* = Σ_{i=1}^k x_i x_i^*.

Properties of gramians. Gramians satisfy the following properties:

1. G = G^*.
2. G ⪰ 0, i.e. G is positive semi-definite  ⟺  λ_i(G) ≥ 0, for all i.

The notion positive-definite describes one concept of positivity for matrices. Formally:
Definition. A matrix G is positive (semi-) definite provided v^* G v > 0 (≥ 0), for all vectors v ≠ 0.
The gramian G = XX^* indeed satisfies the above property:

    v^* G v = (v^* X)(X^* v) = ‖X^* v‖_2^2 ≥ 0.

Connection between positive definiteness and the eigenvalues of G. Suppose G = VΛV^* (since G is symmetric, Λ is real and diagonal). Given z ∈ R^n, let w = V^* z; then

    z^* G z = (z^* V) Λ (V^* z) = Σ_{i=1}^n λ_i w_i^2.

This quantity is > 0 (≥ 0) provided λ_i > 0 (λ_i ≥ 0). Therefore

    G is positive definite (semi-definite)  ⟺  all its eigenvalues are positive (non-negative).

Example.

    M = [2 1; 1 2]    has eigenvalues λ_1 = 3,  λ_2 = 1   ⟹  M positive definite
    M = [2 2; 2 2]    has eigenvalues λ_1 = 4,  λ_2 = 0   ⟹  M positive semi-definite
    M = [2 3; 3 2]    has eigenvalues λ_1 = 5,  λ_2 = −1  ⟹  M indefinite
    M = [−2 1; 1 −2]  has eigenvalues λ_1 = −1, λ_2 = −3  ⟹  M negative definite

Notation: We denote a positive (semi-) definite matrix G by G > 0 (G ≥ 0), respectively.
This does not mean that all the entries of G are positive!

Property of symmetric matrices.

For M = M^* ∈ R^{n×n} and x ∈ R^n, x ≠ 0, the Rayleigh quotient is:

    ρ_M(x) = x^* M x / x^* x.

Then

    λ_min(M) ≤ ρ_M(x) ≤ λ_max(M),  x ≠ 0.

Proof. Consider the EVD M = VΛV^*, where the eigenvalues are ordered:

    λ_max(M) = λ_1 ≥ λ_2 ≥ ··· ≥ λ_n = λ_min(M).

Then

    ρ_M(x) = x^* M x / x^* x = x^* VΛV^* x / x^* VV^* x = y^* Λ y / y^* y = (λ_1 y_1^2 + ··· + λ_n y_n^2) / (y_1^2 + ··· + y_n^2),

where y = V^* x. This relationship yields

    λ_n (y_1^2 + ··· + y_n^2) ≤ λ_1 y_1^2 + ··· + λ_n y_n^2 ≤ λ_1 (y_1^2 + ··· + y_n^2),

which implies ρ_M(x) ≤ λ_1 = λ_max(M) and ρ_M(x) ≥ λ_n = λ_min(M).

Controllability Gramian
We now have the tools to study system controllability, involving the dependence on time and the input. For this purpose we define the controllability gramian of a system:

    P(T) = ∫_0^T e^{At} B B^* e^{A^*t} dt.

Properties:
1. P(T) = [P(T)]^*.
2. P(T) ⪰ 0.

Main property
The columns of the controllability matrix and of the controllability gramian span the same space:

    X^contr = span col R_n(A, B) = span col P(T).

This result describes the controllability subspace in terms of a time dependent basis.
Corollary
The system is completely controllable  ⟺  rank R_n(A, B) = n  ⟺  P(T) > 0, for all T > 0.
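The finite-horizon gramian can be evaluated numerically; a sketch (illustrative data) using array-valued quadrature:

    % Sketch: numerical finite-horizon controllability gramian P(T).
    A = [0 1; -1 0]; B = [0; 1]; T = 2;        % illustrative data
    f = @(t) expm(A*t)*(B*B')*expm(A'*t);
    P = integral(f, 0, T, 'ArrayValued', true) % P(T), symmetric and psd
    min(eig(P)) > 0                            % positive definite: controllable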

Given x̄ ∈ X^contr, for any time T̄ > 0 there exists w̄ such that

    x̄ = P(T̄) w̄.

We now define the input function

    ū(t) = B^* e^{A^*(T̄−t)} w̄ = B^* e^{A^*(T̄−t)} [P(T̄)]^{−1} x̄   (the second form if det P(T̄) ≠ 0).

In the sequel we will use the energy of a vector time-function f(t) ∈ R^k on the interval [0, T]:

    E_{T,f} = ‖f‖_2^2 = ∫_0^T f^*(t) f(t) dt.

Results
The input ū steers the system from state zero at time zero to state x̄ at time T̄.
The energy of this input is

    E_{T̄,ū} = ‖ū‖_2^2 = w̄^* P(T̄) w̄ = x̄^* [P(T̄)]^{−1} x̄,

where the second equality holds if the system is completely controllable.
ū is a minimal energy input achieving the transfer (0, 0) → (x̄, T̄).

Degree of controllability and classification of states

State x_1 is called more easily controllable than state x_2, at time T, if the associated minimal energy inputs u_1, u_2 satisfy E_{T,u_1} ≤ E_{T,u_2}. In other words, either

    w̄_1^* P(T) w̄_1 ≤ w̄_2^* P(T) w̄_2   or   x_1^* [P(T)]^{−1} x_1 ≤ x_2^* [P(T)]^{−1} x_2,

the latter relationship holding if the system is completely controllable.

This provides an ordering of the states of the system from the point of view of controllability.
Result: ordering of states according to their degree of controllability.
Given time T, the state x̄ and a minimal energy input ū, the associated normalized minimal energy satisfies

    1/λ_max(P(T)) ≤ E_{T,ū} / (x̄^* x̄) ≤ 1/λ_min(P(T)).

Conclusion. States which require large amounts of energy to be reached are candidates for elimination. For instance, if the system is not completely controllable, P(T) has a 0 eigenvalue and the corresponding eigenvector requires infinite energy to be reached (i.e. it cannot be reached). The minimal energy required to reach any (normalized) state is in the interval

    [λ_min(P(T̄)^{−1}), λ_max(P(T̄)^{−1})] = [1/λ_max(P(T̄)), 1/λ_min(P(T̄))].

This also shows that the states which are easiest/most difficult to reach (i.e. require small/large energy to go to) are in the span of the eigenvectors corresponding to large/small eigenvalues of P(T̄).


Example. Consider the system ẋ(t) = Ax(t) + Bu(t), where n = 2, m = 1,

    A = [0 0; 0 0],

and B ∈ R^2 is arbitrary. In order to determine controllability for the system, we compute the controllability matrix

    R_2(A, B) = [B, AB] = [B, 0] ∈ R^{2×2},

and hence X^contr = span B, dim X^contr = 1. The controllability gramian is:

    P(T) = ∫_0^T e^{At} B B^* e^{A^*t} dt = ∫_0^T B B^* dt = T B B^*,

since e^{At} = I_2. Suppose that B = [3; 7], so that

    BB^* = [9 21; 21 49] ⪰ 0.

Therefore

    P(T) = [9T 21T; 21T 49T]

is positive semi-definite, with eigenvalues 0 and the sum of the diagonal elements 9 + 49 = 58. Consequently

    span col P(T) = span col [3; 7] = span col B,  T > 0.

Next, given the controllable state x̄, we seek an input which will steer the system from state 0 at time 0 to state x̄ at time T̄. This expression is

    ū(t) = B^* e^{A^*(T̄−t)} w̄,   where   x̄ = P(T̄) w̄.

The energy associated with this input is w̄^* P(T̄) w̄. Suppose now that we want to drive the system from state 0 at time 0 to state B at time T. The input which achieves this is ū(t) = B^* w̄, where w̄ satisfies B = T B B^* w̄.
Thus w̄ may be chosen as w̄ = B/(T B^*B) (note that B^*B is a scalar). Therefore, the input is

    ū(t) = B^* B/(T B^*B) = 1/T.

Hence the energy is

    E = ‖ū‖_2^2 = w̄^* P(T̄) w̄ = (B^*/(T B^*B)) (T BB^*) (B/(T B^*B)) = 1/T.

Note that as T → 0 (i.e., we want to reach the desired state immediately), the input ū(t) → δ(t) and the energy E → ∞.

Figure: Minimal-energy inputs for two horizons T_1 > T_2: each input is constant with value 1/T on [0, T], so the shorter the horizon, the taller the input.

Example. The system matrices are given by

    A = [0 1; 0 0],   B = [0; 1],   C = [1 0].

This system represents a double integrator (i.e., the output is the input integrated twice). Check this by writing the state differential equations:

    ẋ_1 = x_2,  ẋ_2 = u  ⟹  ẍ_1 = u  ⟹  y = x_1 = ∫∫ u.

The controllability matrix is

    R_2(A, B) = [B, AB] = [0 1; 1 0].

Since it has rank 2, we conclude that X^contr = R^2. To compute the controllability gramian, we need

    e^{At} B = [1 t; 0 1] [0; 1] = [t; 1]

    ⟹  P(T) = ∫_0^T e^{At} B B^* e^{A^*t} dt = ∫_0^T [t; 1] [t  1] dt = ∫_0^T [t^2  t; t  1] dt = [T^3/3  T^2/2; T^2/2  T].

The gramian has rank 2, as expected, since

    det P(T) = (T^3/3)·T − (T^2/2)·(T^2/2) = T^4/12 ≠ 0, for all T > 0.

Suppose that we wish to drive the system from state 0 at time 0 to state x̄ = [1; 0] at time T. The minimal energy input is

    ū(t) = B^* e^{A^*(T−t)} w̄ = [T−t  1] w̄,

where w̄ satisfies

    [T^3/3  T^2/2; T^2/2  T] w̄ = [1; 0],

since the gramian is invertible. Hence

    w̄ = (12/T^4) [T  −T^2/2; −T^2/2  T^3/3] [1; 0] = [12/T^3; −6/T^2].

Therefore, the minimal energy input is

    ū(t) = [T−t  1] [12/T^3; −6/T^2] = 6(T − 2t)/T^3.
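One can verify in simulation that this input indeed steers the double integrator from 0 to [1; 0]; a sketch, assuming the Control System Toolbox:

    % Sketch: verify that u(t) = 6(T-2t)/T^3 drives the state from 0 to [1;0].
    A = [0 1; 0 0]; B = [0; 1]; T = 3;
    t = linspace(0, T, 1000)';
    u = 6*(T - 2*t)/T^3;
    sysx = ss(A, B, eye(2), 0);          % output the full state
    x = lsim(sysx, u, t);
    x(end,:)                             % approximately [1 0]
    trapz(t, u.^2)                       % energy, approximately 12/T^3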

Figure: The input ū(t) = 6(T − 2t)/T^3 is linear in t, starting at 6/T^2 at t = 0, crossing zero at t = T/2, and ending at −6/T^2 at t = T.

The associated energy is E = ∫_0^T ū(t)^2 dt = 12/T^3. It should be noted that, similar to the previous example, the energy grows without bound as T → 0. Thus if we want to reach a state x̄ arbitrarily fast, the required energy goes to infinity.

Recap. Assuming that the system is completely controllable, the following expressions hold:
A minimal energy input which steers the system from state 0 at time 0 to state x̄ at time T̄ is:

    ū(t) = B^* e^{A^*(T̄−t)} [P(T̄)]^{−1} x̄.

The minimal energy required to reach state x̄ at time T̄ is:

    E = x̄^* [P(T̄)]^{−1} x̄.

Additional example: controllability gramians for the satellite.

A = [ 0, 1, 0, 0;
      3, 0, 0, 2;
      0, 0, 0, 1;
      0,-2, 0, 0];
B = [ 0, 0;
      1, 0;
      0, 0;
      0, 1];

[v,d] = eig(A)
v =
  [ 0, 0, -1/2,  -1/2]
  [ 0, 0, 1i/2, -1i/2]
  [ 1, 1,   1i,   -1i]
  [ 0, 0,    1,     1]
d =
  [ 0, 0,   0,  0]
  [ 0, 0,   0,  0]
  [ 0, 0, -1i,  0]
  [ 0, 0,   0, 1i]

Because of the presence of a Jordan block (at the double eigenvalue 0), the correct EVD uses a generalized eigenvector and the real 2x2 rotation block:

v1 =
  [ 0, -2/3, -1, 0]
  [ 0,    0,  0, 1]
  [ 1,    0,  0, 2]
  [ 0,    1,  2, 0]
d1 =
  [ 0, 1, 0,  0]
  [ 0, 0, 0,  0]
  [ 0, 0, 0, -1]
  [ 0, 0, 1,  0]
% Check
A - v1*d1*inv(v1) =
  [ 0, 0, 0, 0]
  [ 0, 0, 0, 0]
  [ 0, 0, 0, 0]
  [ 0, 0, 0, 0]
expD = blkdiag(sym([1 t; 0 1]), sym([cos(t) -sin(t); sin(t) cos(t)])) =
  [ 1, t,      0,       0]
  [ 0, 1,      0,       0]
  [ 0, 0, cos(t), -sin(t)]
  [ 0, 0, sin(t),  cos(t)]

expA = v1*expD*inv(v1) =
  [ 4 - 3*cos(t),         sin(t), 0,   2 - 2*cos(t)]
  [       3*sin(t),       cos(t), 0,       2*sin(t)]
  [ 6*sin(t) - 6*t, 2*cos(t) - 2, 1, 4*sin(t) - 3*t]
  [   6*cos(t) - 6,    -2*sin(t), 0,   4*cos(t) - 3]

X1 = expA*B(:,1) =
  [       sin(t)]
  [       cos(t)]
  [ 2*cos(t) - 2]
  [    -2*sin(t)]
X = expA*B =
  [       sin(t),   2 - 2*cos(t)]
  [       cos(t),       2*sin(t)]
  [ 2*cos(t) - 2, 4*sin(t) - 3*t]
  [    -2*sin(t),   4*cos(t) - 3]
P1 = int(X1*X1.', 0, t);
P  = int(X*X.', 0, t);

P1(t) =
  [ t/2 - sin(2t)/4      sin(t)^2/2                 -(cos(t)-1)^2             sin(2t)/2 - t  ]
  [ sin(t)^2/2           t/2 + sin(2t)/4            t + sin(2t)/2 - 2 sin(t)  -sin(t)^2      ]
  [ -(cos(t)-1)^2        t + sin(2t)/2 - 2 sin(t)   6t + sin(2t) - 8 sin(t)   2 (cos(t)-1)^2 ]
  [ sin(2t)/2 - t        -sin(t)^2                  2 (cos(t)-1)^2            2t - sin(2t)   ]

P(t) is symmetric with entries:
  (1,1) = 13t/2 + 3 sin(2t)/4 - 8 sin(t)
  (1,2) = (cos(t) - 1)(3 cos(t) - 5)/2
  (1,3) = -3 (t - sin(t))^2
  (1,4) = 14 sin(t) - 3 sin(2t)/2 - 11t
  (2,2) = 5t/2 - 3 sin(2t)/4
  (2,3) = 5t - 8 sin(t) - 3 cos(t) sin(t) + 6t cos(t)
  (2,4) = -3 (cos(t) - 1)^2
  (3,3) = 14t - 32 sin(t) - 6 cos(t) sin(t) + 24t cos(t) + 3t^3
  (3,4) = 9t^2/2 - 12t sin(t) - 6 cos(t)^2 - 4 cos(t) + 10
  (4,4) = 19t + 3 sin(2t) - 24 sin(t)

(The remaining entries (2,1), (3,1), (3,2), (4,1), (4,2), (4,3) follow by symmetry.)

>> P10 = subs(P1,t,1) =
    0.2727    0.3540   -0.2113   -0.5454
    0.3540    0.7273   -0.2283   -0.7081
   -0.2113   -0.2283    0.1775    0.4226
   -0.5454   -0.7081    0.4226    1.0907
>> [v1,d1] = eig(P10)
v1 =
   -0.8944    0.1820   -0.1938   -0.3596
    0.0000   -0.1618    0.8322   -0.5303
   -0.0000    0.8990    0.3459    0.2685
   -0.4472   -0.3640    0.3876    0.7192
d1 =
    0.0000         0         0         0
         0    0.0047         0         0
         0         0    0.2202         0
         0         0         0    2.0433
% Notice that the eigenvector corresponding to the zero eigenvalue of P1
% is (up to scaling) [2; 0; 0; 1]. This was also shown earlier not to
% belong to the span of the columns of X^contr.
>> P0 = subs(P,t,1) =
    0.4502    0.7767   -0.0754   -0.5834
    0.7767    1.8180    0.1461   -0.6340
   -0.0754    0.1461    0.3123    0.4896
   -0.5834   -0.6340    0.4896    1.5326
>> [v,d] = eig(P0)
v =
    0.8969   -0.2063   -0.3645   -0.1365
    0.2174    0.8681    0.1241   -0.4304
    0.0413    0.5925    0.4377    0.6750
   -0.3889   -0.7053    0.0873    0.5863
d =
    0.0356         0         0         0
         0    0.0645         0         0
         0         0    1.2579         0
         0         0         0    2.7552
>> % Energy required for the full system to reach x0 = [0; 1; 0; 0]:
>> inv(P0) =
   23.3169   -8.6291    2.7012    4.4428
   -8.6291    4.4817   -3.8806   -0.1910
    2.7012   -3.8806   13.1698   -4.7841
    4.4428   -0.1910   -4.7841    3.7929
>> x0'*inv(P0)*x0 = 4.4817
>> % Energy required by the impaired system for reaching x0:
>> % Solve P10*w0 = [0; 1; 0; 0] = x0;
>> w0 = [-6.8764; 8.8282; -29.5740; 13.7528];
>> % Hence the energy for reaching x0 is:
>> w0'*P10*w0 = 8.8282

Example: the infinite controllability gramian. Given are:

    A = [−2 1; −1 −2],   B = [1; 0].

Check stability: eig(A) = −2 ± 1i, so A is stable. To find the controllability gramian P we need:

    w(t) = e^{At} B = e^{−2t} [cos(t)  sin(t); −sin(t)  cos(t)] [1; 0] = e^{−2t} [cos(t); −sin(t)].

Hence P(T) = ∫_0^T w(t) w^*(t) dt:

    P(T) = [ 9/40 − ((4 cos(2T) − 2 sin(2T) + 5)/40) e^{−4T},    −1/20 + ((2 cos(2T) + 4 sin(2T))/40) e^{−4T};
             −1/20 + ((2 cos(2T) + 4 sin(2T))/40) e^{−4T},        1/40 − ((5 − 4 cos(2T) + 2 sin(2T))/40) e^{−4T} ].

As T → ∞, we get

    P_inf = [2.2500e−001  −5.0000e−002; −5.0000e−002  2.5000e−002].

Now check the solution using the lyap command:

    P_lyap = [2.2500e−001  −5.0000e−002; −5.0000e−002  2.5000e−002]  ⟹  P_inf = P_lyap, up to machine precision (10^{−17}).

The EVD of P_inf is V D V^*, where:

    V = [2.2975e−001  9.7325e−001; 9.7325e−001  −2.2975e−001],   D = [1.3197e−002  0; 0  2.3680e−001].

The energy required to reach the (normalized) states of this system lies in the interval:

    [40 − 16√5, 40 + 16√5] = [4.223, 75.777].

Thus V(:,2) is the state easiest to reach, while V(:,1) is the state most difficult to reach. The energy required to reach state B is 8, which lies within the above interval. This information is summarized in the controllability ellipsoid. The largest (smallest) semi-axis (angles 74.4°, −15.6°, respectively) is the eigenvector of P_inf corresponding to the smallest (largest) eigenvalue, that is, the state most difficult and easiest to reach, respectively.
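The computation above can be reproduced with a few lines; a sketch, assuming the Control System Toolbox for lyap:

    % Sketch: infinite gramian of the example, by quadrature and by lyap.
    A = [-2 1; -1 -2]; B = [1; 0];
    f = @(t) expm(A*t)*(B*B')*expm(A'*t);
    Pinf  = integral(f, 0, 100, 'ArrayValued', true)  % effectively T = infinity
    Plyap = lyap(A, B*B')                             % solves A*P + P*A' + B*B' = 0
    norm(Pinf - Plyap)                                % agreement to quadrature accuracy
    B'*(Plyap\B)                                      % energy to reach state B: 8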

System controllability: Summary

We have considered dynamical systems described by state and output equations

    ẋ(t) = Ax(t) + Bu(t),   y(t) = Cx(t) + Du(t),

where u(t) ∈ R^m, x(t) ∈ R^n, y(t) ∈ R^p are the input, state, output, respectively, and A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, D ∈ R^{p×m}.
Controllability. The state x̄ ∈ R^n is controllable from the zero state at time zero if there exist a time t = T̄ and an input ū such that

    x̄ = x(T̄) = ∫_0^{T̄} e^{A(T̄−t)} B ū(t) dt.

The set of all controllable states is denoted by

    X^contr = { x̄ : controllable for some T̄ and ū }.

The system is called completely controllable if X^contr = R^n.

We define the controllability matrix

    R_n(A, B) = [B, AB, A^2B, ..., A^{n−1}B] ∈ R^{n×nm},

and the controllability gramian

    P(T) = ∫_0^T e^{At} B B^* e^{A^*t} dt ∈ R^{n×n}.

The latter is symmetric and positive semi-definite because it is the (continuous) sum of symmetric and positive semi-definite matrices: P = ∫ w(t) w^*(t) dt, where w(t) = e^{At} B.
Main Results. The following relationships hold:

    X^contr = span col R_n(A, B) = span col P(T),  T > 0.

As a corollary, the system is completely controllable:

    X^contr = R^n  ⟺  rank R_n(A, B) = n  ⟺  P(T) > 0, T > 0,

the positive definiteness condition being in turn equivalent to all eigenvalues of the controllability gramian P(T) being positive for all T > 0.
For any x̄ ∈ X^contr, there exists w̄ such that P(T̄) w̄ = x̄. The following input

    ū(t) = B^* e^{A^*(T̄−t)} w̄,   t ∈ [0, T̄],

steers the system from the state zero at time zero to the state x̄ at time T̄. Furthermore, it can be shown that among all inputs which accomplish this transfer, ū is a minimal energy input, i.e. its 2-norm (energy) ‖ū‖_2^2 = ∫_0^{T̄} ū^*(t) ū(t) dt is the smallest possible. In particular, for a completely controllable system there holds

    E_min = ‖ū‖_2^2 = x̄^* [P(T̄)]^{−1} x̄.

The infinite controllability gramian. If A is stable, i.e. all its eigenvalues are in the LHP (left half of the complex plane), the controllability gramian is defined for T → ∞:

    P = P(∞) = ∫_0^∞ e^{At} B B^* e^{A^*t} dt.

It turns out that this gramian satisfies the following linear matrix equation:

    AP + PA^* + BB^* = 0,

which is known as a Lyapunov equation. Thus, if A is stable, the gramian for infinite time can be computed simply as the solution to the above linear matrix equation.
States which are the easiest/most difficult to reach. Suppose the system is completely controllable, so the controllability gramian P(T) ∈ R^{n×n} is positive definite: P(T) > 0, T > 0. Let its eigenvalue decomposition be P(T) = VΛV^* (recall that symmetric matrices have an orthonormal set of eigenvectors: VV^* = I). Suppose that the eigenvalues are sorted in decreasing order:

    Λ = diag(λ_1, λ_2, ..., λ_n),   λ_1 ≥ λ_2 ≥ ··· ≥ λ_n ≥ 0.

The eigenvalue decomposition of the inverse gramian is P(T)^{−1} = V Λ^{−1} V^*, where

    Λ^{−1} = diag(1/λ_1, 1/λ_2, ..., 1/λ_n),   0 ≤ 1/λ_1 ≤ 1/λ_2 ≤ ··· ≤ 1/λ_n.

The energy required to reach an arbitrary state x of unit norm (‖x‖ = 1) is

    E = x^* P(T)^{−1} x = x^* V Λ^{−1} V^* x = (1/λ_1) w_1^2 + (1/λ_2) w_2^2 + ··· + (1/λ_n) w_n^2 ∈ [1/λ_1, 1/λ_n],

where w = V^* x, because the fact that the state x is normalized, x^* x = 1, implies that w is also normalized: w^* w = x^* V V^* x = x^* x = 1.
The energy is minimized, i.e. E = 1/λ_1, for w = [1, 0, ..., 0]^T, i.e. x = V w = v_1, where v_1 is the eigenvector associated to the eigenvalue λ_1. Therefore, the state which is the easiest to reach is the eigenvector corresponding to the largest eigenvalue.
The energy is maximized, i.e. E = 1/λ_n, for w = [0, ..., 0, 1]^T, i.e. x = V w = v_n, where v_n is the eigenvector associated to the eigenvalue λ_n. Therefore, the state which is the most difficult to reach is the eigenvector corresponding to the smallest eigenvalue.

Example. Consider the system with the following matrices describing its dynamics:

    A = [0 1; −2 −3],   B = [0; 1].

Since the matrix A is in companion form, its characteristic polynomial is χ_A(s) = s^2 + 3s + 2. The roots of the characteristic polynomial, λ_1 = −1, λ_2 = −2, are the eigenvalues of A and the poles of the system. To compute the controllability gramian we need the matrix exponential of A, computed earlier via the EVD A = VΛV^{−1} with V = [1 1; −1 −2], V^{−1} = [2 1; −1 −1]:

    e^{At} = [2e^{−t} − e^{−2t}    e^{−t} − e^{−2t};  −2e^{−t} + 2e^{−2t}   −e^{−t} + 2e^{−2t}]

    ⟹  e^{At} B = [e^{−t} − e^{−2t};  −e^{−t} + 2e^{−2t}].

Thus the gramian is

    P(T̄) = ∫_0^{T̄} e^{At} B B^* e^{A^*t} dt
          = [ 1/12 − e^{−2T̄}/2 + 2e^{−3T̄}/3 − e^{−4T̄}/4,    e^{−2T̄}/2 − e^{−3T̄} + e^{−4T̄}/2;
              e^{−2T̄}/2 − e^{−3T̄} + e^{−4T̄}/2,              1/6 − e^{−2T̄}/2 + 4e^{−3T̄}/3 − e^{−4T̄} ].

Note that the gramian is indeed symmetric, since the (1,2) and (2,1) entries are equal. Consider the limit as T̄ → ∞:

    P(∞) = [1/12  0; 0  1/6].

Suppose we allow plenty of time to reach a state x̄ = [x̄_1; x̄_2]; the energy required is

    E = x̄^* [P(∞)]^{−1} x̄ = [x̄_1  x̄_2] [12 0; 0 6] [x̄_1; x̄_2] = 12 x̄_1^2 + 6 x̄_2^2.

To minimize the energy under the constraint x̄_1^2 + x̄_2^2 = 1 (normalized states), we need x̄_1 = 0 and x̄_2 = 1, i.e. x̄ = [0; 1], while to maximize it, x̄_1 = 1 and x̄_2 = 0, i.e. x̄ = [1; 0]. We conclude that the state which is the easiest to reach is x̄ = [0; 1] (i.e., the eigenvector corresponding to the largest eigenvalue of P) because it requires the least amount of energy (E = 6). On the other hand, the state which is the most difficult to reach is x̄ = [1; 0] (i.e., the eigenvector corresponding to the smallest eigenvalue of P) because it requires the largest amount of energy (E = 12). Any other state will require an amount of energy which is in between those two values. Take for example the state x̄ = (1/√2) [1; 1]. The energy associated with this state is E = 12·(1/2) + 6·(1/2) = 9 ∈ [6, 12].

Infinite controllability gramians

Earlier we showed that the EVD of the controllability gramian P(T) allows us to classify states
according to their degree of controllability, i.e. according to the energy required to reach them.

Claim: The controllability gramian satisfies P(T2) ≥ P(T1), for T2 ≥ T1.

Recall the definition:

Q ≥ R  ⟺  Q - R ≥ 0  ⟺  λ_i(Q - R) ≥ 0, ∀i,

i.e. Q ≥ R means that the matrix Q - R is positive semi-definite.

Proof of the claim:

P(T2) = ∫_0^{T2} e^{At} B B* e^{A*t} dt
      = ∫_0^{T1} e^{At} B B* e^{A*t} dt + ∫_{T1}^{T2} e^{At} B B* e^{A*t} dt = P(T1) + W,

where

W = ∫_{T1}^{T2} e^{At} B B* e^{A*t} dt = ∫_{T1}^{T2} H(t) H*(t) dt,   H(t) = e^{At} B.

Since H(t)H*(t) is positive semi-definite by construction, W is the integral of positive
semi-definite quantities and therefore W ≥ 0. This in turn implies W = P(T2) - P(T1) ≥ 0, i.e.

P(T2) ≥ P(T1),  T2 ≥ T1.

90/123

Consequence:

P(T1) ≤ P(T2)  ⟹  P(T1)^{-1} ≥ P(T2)^{-1}  ⟹  E(T1) ≥ E(T2),  T1 ≤ T2.

This shows that

the minimal energy required to reach a state x at time T decreases as T increases.

Therefore, if the system is stable (i.e. P(T) is defined as T → ∞), the smallest minimal
energy required to reach a state is attained at T = ∞. The corresponding gramian is called the

infinite controllability gramian, and is denoted by P.

91/123

We now state the following key result: assuming ℜ(λ_i(A)) < 0, the infinite gramian P satisfies
the following Lyapunov equation:

A P + P A* + B B* = 0.

This is a linear matrix equation that needs to be solved for P (no integration necessary).

Proof: We will show that the controllability gramian for T = ∞ is the solution of the Lyapunov
equation:

A P + P A* = A ∫_0^∞ e^{At} B B* e^{A*t} dt + ( ∫_0^∞ e^{At} B B* e^{A*t} dt ) A*
           = ∫_0^∞ [ A e^{At} B B* e^{A*t} + e^{At} B B* e^{A*t} A* ] dt
           = ∫_0^∞ (d/dt)[ e^{At} B B* e^{A*t} ] dt
           = [ e^{At} B B* e^{A*t} ]_0^∞
           = 0 - B B*,   where e^{At} → 0 as t → ∞, because A is stable.

This proves that A P + P A* + B B* = 0.
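
Remark. Since the Lyapunov equation is linear in the entries of P, it can in principle be solved
by vectorization; a MATLAB sketch (illustrative only: the Kronecker approach costs O(n^6), and
it assumes A, B real):

    A = [-1 0; 0 -2];  B = [1; 1];            % example data (see next slide)
    n = size(A, 1);
    M = kron(eye(n), A) + kron(A, eye(n));    % vec(A*P + P*A') = M * vec(P)
    P = reshape(-M \ reshape(B*B', [], 1), n, n);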

92/123




Example. Consider the system: A = [-1, 0; 0, -2], B = [1; 1].
We wish to compute P(∞). We do this in two ways:

1. By definition:

P(T) = ∫_0^T [e^{-t}; e^{-2t}] [e^{-t}, e^{-2t}] dt
     = [ (1/2)(1 - e^{-2T}), (1/3)(1 - e^{-3T}); (1/3)(1 - e^{-3T}), (1/4)(1 - e^{-4T}) ]

⟹  P(∞) = [1/2, 1/3; 1/3, 1/4].

2. Let us solve the Lyapunov equation for P, where P = P* = [p, q; q, r]:

[-1, 0; 0, -2][p, q; q, r] + [p, q; q, r][-1, 0; 0, -2] + [1; 1][1, 1]
   = [-2p + 1, -3q + 1; -3q + 1, -4r + 1] = [0, 0; 0, 0]

⟹  p = 1/2, q = 1/3, r = 1/4  ⟹  P = P(∞) = [1/2, 1/3; 1/3, 1/4], as before.

Note: In MATLAB, the infinite controllability gramian is obtained using: P = lyap(A, B*B').
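
The two routes can also be checked against each other numerically; a sketch (lyap from the
Control System Toolbox, the rest base MATLAB):

    A = [-1 0; 0 -2];  B = [1; 1];
    P1 = lyap(A, B*B');                                   % Lyapunov route
    P2 = integral(@(t) expm(A*t)*(B*B')*expm(A'*t), ...
                  0, Inf, 'ArrayValued', true);           % route by definition
    norm(P1 - P2)                                         % ~ 0; both [1/2 1/3; 1/3 1/4]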

93/123

The concept of observability.

Example.

[Figure: electrical circuit with output y, resistors RL and RC, and states x1, x2.]

Given the dynamical system of an electrical circuit, we may be interested in answering the
following question: by observing the output y, can we find out what the states x1 and x2 are?
As we will see later, the answer to this problem does not depend on the input u.

Example. Consider the satellite example. The state is x = (r, ṙ, θ, θ̇)*. We may choose to
observe either the radius r (y = r), the angle θ (y = θ), the radial velocity ṙ (y = ṙ), or any
two of them (e.g., y = [r; θ]). Which of the above observations allow us to deduce the
remaining state variables?

94/123

Problem. Consider the dynamical system:

ẋ(t) = A x(t) + B u(t),   y(t) = C x(t) + D u(t),

with x(t) ∈ R^n, u(t) ∈ R^m, y(t) ∈ R^p, A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}.

Given the output y(t), t ∈ [0, T), find the initial condition x(0).

From the equation describing the dynamics of the system we have

ẋ(t) = A x(t) + B u(t)  ⟹  x(t) = e^{At} x(0) + ∫_0^t e^{A(t-τ)} B u(τ) dτ.

Substitute this expression into the output equation to get

y(t) = C e^{At} x(0) + ∫_0^t [ D δ(t-τ) + C e^{A(t-τ)} B ] u(τ) dτ
     = C e^{At} x(0) + ∫_0^t h(t-τ) u(τ) dτ,

where the first term is the zero-input response y_zi(t) and the second term is the zero-state
response y_zs(t). Let

f(t) = y_zi(t) = y(t) - y_zs(t)  ⟹  f(t) = C e^{At} x(0).

Problem. Given f(t), t ∈ [0, T), together with C and A, find the initial condition x(0).

This implies the state for all time: x(t), t ≥ 0.

95/123

Evaluating f(t) = C e^{At} x(0) at t = 0 gives f(0) = C x(0).

Evaluating the derivative at t = 0 gives ḟ(0) = C A x(0).

In general, the k-th derivative f^{(k)}(t) = C A^k e^{At} x(0), evaluated at t = 0, yields

f^{(k)}(0) = C A^k x(0),   k = 0, 1, ..., n - 1.

These relationships give the following system of linear equations to be solved for x(0):

[ f(0); f^{(1)}(0); f^{(2)}(0); ...; f^{(n-1)}(0) ] = [ C; CA; CA²; ...; CA^{n-1} ] x(0),

where the left-hand side is np × 1, the coefficient matrix is np × n, and x(0) is n × 1.

Similarly to the controllability matrix, we can define the observability matrix as

O_n(C, A) = [ C; CA; CA²; ...; CA^{n-1} ] ∈ R^{np×n}.
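
A sketch of the corresponding computation (the pair (A, C) below is placeholder data; the
Control System Toolbox command obsv(A, C) builds the same matrix):

    A = [-1 0; 0 -2];  C = [1 -2];          % placeholder data
    O = C;
    for k = 1:size(A,1)-1
        O = [O; C*A^k];                     % stack C, CA, ..., CA^(n-1)
    end
    rank(O) == size(A,1)                    % true  <=>  completely observable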

96/123

Next we define the subspace of unobservable states X_unobs as the kernel or nullspace of the
observability matrix:

X_unobs = {x : O_n(C, A) x = 0} = ker O_n(C, A),

which is a linear space. The above system of equations can be solved uniquely for x(0) if and
only if the observability matrix has full column rank, rank O_n(C, A) = n. In this case we have:

system completely observable  ⟺  X_unobs = {0}  ⟺  rank O_n(C, A) = n.

Example. Consider the same RLC circuit above. The state-space matrices are

A = [ -RL/L, 0; 0, -1/(RC C) ],   C = [ 1, -1/RC ],   D = 1/RC.

The observability matrix is then

O_2(C, A) = [ 1, -1/RC; -RL/L, 1/(RC² C) ].

97/123

It follows that complete observability is lost if the observability matrix is rank deficient, i.e. its
determinant is zero:

det O_2(C, A) = 1/(RC² C) - (RL/L)(1/RC) = (1/RC) [ 1/(RC C) - RL/L ],

det O_2(C, A) = 0  ⟺  1/(RC C) = RL/L.

In this case both complete observability and complete controllability are lost. The unobservable
space is

X_unobs = ker O_2(C, A) = span { [1; RC] }.

Check (under the condition 1/(RC C) = RL/L):

[ 1, -1/RC; -RL/L, 1/(RC² C) ] [1; RC] = [ 1 - 1; -RL/L + 1/(RC C) ] = [0; 0].

In this case the solution of the set of linear equations O_2(C, A) x(0) = F is not unique. More
precisely it is

x(0) + α [1; RC],   α ∈ R.

Therefore the initial state can only be determined up to an element of the nullspace of the
observability matrix.

98/123

Example. In the satellite example, consider the state-space matrices given by

A = [ 0,  1, 0, 0;
      3,  0, 0, 2;
      0,  0, 0, 1;
      0, -2, 0, 0 ],     C = [ 1, 0, 0, 0;
                               0, 0, 1, 0 ],

while the state and the output are:

x = [ r; ṙ; θ; θ̇ ],   y = [ r; θ ].

Since

A² = [  3,  0, 0,  2;          A³ = [  0, -1, 0,  0;
        0, -1, 0,  0;                 -3,  0, 0, -2;
        0, -2, 0,  0;                 -6,  0, 0, -4;
       -6,  0, 0, -4 ],                0,  2, 0,  0 ],

the observability matrix is

99/123

O_4(C, A) = [ C; CA; CA²; CA³ ] =

   [  1,  0, 0,  0;
      0,  0, 1,  0;
      0,  1, 0,  0;
      0,  0, 0,  1;
      3,  0, 0,  2;
      0, -2, 0,  0;
      0, -1, 0,  0;
     -6,  0, 0, -4 ]  ∈ R^{8×4},    rank O_4(C, A) = 4.

Special case (a). In case we choose to measure only the angle θ (y = θ), the matrix C2 is the
second row of the matrix C (in MATLAB notation, C2 = C(2, :)). The observability matrix is
composed of the even-numbered rows of O_4(C, A):

O_4(C2, A) = [  0,  0, 1,  0;
                0,  0, 0,  1;
                0, -2, 0,  0;
               -6,  0, 0, -4 ],    rank O_4(C2, A) = 4.

Therefore, measuring only the angle still yields an observable system. This means that from
measurements of the angle θ we can deduce the value of the remaining state variables r, ṙ, θ̇.

Special case (b). In case we choose to measure only the radius r (y = r), the matrix C1 is the
first row of the matrix C (in MATLAB notation, C1 = C(1, :)), and hence the observability
matrix is composed of the odd-numbered rows of O_4(C, A):

100/123

O_4(C1, A) = [ 1,  0, 0, 0;
               0,  1, 0, 0;
               3,  0, 0, 2;
               0, -1, 0, 0 ],    rank O_4(C1, A) = 3,

and consequently complete observability is lost. To further analyze the problem, we compute the
unobservable space by computing the kernel of the observability matrix:

[ 1, 0, 0, 0; 0, 1, 0, 0; 3, 0, 0, 2; 0, -1, 0, 0 ] [x; y; z; t]
   = [ x; y; 3x + 2t; -y ] = [0; 0; 0; 0]
⟹  x = y = t = 0,  z ∈ R
⟹  X_unobs = span { [0; 0; 1; 0] }.

Conclusion. Suppose that the initial condition is x(0) = [r0; ṙ0; θ0; θ̇0]. By observing r, we
conclude that any state of the form

x(0) + α [0; 0; 1; 0] = [ r0; ṙ0; θ0 + α; θ̇0 ]

could be the initial condition, where α
is arbitrary, i.e. it cannot be determined from the measurements.
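
A numerical confirmation of this special case (MATLAB sketch, base MATLAB only):

    A  = [0 1 0 0; 3 0 0 2; 0 0 0 1; 0 -2 0 0];
    C1 = [1 0 0 0];                          % measure the radius r only
    O = C1;  for k = 1:3, O = [O; C1*A^k]; end
    rank(O)                                  % 3 < 4: observability is lost
    null(O)                                  % basis of X_unobs: e3, i.e. the angle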

101/123

Recap: observability.

A state x ≠ 0 is unobservable if the corresponding zero-input output is y(t) = 0, t ≥ 0, so

X_unobs = ker O_n(C, A) = {x : O_n(C, A) x = 0}.

Thus if the observability matrix is full rank,

rank O_n(C, A) = n  ⟺  X_unobs = {0}  ⟺  system completely observable.

If rank O_n(C, A) = k < n, the initial state cannot be determined uniquely. Instead, the
uncertainty in determining the initial state is reduced from n to n - k dimensions.

Remark. Recall that the system of linear equations

A x = b,   A ∈ R^{n×m},  b ∈ R^n,  x ∈ R^m,

can have either (i) no solution, (ii) one solution, or (iii) infinitely many solutions.
The general solution can be expressed as x = x_homogeneous + x_particular, where
A x_homogeneous = 0. If there exists only one solution, then ker A = {0}.

102/123

Observability Gramian

Q(T) = ∫_0^T e^{A*t} C* C e^{At} dt,    Q(T) = Q*(T) ≥ 0.

The norm of the (zero-input) output y(t) = C e^{At} x(0) is

‖y‖² = ∫_0^T y*(t) y(t) dt = x*(0) Q(T) x(0) = E_{x(0)}.

Therefore, we can classify states according to their degree of observability or observation energy
E_x = x* Q x:

states in the span of the eigenvectors corresponding to large eigenvalues of Q produce
large observation energy;

states in the span of the eigenvectors corresponding to small eigenvalues of Q produce
small observation energy.

If the system is stable, i.e. ℜ(λ_i(A)) < 0, then lim_{T→∞} Q(T) = Q exists and is called the
infinite observability gramian. Similarly to the controllability gramian, Q satisfies the Lyapunov
equation

A* Q + Q A + C* C = 0.
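
Sketch of the corresponding computation (lyap from the Control System Toolbox; the pair (A, C)
is placeholder data):

    A = [-1 0; 0 -2];  C = [1 -2];          % placeholder stable pair
    Q = lyap(A', C'*C);                     % solves A'*Q + Q*A + C'*C = 0
    E_obs = @(x0) x0' * Q * x0;             % observation energy of initial state x0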

103/123

The duality principle

The degree of controllability of the pair (A, B) is the same as the degree of observability of the
pair (B*, A*). This is called the duality principle, and follows from the identity:

R_n(A, B)* = [ B, AB, ..., A^{n-1}B ]* = [ B*; B*A*; ...; B*(A^{n-1})* ] = O_n(B*, A*).

Hence

P(A, B) = Q(B*, A*).
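
A quick numerical check of the duality principle (sketch; ss and gram are Control System
Toolbox commands, and (A, B) is placeholder data):

    A = [-1 0; 0 -2];  B = [1; 2];
    P  = gram(ss(A,  B,      eye(2), 0), 'c');   % controllability gramian of (A, B)
    Qd = gram(ss(A', eye(2), B',     0), 'o');   % observability gramian of (B', A')
    norm(P - Qd)                                 % ~ 0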

104/123

Example. Consider the RLC circuit discussed earlier, with the following values of the
parameters: RL = 1, RC = 1/2, L = 1, C = 1. The state-space matrices are

A = [-1, 0; 0, -2],   B = [1; 2],   C = [1, -2],   D = 2.

The observability matrix is

O_2(C, A) = [1, -2; -1, 4],    det O_2 = 4 - 2 = 2 ≠ 0   ⟹   ker O_2 = {0}.

The system is completely observable. Since the eigenvalues of A are in the left-half plane
(λ1 = -1, λ2 = -2), the system is stable, so the infinite observability gramian exists and
satisfies A*Q + QA + C*C = 0. Since we know that it is a symmetric matrix, we can write it in
the general form Q = [q, r; r, p], where q, r, p ∈ R are to be determined:

[-1, 0; 0, -2][q, r; r, p] + [q, r; r, p][-1, 0; 0, -2] + [1; -2][1, -2]
   = [ -2q + 1, -3r - 2; -3r - 2, -4p + 4 ] = [0, 0; 0, 0]

⟹  2q = 1,  3r = -2,  4p = 4   ⟹   q = 1/2,  r = -2/3,  p = 1,

Q = [ 1/2, -2/3; -2/3, 1 ].

105/123

The energy associated to observing the state x = [1; 0] is [1, 0] Q [1; 0] = 1/2;
for x = [0; 1] it is 1; for x = [1; 1] it is 1/6, and the corresponding normalized energy
x*Qx / (x*x) is 1/12.

The observation energies of each of the above states can be compared to the corresponding
controllability energies:

P = [ 1/2, 2/3; 2/3, 1 ]   ⟹   P^{-1} = [ 18, -12; -12, 9 ].

The energy for reaching the states x1 = [1; 0], x2 = [0; 1], x3 = [1; 1] is 18, 9, and
3, respectively. We conclude that, out of the three states, x3 is the easiest to reach but the
most difficult to observe. However, x2 is the most observable, while x1 is the most difficult
to reach.
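
The four energies of this example can be reproduced as follows (MATLAB sketch):

    A = [-1 0; 0 -2];  B = [1; 2];  C = [1 -2];
    P = lyap(A,  B*B');                     % [1/2  2/3; 2/3  1]
    Q = lyap(A', C'*C);                     % [1/2 -2/3; -2/3 1]
    x3 = [1; 1];
    x3' * (P \ x3)                          % reach energy of x3:        3
    x3' * Q * x3                            % observation energy of x3:  1/6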

106/123

Review of Controllability and Observability

Controllability matrix:
R_n(A, B) = [ B, AB, ..., A^{n-1}B ] ∈ R^{n×nm}.

Observability matrix:
O_n(C, A) = [ C; CA; CA²; ...; CA^{n-1} ] ∈ R^{pn×n}.

Controllability gramian:
P(T) = ∫_0^T e^{At} B B* e^{A*t} dt ∈ R^{n×n}   (P = P* ≥ 0).

Observability gramian:
Q(T) = ∫_0^T e^{A*t} C* C e^{At} dt ∈ R^{n×n}   (Q = Q* ≥ 0).

Controllable space:
X_contr = span col R_n(A, B) = span col P(T).
Controllability energy: E = x* P(T)^{-1} x.

Unobservable space:
X_unobs = ker O_n(C, A) = ker Q(T).
Observability energy: E = x* Q(T) x.

Infinite controllability gramian. If ℜ(λ_i(A)) < 0, P = P(∞) exists and satisfies the
Lyapunov equation:
A P + P A* + B B* = 0.

Infinite observability gramian. If ℜ(λ_i(A)) < 0, Q = Q(∞) exists and satisfies the
Lyapunov equation:
A* Q + Q A + C* C = 0.

107/123

Eliminating a state. Example. Consider the system

ẋ1 = x1 - 3x2 + u,
ẋ2 = x1 - 2x2,
y  = x2,

i.e.  A = [1, -3; 1, -2],  B = [1; 0],  C = [0, 1].

To determine stability we compute the roots of det(sI - A) = s² + s + 1 = 0  ⟹
s_{1,2} = (-1 ± i√3)/2. The system is stable since the poles are in the LHP. It is also completely
controllable because the controllability matrix

R_2(A, B) = [1, 1; 0, 1]  is full rank.

Eliminate x1:   ẋ2 = -2 x2 + 0·u,  y = 1·x2    (A1 = -2, B1 = 0, C1 = 1).
This reduced system is stable but not controllable.

Eliminate x2:   ẋ1 = 1·x1 + 1·u,  y = 0·x1    (A2 = 1, B2 = 1, C2 = 0).
This reduced system is not stable but controllable.

Therefore, the question is:

How do we eliminate the states in general?

108/123

Basis change in the state space. Given the state space equations

ẋ = A x + B u,   y = C x,

we perform a basis change in the state space:

x̃ = T x,   T ∈ R^{n×n},  det(T) ≠ 0.

After multiplying the first equation by T on the left and inserting T^{-1}T before each x, the new
equations are

T ẋ = T A T^{-1} T x + T B u   ⟹   ẋ̃ = (T A T^{-1}) x̃ + (T B) u,
y = C T^{-1} T x = (C T^{-1}) x̃.

Therefore, the new state-space matrices are

Ã = T A T^{-1},   B̃ = T B,   C̃ = C T^{-1}.

109/123

The transfer functions of the two systems are the same, so the basis change does not affect the
input/output description of the system:

H̃(s) = C̃ (sI - Ã)^{-1} B̃
     = C T^{-1} (sI - T A T^{-1})^{-1} T B
     = C T^{-1} (s T T^{-1} - T A T^{-1})^{-1} T B
     = C T^{-1} [ T (sI - A) T^{-1} ]^{-1} T B
     = C T^{-1} T (sI - A)^{-1} T^{-1} T B
     = C (sI - A)^{-1} B = H(s).

Similarly, the impulse response does not depend on the transformation T:

h̃(t) = C̃ e^{Ãt} B̃ = C e^{At} B = h(t).

The poles of the system, given by the eigenvalues of A or the roots of the denominator q(s) of
the transfer function, also remain invariant. This holds because the matrices A and Ã are
similar, and hence their eigenvalues are the same.

110/123

State transformations

State transformations provide freedom in model reduction. Given a new state x̃ = T x,
det T ≠ 0, the state-space matrices and the gramians are transformed as follows:

Ã = T A T^{-1},   B̃ = T B,   C̃ = C T^{-1},   P̃ = T P T*,   Q̃ = T^{-*} Q T^{-1}.

Consequences. (a) The eigenvalues of A are invariant; thus the poles of the system (or the
characteristic frequencies) remain unchanged.

(b) The transfer function and the impulse response remain unchanged:

H̃(s) = C̃ (sI - Ã)^{-1} B̃ = C (sI - A)^{-1} B = H(s),
h̃(t) = C̃ e^{Ãt} B̃ = C e^{At} B = h(t).

(c) Since the gramians are transformed by congruence, their eigenvalues are not preserved.
However, we notice that the eigenvalues of the product of the gramians PQ are invariant.
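
Sketch of a numerical check of (c), reusing the gramians of the RLC example above and an
arbitrary invertible T:

    A = [-1 0; 0 -2];  B = [1; 2];  C = [1 -2];
    P = lyap(A, B*B');  Q = lyap(A', C'*C);
    T  = [1 1; 1 -1];
    Pt = T*P*T';                             % congruence
    Qt = (T') \ Q / T;                       % inv(T')*Q*inv(T)
    norm(sort(eig(Pt*Qt)) - sort(eig(P*Q)))  % ~ 0: eig(P*Q) is invariant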

111/123

Example. Consider a basis change of the system discussed earlier, by defining the new state
variables as:

x̃1 = x1 + x2,   x̃2 = x1 - x2.

The transformation is T = [1, 1; 1, -1].

The state-space matrices in the new basis are

Ã = T A T^{-1} = (1/2)[-3, 7; -1, 1],   B̃ = T B = [1; 1],   C̃ = C T^{-1} = (1/2)[1, -1].

The differential equations which govern the system are

ẋ̃1 = -(3/2) x̃1 + (7/2) x̃2 + u,
ẋ̃2 = -(1/2) x̃1 + (1/2) x̃2 + u,
y   =  (1/2) x̃1 - (1/2) x̃2.

112/123

Example. Recall the previous example:

ẋ1 = x1 - 3x2 + u,   ẋ2 = x1 - 2x2,   y = x2.

The infinite gramian P and its eigenvalue decomposition P = V Λ V* are:

P = [ 5/2, 1; 1, 1/2 ]
  = [ 0.92, -0.38; 0.38, 0.92 ] [ 2.91, 0; 0, 0.08 ] [ 0.92, -0.38; 0.38, 0.92 ]*.

The transformed system (x̃ = T x with T = [1, 1; 1, -1] as above) has the controllability gramian

P̃ = [ 5, 2; 2, 1 ]
   = [ 0.92, -0.38; 0.38, 0.92 ] [ 5.83, 0; 0, 0.17 ] [ 0.92, -0.38; 0.38, 0.92 ]*.

Notice that P̃ = T P T*.

Conclusion: the ellipse corresponding to P̃ is rotated and stretched with respect to the ellipse
corresponding to P.

113/123

[Figure: ellipses of the original and balanced system.]

114/123

Eliminate the least controllable states

Suppose the controllability gramian has been diagonalized: P̃ = D_P. The transformation which
achieves this is T = V*, where P = V D_P V*.

In this new basis:

x̃ = (1, 0, ..., 0)* = e1 is the state which is the easiest to control, and the required
energy is 1/λ1;

x̃ = (0, ..., 0, 1)* = en is the most difficult state to reach, with energy 1/λn.

Therefore, we partition the state-space matrices conformally to the partitioning of the
controllability gramian:

D_P = diag( λ1, ..., λk | λk+1, ..., λn ),

Ã = [ Ã11, Ã12; Ã21, Ã22 ],   B̃ = [ B̃1; B̃2 ],   C̃ = [ C̃1, C̃2 ].

115/123

Thus the order-k reduced system results: ( Ã11, B̃1, C̃1 ).

Eliminate the least observable states

Now diagonalize the observability gramian: Q̃ = D_Q. The corresponding transformation is
T = W*, where Q = W D_Q W*. Afterwards, we proceed as in the controllability case.

Proof:

Q̃ = T^{-*} Q T^{-1} = W* Q W = W* ( W D_Q W* ) W = D_Q,   for T = W*.

116/123

Remarks

Notice that there is a transformation which transforms P̃ to I (namely T = D_P^{-1/2} V*). After
this transformation, all states are equally controllable, therefore no reduction is possible based
on the controllability criterion.

Similarly for observability: there exists a transformation such that Q̃ = I.

Solution to the puzzle: consider PQ. This product is transformed by similarity:

P̃ Q̃ = ( T P T* )( T^{-*} Q T^{-1} ) = T [ P Q ] T^{-1}.

Consequently, the eigenvalues of the product PQ are invariant under state-space
transformations. This motivates the search for a transformation T which simultaneously
diagonalizes P and Q. In this basis, the system will be in balanced representation: each
state ei, i = 1, ..., n, is equally controllable and observable.

117/123

Hankel Singular Values

σ_i = √( λ_i(PQ) ),    σ1 ≥ ... ≥ σn ≥ 0.

Like in the case of matrices, they provide a trade-off between accuracy and complexity.

Balanced Representation.

ẋ̃ = Ã x̃ + B̃ u,   y = C̃ x̃.

This representation is called balanced if P̃ = Q̃. Hence the states which are most difficult to
reach are also the most difficult to observe.

Given ẋ = A x + B u, y = C x, we seek T such that

T P T* = T^{-*} Q T^{-1} = Σ = diag( σ1, ..., σn ).

Remark. In MATLAB, the command balreal gives the balanced representation of a given
triple (A, B, C).
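
One concrete way to construct such a T is the square-root method; the sketch below assumes A
stable and (A, B, C) minimal (so that P, Q > 0). balreal implements a numerically more careful
variant of the same idea. The data are the example system above.

    A = [1 -3; 1 -2];  B = [1; 0];  C = [0 1];
    P = lyap(A, B*B');  Q = lyap(A', C'*C);
    R = chol(P, 'lower');                    % P = R*R'
    [U, S2, ~] = svd(R'*Q*R);                % R'*Q*R = U*diag(sigma.^2)*U'
    sigma = sqrt(diag(S2));                  % Hankel singular values
    T  = diag(sigma.^0.5)  * (U'/R);         % balancing transformation
    Ti = R * U * diag(sigma.^-0.5);          % T^{-1}
    norm(T*P*T'  - diag(sigma))              % ~ 0
    norm(Ti'*Q*Ti - diag(sigma))             % ~ 0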

118/123

Reduction by balanced truncation

Let T be a balancing transformation:

A_bal = T A T^{-1} = [ A11, A12; A21, A22 ],   B_bal = T B = [ B1; B2 ],
C_bal = C T^{-1} = [ C1, C2 ],

where P_bal = T P T* = Q_bal = T^{-*} Q T^{-1} = Σ, and

Σ = [ Σ1, 0; 0, Σ2 ],   Σ1 = diag( σ1, ..., σk ),   Σ2 = diag( σk+1, ..., σn ).

We will denote H(s) = C (sI_n - A)^{-1} B and Ĥ(s) = C1 (sI_k - A11)^{-1} B1.

Properties: The reduced system (A11, B1, C1), with A11 ∈ R^{k×k}, B1 ∈ R^{k×m},
C1 ∈ R^{p×k}, is stable:

ℜ( λ_i(A11) ) < 0.

There exists an error bound for the H∞-norm of the transfer function of the error system:

‖ H(jω) - Ĥ(jω) ‖_∞ ≤ 2 ( σk+1 + ... + σn ).

Remark. This is the only reduction method which (i) leads to a guaranteed stable reduced
system, and (ii) satisfies an a priori computable error bound.
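
Sketch of balanced truncation with the error bound, using Control System Toolbox commands
(the system data are the example above):

    A = [1 -3; 1 -2];  B = [1; 0];  C = [0 1];
    sys = ss(A, B, C, 0);
    [sysb, hsv] = balreal(sys);              % balanced realization + Hankel SVs
    k = 1;                                   % reduced order
    sysr = modred(sysb, k+1:length(hsv), 'Truncate');   % keep (A11, B1, C1)
    norm(sys - sysr, Inf)                    % H-infinity norm of the error system
    2*sum(hsv(k+1:end))                      % a priori bound 2*(s_{k+1}+...+s_n)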

119/123

Model Reduction by Projection. Given a dynamical system:

ẋ(t) = A x(t) + B u(t),   y(t) = C x(t) + D u(t),         (8)

with x(t) ∈ R^n, u(t) ∈ R^m, y(t) ∈ R^p, perform a change of basis x = V x̃, where
V ∈ R^{n×k} and x̃(t) ∈ R^k. This leads to

V ẋ̃ = A V x̃ + B u,   y = C V x̃ + D u.                     (9)

Let W ∈ R^{n×k} be such that W* V = I_k. Multiplying the first equation by W* gives

W* V ẋ̃ = ẋ̃ = W* A V x̃ + W* B u,   y = C V x̃ + D u.

The reduced system is

ẋ̃ = Ã x̃ + B̃ u,   y = C̃ x̃ + D̃ u,                          (10)

120/123

where

Ã = W* A V,   B̃ = W* B,   C̃ = C V,   D̃ = D.

If we denote the product V W* by Π, we can check that Π² = V (W* V) W* = V W* = Π, i.e.,
Π is a projection.

An arbitrary W such that W* V = I is an oblique projection onto the span of the columns of V
along the kernel of W*. Example: n = 2, V = [1; 1] ∈ R²:

the choice W = [1; 0] gives Π = V W* = [1, 0; 1, 0], the projection onto span{V} along the
line x1 = 0; check: Π² = Π and Π e2 = 0 for e2 = [0; 1];

the choice W = (1/2)[1; 1] gives Π = (1/2)[1, 1; 1, 1], the orthogonal projection onto the
line x1 = x2.
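
The two projections of this example, as a MATLAB sketch:

    V = [1; 1];
    W = [1; 0];    Pi = V*W';               % [1 0; 1 0]: oblique, along x1 = 0
    norm(Pi^2 - Pi)                          % 0: Pi is a projector
    W = [1; 1]/2;  Pi = V*W';               % [1 1; 1 1]/2: orthogonal, onto x1 = x2
    % reduced matrices, for any compatible (A, B, C, D):
    % Ar = W'*A*V;  Br = W'*B;  Cr = C*V;  Dr = D;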

121/123

If W = V (assuming the columns of V are orthonormal, V*V = I_k), Π = V V* is an orthogonal
projection: the error z - Πz is orthogonal to the projection Πz, since Π* = Π and Π² = Π:

⟨Πz, z - Πz⟩ = z* Π* (z - Πz) = z* Π z - z* Π² z = 0.

[Figure: orthogonal projection of z onto span col V; the error z - Πz is perpendicular to the
subspace.]

122/123

Modal approximation

Performing the eigenvalue decomposition of A gives A = V Λ V^{-1},
where Λ = diag[ λ1, λ2, ..., λn ] with 0 ≥ ℜ λ1 ≥ ℜ λ2 ≥ ... ≥ ℜ λn. Modal
approximation consists in retaining the k dominant poles:

Ã = V^{-1} A V = Λ,   B̃ = V^{-1} B,   C̃ = C V.

From the last system, perform truncation in order to obtain the k-th order approximant.
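
Sketch of modal truncation (placeholder data with real poles; for complex eigenvalues the
conjugate pairs should be kept together, which this sketch does not enforce):

    A = diag([-1, -2, -10]);  B = [1; 1; 1];  C = [1 1 1];
    [V, D] = eig(A);
    [~, idx] = sort(real(diag(D)), 'descend');   % 0 >= Re l_1 >= ... >= Re l_n
    V = V(:, idx);
    At = V \ A * V;   Bt = V \ B;   Ct = C * V;  % modal coordinates: At = Lambda
    k = 2;                                       % retain the 2 slowest poles
    Ar = At(1:k, 1:k);  Br = Bt(1:k, :);  Cr = Ct(:, 1:k);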

123/123
