Bruce Francis
1 Introduction 3
1.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 What We Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Theorems and Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Linear Algebra 9
2.1 Brief Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 The Jordan Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 The Transition Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.6 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.7 Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.8 Matrix Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.9 Invariant Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.10 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3 Calculus 36
3.1 Jacobians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2 Optimization over an Open Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Optimizing a Quadratic Function with Equality Constraints . . . . . . . . . . . . . . 41
3.4 Optimization with Equality Constraints . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5 Application: Sensor Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
I Classical Theories 57
4 Calculus of Variations 58
4.1 The Brachistochrone Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.2 The General Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3 The Euler-Lagrange Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6 Dynamic Programming 72
6.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.2 The Hamilton-Jacobi-Bellman Equation . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
8 H2 Optimal Control 97
8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.2 Lyapunov Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.3 Spectral Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8.4 Riccati Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
8.5 The LQR Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
8.6 Solution of the H2 Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Chapter 1
Introduction
1.1 History
Optimal control is the subject within control theory where the control signals, or the controllers
that generate them, optimize (minimize or maximize) some performance criterion. Let’s look at
some key developments, the first few being optimization problems that came before optimal control
but influenced its development.
1. The isoperimetric problem is to find a closed curve of fixed length and maximum enclosed
area: The solution is of course the circle. The study of this problem goes back to the ancient
Greeks. The problem has the form
max_u J(u) such that P (u) = l,
where u is a closed curve, J(u) denotes the enclosed area, P (u) denotes the perimeter of u,
and l is the given length. Thus J and P are functions that map curves to R.
2. The brachistochrone problem, formulated by Bernoulli in 1696, is to find the shape of
the curve down which a bead sliding from rest and accelerated by gravity will slip (without
friction) from one point to another in the least time. This has the form
min_u J(u),
where u denotes the curve and J(u) is the time it takes for the bead to slide from the start
point to the end point.
3. In the 1940s Wiener developed and solved an optimal filtering problem. A signal s(t) is
corrupted by additive noise n(t) to produce the measured signal y(t) = s(t) + n(t). In the
simplest formulation the signals are zero-mean, stationary random processes and the goal is
to find a filter with input y(t) and output ŝ(t), an estimate of s(t), such that the variance
of the error ŝ(t) − s(t) is minimized. There is an equivalent deterministic problem known as
H2 -filtering.
4. Also in the 1940s, R. S. Phillips and colleagues extended Wiener’s filtering problem to a control
problem. The variance of a tracking error was the object to be minimized. This work initiated
the development of H2 -optimal control, the linear-quadratic regulator (LQR) problem being
a special case.
5. The term dynamic programming was originally used in the 1940s by Richard Bellman
to describe a process of solving sequential decision problems; a typical problem is to find a
minimum-cost path through a graph. Bellman wrote an influential book, Dynamic Program-
ming, published in 1957. The procedure is widely used, for example in Viterbi decoding.
6. In 1962 the very influential book The Mathematical Theory of Optimal Processes, by L. S.
Pontryagin et al., appeared. The approach to constrained optimization problems is called the
maximum principle.
7. In the late 1970s Zames posed the question, are classical frequency-domain feedback design
methods (e.g., lead/lag compensation) optimal for some appropriate criterion? From this
came H∞ -optimal control.
This set (a subspace) equals the span of the columns of A. So the problem is to find the vector
in V that is closest to b. A more interesting problem, the Nehari problem, is to find the stable
LTI system that is closest to a given unstable LTI system.
so that there can be no doubt about what they mean. The following (real) email correspondence
between a physicist and me emphasizes this point.
Me: On skimming through [your paper], I don’t see any formal theorem statements, with proofs.
So may I ask, are there mathematical results? That would help an outsider like me (of course, you
didn’t write the paper for outsiders).
Physicist: The paper is not written in the formal language of “theorems” and “proofs.” Certainly,
though, there are mathematical results reported that describe the physical properties of complex
modes. ... In general, though, in the physics literature papers are not written as mathematical
papers in the form of theorems, lemmas, and proofs.
Me: I wonder how physicists check each other’s work without explicit assumptions, mathematical
statements, and proofs. For example, your paper talks about “casual suggestions in the literature.”
Had they instead been rigorous statements, one could have checked them to be true or not. ...Your
model in the appendices involves a limit (homogeneous limit), so I guess something converges and
can be proved to converge in a precise way. Or is it that one doesn’t actually prove convergence
but instead verifies by experiment?
Physicist: This is a philosophical question really; for example one cannot prove Maxwell’s equations
mathematically. Mathematics is a tool, it cannot describe physical reality. So in the narrow area
of Electromagnetics, whether something is true or not boils down to whether it satisfies Maxwell’s
equations.
1.4 Problems
1. Give an example of a subset of R2 that is the graph of a function f : R −→ R. Give an
example of a subset of R2 that is not the graph of any function f : R −→ R.
2. Let U, V be two sets and let S be a subset of U × V . Write in logic notation the condition for
S to be the graph of a function from U to V .
In words, for every ε > 0 there exists δ > 0 such that if the state starts in the δ-ball, it will
remain forever in the ε-ball. Write in logic notation the definition that the origin is not stable.
Say this in words. Using your definition, prove that for ẋ = x the origin is not stable.
5. Consider the linear equation Ax = b, where A is a matrix, not assumed to be square, and
b, x are vectors. A necessary and sufficient condition for the equation to be solvable (for x to
exist) is
rank [A b] = rank A.
That is (necessity)
(∃x)Ax = b =⇒ rank [A b] = rank A
and (sufficiency)
rank [A b] = rank A =⇒ (∃x)Ax = b.
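The rank test in this problem is easy to try numerically. A minimal sketch in Python, assuming NumPy; the helper name `solvable` is my own:

```python
import numpy as np

def solvable(A, b):
    # Ax = b has a solution iff rank [A b] = rank A
    Ab = np.column_stack([A, b])
    return np.linalg.matrix_rank(Ab) == np.linalg.matrix_rank(A)

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                    # rank 1
assert solvable(A, np.array([1.0, 2.0]))      # b lies in the column span
assert not solvable(A, np.array([1.0, 0.0]))  # b does not
```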
6. Write a logic statement that there is a unique solution to the equation Ax = b, i.e., there
exists a solution and it is unique.
7. Consider the polynomial p(s) = s3 + a2 s2 + a1 s + a0 , with real coefficients. Consider the two
conditions: 1) The roots of p(s) have negative real parts; 2) The coefficients ai are all positive.
Which condition is necessary for the other? Is it sufficient?
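One way to experiment with this problem numerically (the helper `hurwitz` and the two sample cubics are my own choices):

```python
import numpy as np

def hurwitz(coeffs):
    # True iff every root of the polynomial has negative real part
    return bool(np.all(np.real(np.roots(coeffs)) < 0))

assert hurwitz([1, 6, 11, 6])     # (s+1)(s+2)(s+3): roots in LHP, all a_i > 0
assert not hurwitz([1, 1, 1, 3])  # all a_i > 0, yet a complex pair lies in the RHP
```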
8. The set of integers, 0, ±1, ±2, ±3, . . . , is denoted Z. One way to say an integer is even is that
it is a multiple of 2. Thus, if x is an integer, the following says it is even:
(∃y ∈ Z) x = 2y.
Write the following statements in logic notation, using only the set Z:
9. Prove rigorously that, if m and n are coprime integers (their greatest common divisor equals
1), then every integer that is a multiple of m and a multiple of n is also a multiple of mn.
10. If a single-input, single-output plant has a transfer function with a right half-plane zero, there’s
a maximum achievable gain margin. Make this into a theorem statement.
Alice can pass ECE1655 and either she’s not brilliant or she doesn’t work hard.
(g) The statement
Alice can pass ECE1655 only if she’s brilliant and she works hard
is equivalent to
if either Alice is not brilliant or she doesn’t work hard, then she can’t pass
ECE1655.
(h) A necessary condition for the origin of ẋ = Ax to be stable is that all eigenvalues of A
satisfy Re λ ≤ 0.
(i) A sufficient condition for the origin of ẋ = Ax to be stable is that all eigenvalues of A
satisfy Re λ ≤ 0.
(j) A necessary condition for the origin of ẋ = Ax to be asymptotically stable is that all
eigenvalues of A satisfy Re λ < 0.
(k) A sufficient condition for the origin of ẋ = Ax to be asymptotically stable is that all
eigenvalues of A satisfy Re λ < 0.
(l) Every bounded input to the system with transfer function (s − 1)/(s2 − 1) results in a bounded output.
(m) Every nonzero bounded input to the system with transfer function 1/(s − 1) results in an unbounded output.
12. Many results in optimal control are in the form of providing a necessary condition for optimality, for example the maximum principle and the Hamilton-Jacobi-Bellman equation. This problem urges you to understand what necessary condition means.
(∃x)P (x)
is true means there exists an optimal solution. Discuss the meanings of the following
statements. Is any the right one, i.e., the one we want in a theorem statement about
optimality?
(b) Now consider the function f (u) = −u2 . Suppose P (x) means that f is maximized at the
point x and Q(x) means that the derivative of f equals zero at x. Which if any of the
three statements is true?
(c) Repeat for f (u) = −u3 .
13. The physicist’s final email is contradictory: Mathematics can’t describe physical reality, yet
Maxwell’s equations, which are mathematics, do. I believe physicists sometimes forget that
all of physics is based on models. The models of electromagnetics—the concepts of electric
and magnetic fields, charged particles and the forces on them, the differential equations and
boundary conditions, and so on—are abstract concepts and relationships that we use to help
us understand how we perceive phenomena. In fact all these concepts are mathematical: A
force is a vector-valued function of space and time, etc. We have used these models to build
technology, and in that sense the models have been wildly successful. But one, and especially
a control engineer, must not forget that models are not real.
Imagine a real battery connected to a small real DC motor sitting on some real lab bench
somewhere on the planet Earth. Write a brief essay that distinguishes between a model and
reality. In particular, answer this question: Does there exist a sequence of models, of ever
increasing complexity, that converges to reality?
Chapter 2
Linear Algebra
This is a background chapter on linear algebra: subspaces, matrix representations, linear matrix equations, and invariant subspaces. The material is from ECE557. If you took ECE410, then some of this material will be new.
To check whether a set of vectors {v1 , . . . , vk } is linearly independent, set up the equation
c1 v1 + · · · + ck vk = 0
and then try to solve for the ci ’s. The set is linearly independent iff the only solution is ci = 0 for every i.
The span of {v1 , . . . , vk }, denoted Span{v1 , . . . , vk }, is the set of all linear combinations of these
vectors.
A subspace V of Rn is a subset of Rn that is also a vector space in its own right. This is true
iff these two conditions hold: If x, y are in V, then so is x + y; if x is in V and c is a scalar, then
cx is in V. Thus V is closed under the operations of addition and scalar multiplication. In R3 the
subspaces are the lines through the origin, the planes through the origin, the whole of R3 , and the
set consisting of only the zero vector.
A basis for a subspace is a set of linearly independent vectors whose span equals the subspace.
The number of elements in a basis is the dimension of the subspace.
The rank of a matrix is the dimension of the span of its columns. This can be proved to equal
the dimension of the span of its rows.
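This fact is easy to spot-check numerically; the sample matrix is my own:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 6],   # twice the first row
              [1, 0, 1]])
# column rank (rank of A) equals row rank (rank of A transpose)
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T) == 2
```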
The equation Ax = b has a solution iff b belongs to the span of the columns of A, equivalently
rank A = rank [A b].
When a solution exists, it is unique iff the columns of A are linearly independent, that is, the rank
of A equals its number of columns.
The inverse of a square matrix A is a matrix B such that BA = I. If this is true, then AB = I.
The inverse is unique and we write A−1 . A square matrix A is invertible iff its rank equals its
dimension (we say “A has full rank”); equivalently, its determinant is nonzero. The inverse equals
the adjoint divided by the determinant.
Let A be a real n × n matrix and consider the equation
Av = λv.
Here λ is a real or complex number and v is a nonzero real or complex vector; λ is an eigenvalue
and v a corresponding eigenvector. The eigenvalues of A are unique but the eigenvectors are not:
If v is an eigenvector, so is cv for any real number c ≠ 0. The spectrum of A, denoted σ(A), is its
set of eigenvalues. The spectrum consists of n numbers, in general complex, and they are equal to
the zeros of the characteristic polynomial det(sI − A).
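These definitions can be checked with NumPy's eigensolver; a sketch using the 2 × 2 matrix that appears in the worked examples later in this chapter:

```python
import numpy as np

A = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
evals, evecs = np.linalg.eig(A)
# spectrum = zeros of det(sI - A) = s^2 + 3s = s(s + 3)
assert np.allclose(sorted(evals.real), [-3.0, 0.0])
for i in range(2):
    # each column of evecs satisfies A v = lambda v
    assert np.allclose(A @ evecs[:, i], evals[i] * evecs[:, i])
```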
[Figure: two masses M1 and M2 with positions x1 , x2 , coupled by a damper D.]
Take D = 1, M1 = 1, M2 = 1/2, x3 = ẋ1 , x4 = ẋ2 . You can derive that the model is ẋ = Ax, where

A = [ 0   0   1   0 ]
    [ 0   0   0   1 ]
    [ 0   0  −1   1 ]
    [ 0   0   2  −2 ] .
The equation Av = λv says that the action of A on an eigenvector is very simple—just multi-
plication by the eigenvalue. Likewise, the motion of x(t) starting at an eigenvector is very simple.
Lemma 2.1 If x(0) is an eigenvector v of A and λ the corresponding eigenvalue, then x(t) = eλt v.
Thus x(t) is an eigenvector too for every t.
Proof The initial-value problem
ẋ = Ax, x(0) = v
has a unique solution—this is from differential equation theory. So all we have to do is show that eλt v satisfies both the initial condition and the differential equation, for then eλt v must be the solution x(t). The initial condition is easy:
eλt v evaluated at t = 0 equals e0 v = v.
As for the differential equation,
d/dt (eλt v) = λeλt v = eλt (λv) = eλt Av = A(eλt v).
The result of the lemma extends to more than one eigenvalue. Let λ1 , . . . , λn be the eigenvalues
of A and let v1 , . . . , vn be corresponding eigenvectors. Suppose the initial state x(0) can be written
as a linear combination of the eigenvectors:
x(0) = c1 v1 + · · · + cn vn .
This is certainly possible for every x(0) if the eigenvectors are linearly independent. Then the solution satisfies
x(t) = c1 eλ1 t v1 + · · · + cn eλn t vn .
Example
A = [ −1   1 ]
    [  2  −2 ] ,   λ1 = 0, λ2 = −3, v1 = (1, 1), v2 = (−1, 2)
x(0) = c1 v1 + c2 v2
is equivalent to
x(0) = V c,
where V is the 2 × 2 matrix with columns v1 , v2 and c is the vector (c1 , c2 ). Solving gives c1 = c2 = 1/3. So
x(t) = (1/3) v1 + (1/3) e−3t v2 .
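The closed-form solution in this example can be checked against the matrix exponential. A sketch assuming SciPy's `expm`:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 1.0], [2.0, -2.0]])
v1 = np.array([1.0, 1.0])    # eigenvector for lambda1 = 0
v2 = np.array([-1.0, 2.0])   # eigenvector for lambda2 = -3
x0 = (v1 + v2) / 3           # the initial state with c1 = c2 = 1/3
for t in [0.0, 0.5, 2.0]:
    x_formula = v1 / 3 + np.exp(-3.0 * t) * v2 / 3
    assert np.allclose(expm(A * t) @ x0, x_formula)
```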
The case of complex eigenvalues is only a little complicated. If λ1 is a complex eigenvalue, some
other, say λ2 , is its complex conjugate: λ2 = λ1 . The two eigenvectors, v1 and v2 , can be taken to
be complex conjugates too (easy proof). Then if x(0) is real and we solve
x(0) = c1 v1 + c2 v2 ,
we’ll find that c1 , c2 are complex conjugates as well. Thus the equation will look like
x(t) = c1 eλ1 t v1 + c̄1 eλ̄1 t v̄1 ,
which is real, being the sum of a complex quantity and its conjugate.
Example
A = [ 0  −1 ]
    [ 1   0 ] ,   λ1 = j, λ2 = −j, v1 = (1, −j), v2 = (1, j)
Av1 = λ1 v1 , Av2 = λ2 v2 .
These two equations combine into the single matrix equation AV = V AJF , where V is the matrix with columns v1 , v2 and AJF is the diagonal matrix with λ1 , λ2 on the diagonal. The latter matrix is the Jordan form of A. It is unique up to reordering of the eigenvalues. The mapping A ↦ AJF = V −1 AV is called a similarity transformation. Example:
A = [ −1   1 ]
    [  2  −2 ] ,   V = [ 1  −1 ]
                       [ 1   2 ] ,   AJF = [ 0   0 ]
                                           [ 0  −3 ] .
Corresponding to the eigenvalue λ1 = 0 is the eigenvector v1 = (1, 1), the first column of V . All
other eigenvectors corresponding to λ1 have the form cv1 , c 6= 0. We call the subspace spanned by
v1 the eigenspace corresponding to λ1 . Likewise, λ2 = −3 has a one-dimensional eigenspace.
These results extend from n = 2 to general n. Note that in the preceding result we didn’t
actually need distinctness of the eigenvalues — only linear independence of the eigenvectors.
Theorem 2.1 The Jordan form of A is diagonal, i.e., A is diagonalizable by similarity transforma-
tion, iff A has n linearly independent eigenvectors. A sufficient condition is n distinct eigenvalues.
The great thing about diagonalization is that the equation ẋ = Ax can be transformed via
w = V −1 x into ẇ = AJF w, that is, n decoupled equations:
ẇi = λi wi , i = 1, . . . , n.
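The decoupling can be demonstrated numerically for the running 2 × 2 example; V's columns are its eigenvectors, and SciPy's `expm` supplies the exact solution for comparison:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 1.0], [2.0, -2.0]])
V = np.array([[1.0, -1.0], [1.0, 2.0]])   # columns are eigenvectors
AJF = np.linalg.inv(V) @ A @ V
assert np.allclose(AJF, np.diag([0.0, -3.0]))

# w = V^{-1} x obeys n decoupled scalar equations w_i' = lambda_i w_i
x0 = np.array([0.0, 1.0])
t = 1.3
w_t = np.exp(np.diag(AJF) * t) * np.linalg.solve(V, x0)
assert np.allclose(V @ w_t, expm(A * t) @ x0)
```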
Now we look at how to construct the Jordan form when there are not n linearly independent eigenvectors. We start with the case where A has only 0 as an eigenvalue.
Nilpotent matrices
Consider
[ 0  1  0 ]      [ 0  1  0 ]
[ 0  0  0 ] ,    [ 0  0  1 ] .      (2.1)
[ 0  0  0 ]      [ 0  0  0 ]
For both of these matrices, σ(A) = {0, 0, 0}. For the first matrix, the eigenspace Ker A is two-
dimensional and for the second matrix, one-dimensional. These are examples of nilpotent matrices:
A is nilpotent if Ak = 0 for some k ≥ 1. The following statements are equivalent:
1. A is nilpotent.
2. Every eigenvalue of A equals 0, i.e., σ(A) = {0, . . . , 0}.
3. An = 0, n being the dimension of A.
4. It is similar to a matrix of the form (2.1), where all elements are 0’s, except 0’s or 1’s on the first diagonal above the main one. This is called the Jordan form of the nilpotent matrix.
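A direct numerical test uses the standard fact that an n × n matrix is nilpotent iff An = 0; applied here to the two matrices in (2.1), with a helper name of my own:

```python
import numpy as np

def is_nilpotent(A):
    # standard fact: an n x n matrix A is nilpotent iff A^n = 0
    n = A.shape[0]
    return np.allclose(np.linalg.matrix_power(A, n), 0)

N1 = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]])   # first matrix in (2.1)
N2 = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])   # second matrix in (2.1)
assert is_nilpotent(N1) and is_nilpotent(N2)
assert not is_nilpotent(np.eye(3))
```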
The rank of A is 3 and hence the kernel has dimension 2. We can compute that

A2 = [ 0  0  0  1  0 ]      A3 = [ 0  0  0   1   1 ]
     [ 0  0  0  0  1 ]           [ 0  0  0  −1  −1 ]
     [ 0  0  0  0  0 ]           [ 0  0  0   0   0 ]
     [ 0  0  0  0  0 ]           [ 0  0  0   0   0 ]
     [ 0  0  0  0  0 ] ,         [ 0  0  0   0   0 ] ,      A4 = 0.
v5 = (0, 0, 0, 0, 1).
Then take
v4 = Av5 , v3 = Av4 , v2 = Av3 ,
and complete a basis for Ker A with
v1 = (0, 0, 1, 0, 0).
In the basis {v1 , . . . , v5 } the matrix of A, its Jordan form, is

[ 0  0  0  0  0 ]
[ 0  0  1  0  0 ]
[ 0  0  0  1  0 ]
[ 0  0  0  0  1 ]
[ 0  0  0  0  0 ] .
In general, the Jordan form of a nilpotent matrix has 0 in each entry except possibly in the first
diagonal above the main diagonal which may have some 1s.
A nilpotent matrix has only the eigenvalue 0. Now consider a matrix A that has only one
eigenvalue, λ, i.e.,
det(sI − A) = (s − λ)n .
Take n = 3 to illustrate. The characteristic polynomial of A − λI is
det[rI − (A − λI)] = det[(r + λ)I − A] = (r + λ − λ)3 = r3 ,
i.e., A − λI has only the zero eigenvalue, and hence A − λI =: N , a nilpotent matrix. So the Jordan form of N must look like

[ 0  ?  0 ]
[ 0  0  ? ]
[ 0  0  0 ] ,

where each ? is 0 or 1.
To recap, if A has just one eigenvalue, λ, then its Jordan form is λI + N , where N is a nilpotent
matrix in Jordan form.
An extension of this analysis results in the Jordan form in general. Suppose A is n × n and
λ1 , . . . , λp are the distinct eigenvalues of A and m1 , . . . , mp are their multiplicities; that is, the
characteristic polynomial is
det(sI − A) = (s − λ1 )m1 (s − λ2 )m2 · · · (s − λp )mp .
Then A is similar to the block-diagonal matrix
AJF = diag(A1 , . . . , Ap ),
where Ai is mi × mi and it has only the eigenvalue λi . Thus Ai has the form λi I + Ni , where Ni is
a nilpotent matrix in Jordan form. Example:
A = [ 0   0   1   0 ]
    [ 0   0   0   1 ]
    [ 0   0  −1   1 ]
    [ 0   0   2  −2 ] .
As we saw, the spectrum is σ(A) = {0, 0, 0, −3}. Thus the Jordan form must be of the form
AJF = [ 0  ?  0   0 ]
      [ 0  0  ?   0 ]
      [ 0  0  0   0 ]
      [ 0  0  0  −3 ] .
Since A has rank 2, so does AJF . Thus only one of the stars is 1. Either is possible, for example,
AJF = [ 0  0  0   0 ]
      [ 0  0  1   0 ]
      [ 0  0  0   0 ]
      [ 0  0  0  −3 ] .
This has two “Jordan blocks”:
AJF = [ A1  0  ]
      [ 0   A2 ] ,   A1 = [ 0  0  0 ]
                          [ 0  0  1 ]
                          [ 0  0  0 ] ,   A2 = −3.
AV = V AJF
implies
A2 V = AV AJF = V A2JF .
By induction,
Ak V = V AkJF ,
and then
eAt V = V eAJF t ,
so finally
eAt = V eAJF t V −1 .
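This identity is easy to verify numerically. A sketch with an assumed 2 × 2 Jordan block J = 2I + N and an arbitrary invertible V of my choosing:

```python
import numpy as np
from scipy.linalg import expm

J = np.array([[2.0, 1.0], [0.0, 2.0]])   # Jordan block: 2I + N, N nilpotent
V = np.array([[1.0, 1.0], [1.0, 2.0]])   # any invertible matrix
A = V @ J @ np.linalg.inv(V)             # A is similar to J

t = 0.7
eJt = np.exp(2.0 * t) * np.array([[1.0, t], [0.0, 1.0]])  # e^{Jt} = e^{2t}(I + tN)
assert np.allclose(expm(J * t), eJt)
assert np.allclose(expm(A * t), V @ eJt @ np.linalg.inv(V))
```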
The matrix exponential eAJF t is easy to write down. For example, suppose there’s just one eigen-
value, so AJF = λI + N , N nilpotent, n × n. Then, since λI and N commute,
eAJF t = eλt eN t = eλt (I + tN + · · · + tn−1 N n−1 /(n − 1)!),
the series terminating because N n = 0.
Another way to get eAt is via Laplace transforms. Transforming
ẋ = Ax, x(0) = x0
gives
sX(s) − x0 = AX(s).
This yields
X(s) = (sI − A)−1 x0 .
Comparing with
x(t) = etA x0
shows that etA , (sI − A)−1 are Laplace transform pairs. So one can get etA by finding the matrix (sI − A)−1 and then taking the inverse Laplace transform of each element.
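For a small nilpotent example the pairing can be spot-checked: with the A below, etA = I + tA, whose entrywise Laplace transform is [[1/s, 1/s2 ], [0, 1/s]]. The sample values of s are arbitrary choices of mine:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
# e^{tA} = I + tA = [[1, t], [0, 1]]; its entrywise Laplace transform
# [[1/s, 1/s^2], [0, 1/s]] should equal the resolvent (sI - A)^{-1}
for s in [0.5, 1.0, 3.0]:
    R = np.linalg.inv(s * np.eye(2) - A)
    assert np.allclose(R, [[1 / s, 1 / s**2], [0.0, 1 / s]])
```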
2.5 Stability
The concept of stability is fundamental in control engineering. Here we look at the scenario where
the system has no input, but its state has been perturbed and we want to know if the system will
recover.
The maglev example is a good one to illustrate this point. Suppose a feedback controller has
been designed to balance the ball’s position at 1 cm below the magnet. Suppose if the ball is placed
at precisely 1 cm it will stay there; that is, the 1 cm location is a closed-loop equilibrium point.
Finally, suppose there is a temporary wind gust that moves the ball away from the 1 cm position.
The stability questions are, will the ball move back to the 1 cm location; if not, will it at least stay
near that location?
So consider
ẋ = Ax.
Obviously if x(0) = 0, then x(t) = 0 for all t. We say the origin is an equilibrium point—if you
start there, you stay there. Equilibrium points can be stable or not. While there are more elaborate
and formal definitions of stability for the above homogeneous system, we choose the following two:
The origin is asymptotically stable if x(t) −→ 0 as t −→ ∞ for all x(0). The origin is stable
if x(t) remains bounded as t −→ ∞ for all x(0). Since x(t) = eAt x(0), the origin is asymptotically
stable iff every element of the matrix eAt converges to zero, and is stable iff every element of the
matrix eAt remains bounded as t −→ ∞. Of course, asymptotic stability implies stability.
Asymptotic stability is relatively easy to characterize. Using the Jordan form, one can prove this very important result, where Re denotes “real part”:
Theorem 2.2 The origin is asymptotically stable iff the eigenvalues of A all satisfy Re λ < 0.
Let’s say the matrix A is stable if its eigenvalues satisfy Re λ < 0. Then the origin is asymptotically stable iff A is stable.
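Theorem 2.2 translates into a one-line numerical test; the helper name and sample matrices are my own:

```python
import numpy as np

def asymptotically_stable(A):
    # Theorem 2.2: all eigenvalues of A must satisfy Re(lambda) < 0
    return bool(np.all(np.linalg.eigvals(A).real < 0))

assert asymptotically_stable(np.array([[-1.0, 1.0], [0.0, -2.0]]))
# the running example has eigenvalues 0 and -3: not asymptotically stable
assert not asymptotically_stable(np.array([[-1.0, 1.0], [2.0, -2.0]]))
```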
Now we turn to the more subtle property of stability. We’ll do some examples, and we may as
well have A in Jordan form.
Consider the nilpotent matrix
A = N = [ 0  0 ]
        [ 0  0 ] .
Obviously, x(t) = x(0) for all t and so the origin is stable. By contrast, consider
A = N = [ 0  1 ]
        [ 0  0 ] .
Then
eN t = I + tN,
which is unbounded and so the origin is not stable. This example extends to the n × n case: If A
is nilpotent, the origin is stable iff A = 0.
Here’s the test for stability in general in terms of the Jordan form of A:
AJF = diag(A1 , . . . , Ap ).
Recall that each Ai has just one eigenvalue, λi , and that Ai = λi I + Ni , where Ni is a nilpotent
matrix in Jordan form.
Theorem 2.3 The origin is stable iff the eigenvalues of A all satisfy Re λ ≤ 0 and, for every eigenvalue with Re λi = 0, the nilpotent matrix Ni is zero, i.e., Ai is diagonal.
The origin is stable since there are two 1 × 1 Jordan blocks. Now consider
A = [ 0  −1  1   0 ]
    [ 1   0  0   1 ]
    [ 0   0  0  −1 ]
    [ 0   0  1   0 ] .
The eigenvalues are j, j, −j, −j and so the Jordan form must look like
AJF = [ j  ?   0   0 ]
      [ 0  j   0   0 ]
      [ 0  0  −j   ? ]
      [ 0  0   0  −j ] .
Since the rank of A − jI equals 3, the upper star is 1; since the rank of A + jI equals 3, the lower
star is 1. Thus
AJF = [ j  1   0   0 ]
      [ 0  j   0   0 ]
      [ 0  0  −j   1 ]
      [ 0  0   0  −j ] .
Since the Jordan blocks are not diagonal, the origin is not stable.
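The instability can also be seen numerically: for this A the norm of eAt grows without bound (roughly linearly in t), even though every eigenvalue has zero real part. A sketch assuming SciPy:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, -1.0, 1.0,  0.0],
              [1.0,  0.0, 0.0,  1.0],
              [0.0,  0.0, 0.0, -1.0],
              [0.0,  0.0, 1.0,  0.0]])
# eigenvalues are +-j (each twice) but the Jordan blocks are not diagonal,
# so the norm of e^{At} grows with t: the origin is not stable
norms = [np.linalg.norm(expm(A * t)) for t in (1.0, 10.0, 100.0)]
assert norms[0] < norms[1] < norms[2]
```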
[Figure: mass M attached to a spring K and damper D.]
The equation is
M ÿ + Dẏ + Ky = 0.
Example Two points move on the line R. The positions of the points are x1 , x2 . They move toward
each other according to the control laws
ẋ1 = x2 − x1 , ẋ2 = x1 − x2 .
The eigenvalues are λ1 = 0, λ2 = −2, so the origin is stable but not asymptotically stable. Obviously,
the two points tend toward each other; that is, the state x(t) tends toward the subspace
V = {x : x1 = x2 }.
This is the eigenspace for the zero eigenvalue. To see this convergence, write the initial condition
as a linear combination of eigenvectors:
x(0) = c1 v1 + c2 v2 , v1 = (1, 1), v2 = (−1, 1).
Then
x(t) = c1 v1 + c2 e−2t v2 −→ c1 v1 ∈ V as t −→ ∞.
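This rendezvous behaviour is easy to simulate; the initial condition is an arbitrary choice of mine:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 1.0],
              [1.0, -1.0]])          # x1' = x2 - x1, x2' = x1 - x2
x0 = np.array([3.0, -1.0])
x_late = expm(A * 50.0) @ x0
# the state converges to the eigenspace {x : x1 = x2}; the common value
# is the average of the initial positions (the c1 v1 term survives)
assert np.allclose(x_late, [1.0, 1.0], atol=1e-6)
```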
2.6 Subspaces
Let X = Rn and let V, W be subspaces of X . Then V + W denotes the set
{v + w : v ∈ V, w ∈ W},
and it is a subspace of X . The set union V ∪ W is not a subspace in general unless one is contained
in the other. The intersection V ∩ W is however a subspace. As an example:
X = R3 , V a line, W a plane.
There is a useful dimension equation:
dim(V + W) = dim V + dim W − dim(V ∩ W).
For example, think of V, W as two planes in R3 that intersect in a line. Then the dimension equation evaluates to
3 = 2 + 2 − 1.
Two subspaces V, W are independent if V ∩ W = 0. This is not the same as being orthogonal. For example, two lines in R2 are independent iff they are not collinear (i.e., the angle between them is not 0), while they are orthogonal iff the angle is 90◦ .
Every vector x in V + W can be written as
x = v + w, v ∈ V, w ∈ W.
If V, W are independent, then v, w are unique. Think of v as the component of x in V and w as its
component in W. Let’s prove uniqueness. Suppose
x = v + w = v 1 + w1 .
Then
v − v1 = w1 − w.
The left-hand side is in V and the right-hand side in W. Since the intersection of these two subspaces
is zero, both sides equal 0.
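The unique decomposition can be computed by solving a linear system whose columns are bases of V and W. A sketch with subspaces of my own choosing:

```python
import numpy as np

# V = span{(1,0,0), (0,1,0)} (a plane), W = span{(1,1,1)} (a line not in V):
# independent subspaces, so every x = v + w with v, w unique
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])      # columns: basis of V, then basis of W
x = np.array([2.0, 5.0, 4.0])
c = np.linalg.solve(B, x)            # unique since the columns are independent
v = B[:, :2] @ c[:2]                 # component of x in V
w = c[2] * B[:, 2]                   # component of x in W
assert np.allclose(v + w, x)
assert np.allclose(w, [4.0, 4.0, 4.0])
```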
Clearly, V, W are independent iff the only way to write 0 = v + w with v ∈ V, w ∈ W is v = w = 0. The concept extends to more than two subspaces: U, V, W are independent if every x in U + V + W has a unique representation
x = u + v + w, u ∈ U, v ∈ V, w ∈ W.
If V, W are independent subspaces, we write their sum as V ⊕ W. This is called a direct sum.
Likewise for more than two.
Let’s finish this section with a handy fact: Every subspace has an independent complement, i.e.,
V⊂X =⇒ (∃W ⊂ X ) X = V ⊕ W.
Think of X as R3 and V as a plane. Then W can be any line not in the plane.
(Footnote: In this chapter when we speak of lines we mean lines through 0. Similarly for planes.)
A linear transformation (LT) A : X → Y is a function satisfying
A(x + y) = Ax + Ay, x, y ∈ X ,
A(ax) = aAx, a ∈ R, x ∈ X .
It is an important fact that an LT is uniquely determined by its action on a basis. That is, if
A : X → Y is an LT and if {e1 , . . . , en } is a basis for X , then if we know the vectors Aei , we can
compute Ax for every x ∈ X , by linearity.
Example For us, the most important example is an LT generated by a matrix. Let A ∈ Rm×n .
For each vector x in Rn , Ax is a vector in Rm . The mapping x 7→ Ax is an LT A : Rn → Rm .
Linearity is easy to check.
Example Take a vector in the plane and rotate it counterclockwise by 90◦ . This defines an LT
A : R2 → R2 . Note that A is not given as a matrix; it’s given by its domain, its co-domain, and its
action on vectors. If we take a vector to be represented by its Cartesian coordinates, x = (x1 , x2 ),
then we’ve chosen a basis for R2 . In that case A maps x = (x1 , x2 ) to Ax = (−x2 , x1 ), and so
there’s an associated rotation matrix
[ 0  −1 ]
[ 1   0 ] .
We’ll return to matrix representation later.
Example Let X = Rn and let {e1 , . . . , en } be a basis. Every vector x in X has a unique expansion
x = a1 e1 + · · · + an en , ai ∈ R.
Let a denote the vector (a1 , . . . , an ), the n-tuple of coordinates of x with respect to the basis.
The function x 7−→ a defines an LT Q : X → Rn . The equation
x = a1 e1 + · · · + an en
can be written compactly as x = Ea, where E is the matrix with columns e1 , . . . , en and a is the
vector with components a1 , . . . , an . Therefore a = E −1 x and so Qx = E −1 x, that is, the action of
Q is to multiply by the matrix E −1 .
For example, let X = R2 . Take the natural basis
e1 = (1, 0), e2 = (0, 1).
In this case E = I and Qx = x. If the basis instead is
e1 = (1, 1), e2 = (−1, 1),
then
E = [ 1  −1 ]
    [ 1   1 ]
and Qx = E −1 x.
Every LT on finite-dimensional vector spaces has a matrix representation. Let’s do this very
important construction carefully. Let A be an LT X → Y, let {e1 , . . . , en } be a basis for X and {f1 , . . . , fp } a basis for Y, and let
Q : X → Rn , R : Y → Rp
be the corresponding coordinate LTs:

  X --A--> Y
  |        |
  Q        R
  v        v
  Rn       Rp

The left downward arrow gives us the n-tuple, say a, that represents a vector x in the basis {e1 , . . . , en }. The right downward arrow gives us the p-tuple, say b, that represents a vector y in the basis {f1 , . . . , fp }. It’s possible to add a fourth LT to complete the square:
  X --A--> Y
  |        |
  Q        R
  v        v
  Rn -M--> Rp
This is called a commutative diagram. The object M in the diagram is the matrix representation
of A with respect to these two bases. Notice that the bottom arrow represents the LT generated
by the matrix M ; we write M in the diagram for simplicity, but you should understand that really
the object is an LT. The matrix M is the p × n matrix that makes the diagram commute, that is, for every x ∈ X
M Qx = RAx.
Writing a = Qx and b = RAx, this says M a = b.
In particular, take x = ei , the ith basis vector in X . Then a is the n-vector with 1 in the ith entry
and 0 otherwise. So M a equals the ith column of the matrix M . Thus, we have the following recipe
for constructing the matrix M :
1. Take the 1st basis vector e1 of X .
2. Apply A to it to get Ae1 .
3. Express Ae1 in the basis {f1 , . . . , fp }; the vector of coefficients is the 1st column of M .
4. Repeat for e2 , . . . , en to get the remaining columns.
Recall that Q is the LT generated by E −1 , where the columns of E are the basis in the domain of
A. Likewise, R is the LT generated by F −1 , where the columns of F are the basis in the co-domain
of A. Thus the equation M a = b reads
M E −1 x = F −1 Ax. (2.3)
Example Let A : R2 → R2 be the LT that rotates a vector counterclockwise by 90◦ . Let’s first
take the standard bases: e1 = (1, 0), e2 = (0, 1) for the domain and f1 = (1, 0), f2 = (0, 1) for the
co-domain. Following the steps we first apply A to e1 , that is, we rotate e1 counterclockwise by
90◦ ; the result is Ae1 = (0, 1). Then we express this vector in the basis {f1 , f2 }:
Ae1 = 0 × f1 + 1 × f2 .
Thus the first column of M is (0, 1), the vector of coefficients. Now for the second column, rotate
e2 to get (−1, 0) and represent this in the basis {f1 , f2 }:
Ae2 = −1 × f1 + 0 × f2 .
Thus the second column of M is (−1, 0), and M is the rotation matrix found before. Now take different bases: e1 = (1, 1), e2 = (−1, 2) for the domain and f1 = (1, 2), f2 = (1, 0) for the co-domain. Apply the recipe again. Get Ae1 = (−1, 1). Expand it in the basis {f1 , f2 }:
(−1, 1) = (1/2) f1 − (3/2) f2 .
Get Ae2 = (−2, −1). Expand it in the basis {f1 , f2 }:
(−2, −1) = −(1/2) f1 − (3/2) f2 .
Thus
M = [  1/2  −1/2 ]
    [ −3/2  −3/2 ] .
Example Let A ∈ Rm×n and let A : Rn −→ Rm be the generated LT. It is easy to check that
A itself is then the matrix representation of A with respect to the standard bases. Let’s do it.
Let {e1 , . . . , en } be the standard basis on Rn and {f1 , . . . , fm } the standard basis on Rm . Then
Ae1 = Ae1 equals the first column, (a11 , a21 , . . . , am1 ), of A. This column can be written as
a11 f1 + · · · + am1 fm ,
and hence (a11 , a21 , . . . , am1 ) is the first column of the matrix representation of A.
Suppose instead that we have general bases, {e1 , . . . , en } on Rn and {f1 , . . . , fm } on Rm . Form
the matrices E and F from these basis vectors. From (2.3) we get that the matrix representation
M with respect to these bases satisfies
M E −1 = F −1 A,
or equivalently
AE = F M.
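The formula M = F −1 AE can be exercised on the rotation LT. The standard bases give back A itself; the nonstandard bases below are an assumed choice of mine, picked to be consistent with the numbers in the worked example:

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])          # rotation by 90 degrees

# standard bases: E = F = I, so M = A itself
assert np.allclose(np.linalg.inv(np.eye(2)) @ A @ np.eye(2), A)

# nonstandard bases (assumed, consistent with the worked example's numbers)
E = np.array([[1.0, -1.0], [1.0, 2.0]])   # domain basis vectors as columns
F = np.array([[1.0, 1.0], [2.0, 0.0]])    # co-domain basis vectors as columns
M = np.linalg.inv(F) @ A @ E
assert np.allclose(M, [[0.5, -0.5], [-1.5, -1.5]])
```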
A very interesting special case of this is where A is square and the same basis {e1 , . . . , en } is
taken for both the domain and co-domain. Then
AE = EM, i.e., M = E −1 AE,
a similarity transformation.
An LT has two important associated subspaces. Let A : X → Y be an LT. The kernel (or
nullspace) of A is the subspace of X on which A is zero:
Ker A := {x : Ax = 0}.
The image (or range) of A is the subspace of Y of vectors of the form Ax:
Im A := {y : (∃x ∈ X ) y = Ax}.
Example Let A : R3 −→ R3 map a vector to its projection on the horizontal plane. Then the kernel
equals the vertical axis, the image equals the horizontal plane, A is neither onto nor one-to-one,
and its matrix with respect to the standard basis is
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 0 ] .
We could modify the co-domain to have A : R3 −→ R2 , again mapping a vector to its projection
on the horizontal plane. Then the kernel equals the vertical axis, the image equals the horizontal
plane, A is onto but not one-to-one, and its matrix with respect to the standard basis is
[ 1 0 0 ]
[ 0 1 0 ] .
Take a basis {e1 , . . . , ek } for a subspace V and extend it to a basis {e1 , . . . , ek , . . . , en } for X .
Form the matrix V whose columns are e1 , . . . , ek . Clearly, rank V = k.
Example Let X be 3-dimensional space, V a plane (2-dimensional subspace), and W a line not in
V. Then V, W are independent subspaces and
X = V ⊕ W.
Let P : X → X denote the projection onto V along W. Then Im P = V, Ker P = W.
Suppose {e1 , e2 } is a basis for V, {e3 } a basis for W. The induced matrix representation is

P = [ 1 0 0 ]
    [ 0 1 0 ]
    [ 0 0 0 ] .
{e1 , . . . , ek , . . . , en } for X .
Then
{Ae1 , . . . , Aek }
Ax = b, A ∈ Rn×m , x ∈ Rm , b ∈ Rn .
The equation is another way of saying b is a linear combination of the columns of A. Thus the
equation has a solution iff b ∈ column span of A, i.e., b ∈ ImA. Then the solution is unique iff rank
A = m, i.e., Ker A = 0.
These results extend to the matrix equation AX = B. In this section we study this and similar
equations. We could work with LTs but we'll use matrices instead.

The first equation is AX = I. Such an X is called a right-inverse of A.
Lemma 2.2 A ∈ Rn×m has a right-inverse iff it's onto, i.e., the rank of A equals n.

Proof If AX = I, then for every y,

AXy = y,

so A is onto. Conversely, suppose A is onto. Then for each standard basis vector fi of Rn there
exists xi such that Axi = fi . Now define X to be the matrix whose ith column is xi , i.e., via
Xfi = xi . Then AXfi = fi for every i. This implies AX = I.
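The construction in the proof can be carried out concretely. A small Python sketch with an assumed full-row-rank example matrix A (not from the text):

```python
# Right-inverse of an onto matrix A in R^{2x3} (rank 2 = number of rows).
# Per the lemma's construction, column i of X is any solution of A x = f_i.

A = [[1, 0, 1],
     [0, 1, 1]]

# A x = f1 = (1, 0) is solved by x1 = (1, 0, 0);
# A x = f2 = (0, 1) is solved by x2 = (0, 1, 0).
# (Solutions are not unique: A has a nontrivial kernel.)
X = [[1, 0],
     [0, 1],
     [0, 0]]

def matmul(P, Q):
    rows, inner, cols = len(P), len(Q), len(Q[0])
    return [[sum(P[i][k] * Q[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

AX = matmul(A, X)   # equals the 2x2 identity, so X is a right-inverse
```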
Lemma 2.3 A ∈ Rn×m has a left-inverse iff it’s one-to-one, i.e., A has rank m.
2. There exists X such that XA = B iff Ker A ⊂ Ker B, that is,

rank A = rank [ A ]
              [ B ] ,

where the matrix on the right is A stacked on top of B.
and let A : R2 → R2 be the generated LT. Clearly, Ker A is the 1-dimensional subspace spanned
by (1, −1). Also,
x ∈ Ker A ⇒ Ax = 0 ∈ Ker A,
or equivalently,
AKer A ⊂ Ker A.
Take a basis {e1 , . . . , ek } for V and extend it to a basis {e1 , . . . , ek , . . . , en } for X . Notice that,
in the induced matrix representation, the lower-left block of A equals zero; this is because V is
A-invariant.
Example Let X = R3 , let V be the (x1 , x2 )-plane, and let A : X → X be the LT that rotates a
vector 90◦ about the x3 -axis using the right-hand rule. Thus V is A-invariant. Let us take the bases
e1 = (1, 0, 0), e2 = (0, 1, 0) for V,

e1 , e2 , e3 = (1, 1, 1) for X .

The matrix representation of A with respect to the latter basis is

A = [ 0 −1 −2 ]
    [ 1  0  0 ]
    [ 0  0  1 ] .
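The claimed representation can be verified without inverting E by checking the defining relation AE = EM; the standard-coordinate matrix of this rotation is [[0, −1, 0], [1, 0, 0], [0, 0, 1]]. A Python sketch:

```python
# Check the representation M of the 90-degree rotation about the x3-axis
# with respect to the basis e1, e2, e3 of the example, via A E = E M.

A = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]          # rotation in standard coordinates

E = [[1, 0, 1],
     [0, 1, 1],
     [0, 0, 1]]           # columns are e1, e2, e3

M = [[0, -1, -2],
     [1,  0,  0],
     [0,  0,  1]]         # claimed matrix representation

def matmul(P, Q):
    n, m = len(P), len(Q[0])
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(m)]
            for i in range(n)]

lhs = matmul(A, E)        # A E
rhs = matmul(E, M)        # E M; equality confirms M represents A
```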
Lemma 2.5 The subspace Im V is A-invariant iff the linear equation AV = V A1 has a solution
A1 .
2.10 Problems
1. Are the following vectors linearly independent?
2. Continuing with the same vectors, find a basis for Span {v1 , v2 , v3 }.
4. Show that e^{A+B} = e^A e^B does not imply that A and B commute, but e^{(A+B)t} = e^{At} e^{Bt} for all t does.
Assume
6. Take

A = [ 0 0  1  0 ]
    [ 0 0  0  1 ]
    [ 0 0 −1  1 ]
    [ 0 0  2 −2 ] .
(The general construction of the basis for the Jordan form is along these lines.)
7. Let

A = [  0 1 0 0 ]
    [  0 0 1 0 ]
    [  0 0 0 1 ]
    [ −2 1 0 2 ] .
8. Consider

A = [  σ ω ]
    [ −ω σ ] ,

where σ and ω ≠ 0 are real. Find the Jordan form and the transition matrix.
its transition matrix is easy to write down. This problem demonstrates that a matrix with
distinct complex eigenvalues can be transformed into the above form using a nonsingular
transformation. Let
A = [ −1 −4 ]
    [  1 −1 ] .
Determine the eigenvalues and eigenvectors of A, noting that they form complex conjugate
pairs. Let the first eigenvalue be written as a + jb with the corresponding eigenvector v1 + jv2 .
Take v1 and v2 as the columns of a matrix V . Find V −1 AV .
11. Show that the origin is asymptotically stable for ẋ = Ax iff all poles of every element of
(sI − A)−1 are in the open left half-plane. Show that the origin is stable iff all poles of every
element of (sI − A)−1 are in the closed left half-plane and those on the imaginary axis have
multiplicity 1.
13. (a) Suppose that σ(A) = {−1, −3, −3, −1 + j2, −1 − j2} and the rank of (A − λI)λ=−3 is 4.
Determine AJF .
(b) Suppose that σ(A) = {−1, −2, −2, −2} and the rank of (A − λI)λ=−2 is 3. Determine
AJF .
(c) Suppose that σ(A) = {−1, −2, −2, −2, −3} and the rank of (A − λI)λ=−2 is 3. Determine
AJF .
15. Summarize all the ways to find exp(At). Then find exp(At) for
A = [ 1 1 0 ]
    [ 0 1 1 ]
    [ 0 0 2 ] .
ẋ1 = −x2
ẋ2 = x1 − 3x2
Do a phase portrait using Scilab or MATLAB. Interpret the phase portrait in terms of the
modal decomposition of the system. Do lots more examples of this type.
18. Prove the following facts about subspaces:
(a) V + V = V
Hint: You have to show V +V ⊂ V and V ⊂ V +V. Similarly for other subspace equalities.
(b) If V ⊂ W, then V + W = W.
(c) If V ⊂ W, then W ∩ (V + T ) = V + W ∩ T .
19. Show that W ∩(V +T ) = W ∩V +W ∩T is false in general by giving an explicit counterexample.
20. Let A be the identity LT on R2 . Take

{(1, 1), (1, −1)} = basis for domain, {(2, 0), (−1, 3)} = basis for co-domain.

Find the matrix A.
21. Let A denote the LT R4 → R5 with the action

(x1 , x2 , x3 , x4 ) ↦ (x4 , 0, 2x4 , x2 + x3 + 2x4 , x2 + x3 ).

Find bases for R4 and R5 so that the matrix representation is

A = [ I 0 ]
    [ 0 0 ] .
22. Let A be an LT. Show that if {Ae1 , . . . , Aen } is linearly independent, so is {e1 , . . . , en }. Give
an example where the converse is false.
For simplicity, take RC = 1. From circuit theory, we know that y(t) belongs to X too. (This
is steady-state analysis; transient response is neglected.) So the mapping from x(t) to y(t)
defines a linear transformation A : X −→ X . Find the matrix representation of A with
respect to the given basis.
25. Consider the vector space R3 . Let x1 , x2 , and x3 denote the components of a vector x in R3 .
Now let V denote the subspace of R3 of all vectors x where
x1 + x2 − x3 = 0,
2x1 − 3x3 = 0.
28. For a square matrix X, let diagX denote the vector formed from the elements on the diagonal
of X.
Let A : Rn×n −→ Rn be the LT defined by
A : X 7→ diagX.
For each matrix, find its rank, a basis for its image, and a basis for its kernel.
32. You are given the n eigenvalues of a matrix in Rn×n . Can you determine the rank of the
matrix? If not, can you give bounds on the rank?
33. Suppose that A ∈ Rm×n and B ∈ Rn×m with m ≤ n and rank A = rank B = m. Find a
necessary and sufficient condition that AB be invertible.
34. Let A be an LT from X to X , a finite-dimensional vector space. Fix a basis for X and let A
denote the matrix representation of A with respect to this basis. Show that A2 is the matrix
representation of A2 .
A^T Ax = A^T y,

x = (A^T A)^{-1} A^T y.
36. Let L denote the line in the plane that passes through the origin and makes an angle +π/6
radians with the positive x-axis. Let A : R2 → R2 be the LT that maps a vector to its
reflection about L.
37. Fix a vector v ≠ 0 in R3 and consider the LT A : R3 → R3 that maps x to the cross product
v × x.
38. Preamble. This problem requires some notation. Suppose f (x, y) is a function of two real
variables, say, f (x, y) = x2 + xy. Then f (·, y) denotes the function x 7→ f (x, y) where y is
temporarily held constant; that is, x2 +xy considered as a function of x alone, with y constant.
So for each y, f (·, y) is a function, indicated by the dot acting as a placemarker. But then
y 7→ f (·, y) is another function, namely, it maps y to the function of x given by x2 + xy.
Therefore, f (·, y) is a function that maps a real number to a real number, whereas y 7→ f (·, y)
is a function that maps a real number to a function.
Another example where this situation comes up is a cart of mass M , input force u, and
output position y. Then y is a function of M and u; let’s write y = G(M, u). Then G(M, ·)
is the input-output map for a given M , and M 7→ G(M, ·) is the map from the mass to the
input-output system.
The problem Fix the n × n matrix A and consider the equation
ẋ = Ax, x(0) = x0 .
As you know, the state at time t starting from x0 at time 0 is x(t, x0 ) = eAt x0 .
(a) What are the domain and co-domain of the map (t, x0 ) ↦ x(t, x0 )?

(b) What are the domain and co-domain of x(·, x0 )? Of x(t, ·)?
(c) Is the map x0 7→ x(·, x0 ) a linear transformation? Prove true or give a counterexample.
(d) Is the map t 7→ x(t, ·) a linear transformation? Prove true or give a counterexample.
Chapter 3
Calculus
In this chapter we review some calculus, including the method of Lagrange multipliers.
3.1 Jacobians
Suppose f : R −→ R is a function of class C 2 , twice continuously differentiable. The Taylor series
expansion of f at x is
f (x + ε) = f (x) + (df /dx)(x) ε + (1/2!)(d^2 f /dx^2 )(x) ε^2 + (1/3!)(d^3 f /dx^3 )(x) ε^3 + · · · .
This extends to a function f : Rn −→ R. Thus, in the expression f (x), x is a vector with n
components and f (x) is a scalar. The Jacobian of f , denoted fx , is the 1 × n matrix (row vector)
whose j th element is ∂f /∂xj . We shall write the transpose of fx as ∇f , the gradient of f . Thus
∇f is a column vector.
Another way to think of the Jacobian is via the directional derivative. Let x and h be vectors
and ε a scalar. Consider f (x + εh) as a function of ε and think of its Taylor series at 0:
f (x + εh) = f (x) + ε (d/dε) f (x + εh)|_{ε=0} + (ε^2 /2) (d^2 /dε^2 ) f (x + εh)|_{ε=0} + · · · .

Thus

(d/dε) f (x + εh)|_{ε=0} = fx (x)h.
CHAPTER 3. CALCULUS 37
Now
(d^2 /dε^2 ) f (x + εh) = (d/dε) [fx (x + εh)h].

In more manageable terms, the right-hand side is (with the argument dropped and by use of the
chain rule again)

(∂/∂x) [ (∂f /∂x1 )h1 + · · · + (∂f /∂xn )hn ] h.

This in turn equals

h^T fxx (x + εh) h,

where fxx is the matrix whose ij th element is

∂^2 f / ∂xi ∂xj .

Therefore

f (x + εh) = f (x) + ε fx (x)h + (ε^2 /2) h^T fxx (x)h + · · · .
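This expansion is easy to sanity-check numerically. A Python sketch with an assumed sample function f (x1 , x2 ) = x1^3 + x1 x2^2 (not from the text), whose Jacobian and Hessian are entered by hand:

```python
# Numerical check of the second-order expansion
#   f(x + eps*h) ~ f(x) + eps*fx(x)h + (eps^2/2) h^T fxx(x) h
# for an assumed sample function f(x1, x2) = x1^3 + x1*x2^2.

def f(x1, x2):
    return x1**3 + x1 * x2**2

def fx(x1, x2):
    # Jacobian of f (a row vector)
    return [3 * x1**2 + x2**2, 2 * x1 * x2]

def fxx(x1, x2):
    # Hessian of f
    return [[6 * x1, 2 * x2],
            [2 * x2, 2 * x1]]

x, h, eps = (1.0, 2.0), (1.0, 1.0), 1e-2

exact = f(x[0] + eps * h[0], x[1] + eps * h[1])
J = fx(*x)
H = fxx(*x)
quad = sum(h[i] * H[i][j] * h[j] for i in range(2) for j in range(2))
taylor2 = f(*x) + eps * (J[0] * h[0] + J[1] * h[1]) + 0.5 * eps**2 * quad
# Since this f is cubic, the discrepancy is exactly the third-order
# term, here 2*eps^3.
```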
This generalizes to a function f : Rn −→ Rm . Thus, in the expression f (x), x is a vector with
n components and f (x) is a vector with m components. So we can write
x = (x1 , . . . , xn ), f = (f1 , . . . , fm ).
The Jacobian of f , still denoted fx , is the m × n matrix whose ij th element is ∂fi /∂xj . The
derivative of f at the point x in the direction of the vector h is defined to be
(d/dε) f (x + εh)|_{ε=0} .
This turns out to be a linear function of the vector h, and it must therefore equal M h for some
matrix M . In fact, M equals the Jacobian of f at x.
Example

m = 1, n = 2, f (x) = c1 x1 + c2 x2 , fx (x) = [ c1 c2 ].
More generally, if f (x) = c^T x, then fx (x) = c^T . This can be derived like this:

f (x + εh) = c^T (x + εh) = c^T x + ε c^T h,

(d/dε) f (x + εh) = c^T h,

(d/dε) f (x + εh)|_{ε=0} = c^T h,

fx (x) = c^T .
Example If f (x) = x^T x, then fx (x) = 2x^T . More generally, consider f (x) = x^T Qx, where Q is
a symmetric matrix. You can derive that fx (x) = 2x^T Q. If Q is not symmetric, then

fx (x) = x^T (Q + Q^T ).
3.2 Optimization over an Open Set

Given a function f : Rn −→ R and an open set V in Rn , the problem is to maximize f (x) subject
to x ∈ V. Of course, minimizing f is the same as maximizing −f , so we're solving that problem
too.
We say xo is a global maximizer if f (x) ≤ f (xo ) for all x ∈ V. Minimizer has the obvious
definition, and optimizer refers to a point that's either a maximizer or minimizer. Finally, we say
f is of class C^r , or f is C^r , if all partial derivatives of f of order up to r exist and are continuous.
First, the necessary condition: if xo is a local or global maximizer, then fx (xo ) = 0. To see this,
expand

f (xo + εh) = f (xo ) + ε fx (xo )h + o(ε),

where o(ε)/ε converges to 0 as ε converges to 0. That's what little o means. Since xo is a local or
global maximizer and V is open, for every h

fx (xo )h ≤ 0.

Replacing h by −h gives fx (xo )h ≥ 0 as well, and hence fx (xo ) = 0.
Example
Fermat’s principle (1662) is that the path between two points taken by a beam of light is the one
that is traversed in the least time. Snell’s law of refraction follows directly from this statement.
To prove this, consider a light ray from a fixed point p1 to a fixed point p2 . The two points are in
different media, where the speeds of light in the media are respectively c/n1 , c/n2 .
[Figure: a light ray travels from p1 in the medium with index n1 , crosses the interface at the point p, making angles θ1 and θ2 with the normal, and continues to p2 in the medium with index n2 .]
Let p denote the point where the ray from p1 to p2 passes through the interface of the media. We
want to find p according to Fermat’s principle. Orient an (x, y) coordinate system as shown by the
axes. The time for a ray to pass from p1 to p2 is
J = (n1 /c) ‖p1 − p‖ + (n2 /c) ‖p − p2 ‖.
Minimizing J over the position of p along the interface and setting the derivative to zero gives

n1 sin θ1 − n2 sin θ2 = 0.
Thus

n1 / n2 = sin θ2 / sin θ1 ,

which is Snell's law.
For the sufficient condition, we need the concept of positive definite matrix. Let Q be a real,
square matrix. It is positive semi-definite (written Q ≥ 0) if x^T Qx ≥ 0 for all x. It is positive
definite (written Q > 0) if x^T Qx > 0 for all x ≠ 0. Negative definite and semi-definite are defined
in the obvious way. If Q is symmetric, then it is positive semi-definite iff all its eigenvalues are ≥ 0,
and positive definite iff they're all positive.
Let H(x) denote the Hessian of f .
The first term inside the brackets is negative, while the second term goes to zero as ε → 0. Thus
for ε sufficiently small
Example
f (x) = (1/2) x^T Ax + b^T x + c, A symmetric.
We have fx (x) = x^T A + b^T . The stationarity condition is that

x^T A + b^T = 0

has a solution; that is, b belongs to the span of the columns of A. Then, if xo is a solution of this
equation and if A is negative definite, then xo is a local maximizer.
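As a concrete instance, here is a Python sketch with an assumed negative definite A and a vector b in its column span; the stationary point solves Ax + b = 0 (A symmetric) and beats nearby points:

```python
# Stationarity for f(x) = 0.5 x^T A x + b^T x + c:  A x + b = 0 (A symmetric).
# With A negative definite, the stationary point is a maximizer.
# A and b below are assumed example data.

A = [[-2.0,  0.0],
     [ 0.0, -4.0]]       # symmetric, negative definite
b = [2.0, 4.0]

# Solve A x + b = 0 (A is diagonal, so solve componentwise).
xo = [-b[0] / A[0][0], -b[1] / A[1][1]]   # the stationary point (1, 1)

def f(x):
    quad = sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))
    return 0.5 * quad + b[0] * x[0] + b[1] * x[1]

best = f(xo)
nearby = [f([xo[0] + dx, xo[1] + dy])
          for dx in (-0.1, 0.1) for dy in (-0.1, 0.1)]
# Every nearby value is strictly smaller than best.
```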
3.3 Optimizing a Quadratic Function with Equality Constraints

Example In the plane, find the point on a given line that is closest to a given point:
[Figure: a given point, a given line, and the optimal point, the foot of the perpendicular from the point to the line.]
This is a distance problem. Obviously, you can get the closest point by drawing the perpendicular
from the given point to the given line.
Before we solve this problem, let's clarify some notation. The norm of x = (x1 , x2 ) is

‖x‖ = (x1^2 + x2^2 )^{1/2} ,

and this can also be written ‖x‖ = (x^T x)^{1/2} , that is,

x^T x = [ x1 x2 ] [ x1 ; x2 ] = x1^2 + x2^2 .
To develop a solution method, suppose the given point is v = (1, 2) and the equation of the
given line is
x2 = 0.5x1 + 0.2.
c^T = [ −0.5 1 ], b = 0.2.
Then x is on the line iff c^T x = b. Also, the distance from v to x is ‖v − x‖. Note that ‖v − x‖
is minimum iff ‖v − x‖^2 is minimum. Thus we have arrived at the following equivalent problem:
minimize the quadratic function ‖v − x‖^2 of x subject to the equality constraint c^T x = b. Notice
that

‖v − x‖^2 = (v − x)^T (v − x) = v^T v − v^T x − x^T v + x^T x.

The right-hand side is a quadratic function of x. Since x^T v = v^T x (dot product of real vectors is
symmetric), we have

‖v − x‖^2 = v^T v − 2v^T x + x^T x.
The problem is then

min_{x : c^T x = b} v^T v − 2v^T x + x^T x.

One direct way to solve it is to substitute x2 = 0.5x1 + 0.2 into

(1 − x1 )^2 + (2 − x2 )^2

to get a function f (x1 ). Set the derivative of f to zero, solve for x1 , then get x2 . The answers are
x1 = 1.52, x2 = 0.96.
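The substitution calculation can be reproduced in a few lines of Python:

```python
# Closest point on the line x2 = 0.5 x1 + 0.2 to the point v = (1, 2),
# by substituting the constraint and setting the derivative to zero.

def g(x1):
    """Squared distance from (1, 2) to the point on the line at x1."""
    x2 = 0.5 * x1 + 0.2
    return (1 - x1) ** 2 + (2 - x2) ** 2

# d/dx1 [(1 - x1)^2 + (1.8 - 0.5 x1)^2] = 0
#   -2(1 - x1) - (1.8 - 0.5 x1) = 0  =>  2.5 x1 = 3.8
x1 = 3.8 / 2.5            # 1.52
x2 = 0.5 * x1 + 0.2       # 0.96
```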
Lagrange Multipliers
Now we return to the first example in this section. It had the form

min_{x : c^T x = b} v^T v − 2v^T x + x^T x.
We are going to use the method of Lagrange multipliers. The idea is to absorb the constraint
cT x = b, or equivalently cT x − b = 0, into the function being minimized, leaving an unconstrained
problem. Define the Lagrangian

L(x, λ) = v^T v − 2v^T x + x^T x + λ(c^T x − b).

Here λ is an unknown that multiplies the constraint equation. It turns out a necessary condition
for optimality of x is that L should be stationary with respect to both x and λ, that is,

Lx = 0, Lλ = 0.

Here these conditions read

fx + λc^T = 0, c^T x − b = 0,

or equivalently (transposing the first and using fx = −2v^T + 2x^T ),

2x + λc = 2v, c^T x = b.
The x is the optimal x, the closest point, and the λ can be discarded—it was introduced only to
solve the problem.
Let’s look at a somewhat more general problem by the Lagrange multiplier method.
minimize_x ‖c − Ax‖
subject to the constraint Bx = d. Here x, c, d are vectors and A, B matrices. Assume A has full
column rank and B has full row rank.
Define
J(x) = ‖c − Ax‖^2 = (c − Ax)^T (c − Ax)
     = c^T c − c^T Ax − x^T A^T c + x^T A^T Ax
     = c^T c − 2c^T Ax + x^T A^T Ax,
and the Lagrangian

L(x, λ) = J(x) + λ^T (Bx − d).

Here the Lagrange multiplier λ has to be a vector. Differentiating with respect to x then λ, we get
−2c^T A + 2x^T A^T A + λ^T B = 0, Bx − d = 0.

Transposing the first gives

−2A^T c + 2A^T Ax + B^T λ = 0, Bx − d = 0.
Collect as one equation:

[ 2A^T A  B^T ] [ x ]   [ 2A^T c ]
[   B      0  ] [ λ ] = [   d    ] .

If it can be proved that the matrix on the left is invertible, then the optimal x is

x = [ I 0 ] [ 2A^T A  B^T ]^{-1} [ 2A^T c ]
            [   B      0  ]      [   d    ] .
So let's see that the matrix

[ 2A^T A  B^T ]
[   B      0  ]

is invertible. It suffices to prove that the only solution to the homogeneous equation

[ 2A^T A  B^T ] [ x ]
[   B      0  ] [ λ ] = 0

is the trivial solution. So start with this equation. Thus

2A^T Ax + B^T λ = 0, Bx = 0.

Since A has full column rank, the matrix A^T A is positive definite, hence invertible. Thus

x + (2A^T A)^{-1} B^T λ = 0, Bx = 0.

Multiply the first equation by B and use the second:

B(2A^T A)^{-1} B^T λ = 0.

Pre-multiply by λ^T :

λ^T B(2A^T A)^{-1} B^T λ = 0.

Since (2A^T A)^{-1} is positive definite, it follows that B^T λ = 0. Then, since B^T has full column rank,
λ = 0. Finally, from the equation

x + (2A^T A)^{-1} B^T λ = 0,

we get that x = 0. Thus x = 0, λ = 0 is the only solution of the homogeneous equation, and the
matrix is invertible.
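Numerically, one just assembles the block matrix and solves. A Python sketch with small assumed data (A = I in R^{2×2}, B = [1 1], c = (1, 2), d = 1, i.e., projecting c onto the line x1 + x2 = 1), using a tiny Gaussian-elimination solver:

```python
# Solve  min ||c - A x||^2  s.t.  B x = d  via the block (KKT) system
#   [2 A^T A  B^T] [x  ]   [2 A^T c]
#   [  B       0 ] [lam] = [   d   ].
# Assumed example: A = I (2x2), B = [1, 1], c = (1, 2), d = 1.

def solve(M, rhs):
    """Gaussian elimination with partial pivoting (fine for small n)."""
    n = len(M)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k]
                                for k in range(r + 1, n))) / aug[r][r]
    return x

# KKT matrix for A = I, B = [1, 1]: the (1,1) block 2 A^T A is 2I.
K = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0]]
rhs = [2.0, 4.0, 1.0]     # (2 A^T c, d) with c = (1, 2), d = 1

sol = solve(K, rhs)        # (x1, x2, lambda); the projection is x = (0, 1)
```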
f, g : R2 −→ R.
The set of all x satisfying the constraint g(x) = 0 typically is a curve. For a given constant c, the
set of all x satisfying f (x) = c is called a level set of f . Now assume xo is a locally optimal point
for the problem min_{g(x)=0} f (x); that is, if x is near xo and g(x) = 0, then f (x) ≥ f (xo ).
[Figure: level sets of f , with f decreasing across them in the direction of −∇f ; the constraint curve g = 0 with normal ∇g; at the optimal point x∗ the gradients ∇f and ∇g are parallel.]
Thus there is a scalar λo such that ∇f (xo ) + λo ∇g(xo ) = 0. This implies that the gradient of the
function

f (x) + λo g(x)

equals zero at xo . Defining the Lagrangian L(x, λ) := f (x) + λg(x), the two conditions
∇f (xo ) + λo ∇g(xo ) = 0 and g(xo ) = 0 say precisely that

Lx (xo , λo ) = 0, Lλ (xo , λo ) = 0.
In conclusion, a necessary condition for a point xo to be a local optimum for the problem ming(x)=0 f (x)
is that there exist a point λo such that the derivative of the Lagrangian L(x, λ) equals zero at xo , λo .
3.4 Optimization with Equality Constraints

The general problem is now max_{g(x)=0} f (x), where

f : Rn −→ R, g : Rn −→ Rm .
We begin with some definitions. A subset A of Rn is closed if it contains the limit of every
convergent sequence of points in A; that is, if {xk } is a sequence in A that converges to a point in
Rn , then that limit is actually in A. A subset A of Rn is bounded if there exists r > 0 such that
kxk ≤ r for every x ∈ A. A closed and bounded set is said to be compact. (This is not actually
the definition of a compact set, but in finite dimensional space it’s equivalent.)
Now we look at the constraint set C := {x : g(x) = 0}. If g is continuous, C is closed. This is
pretty immediate: If {xk } is a sequence such that g(xk ) = 0 for all k and if the sequence converges
to, say, x, then g(x) = 0 by continuity.
Now it’s a fact from analysis that a continuous function on a compact set achieves its maximum.
Thus if f and g in the given problem are continuous and if C is bounded (and therefore compact),
then the problem
max f (x)
x∈C
is solvable—there is a maximizer. The trouble is that frequently C isn’t bounded; of course, this
doesn’t imply there isn’t a maximizer.
Now we see the mathematical justification of the method of Lagrange multipliers. For the
problem at hand, define the Lagrangian

L(x, λ) := f (x) + λ^T g(x).
Theorem 3.1 Suppose f, g are C 2 , the problem maxC f (x) has a local solution xo , and gx (xo ) is
surjective. Then there exists a vector λo such that
Lx (xo , λo ) = 0, Lλ (xo , λo ) = 0.
Proof We’ll do only the simpler case where g is linear, g(x) = Ax. Then gx (x) = A and the
hypothesis is that A has rank m. Suppose without loss of generality that

A = [ B C ] ,

with B ∈ Rm×m invertible, and write x = (y, z), y ∈ Rm . Then

g(y, z) = 0 ⇐⇒ By + Cz = 0 ⇐⇒ y = −B^{-1} Cz.
max_{g(x)=0} f (x).

Define

h(z) := f (−B^{-1} Cz, z).

Since xo is an optimizer, so is

zo = [ 0 I ] xo ,

and therefore

hz (zo ) = 0,

that is,

−fy (yo , zo )B^{-1} C + fz (yo , zo ) = 0,

or, in matrix form,

fx (xo ) [ −B^{-1} C ] = 0.    (3.1)
         [     I     ]

Define

λo^T := fx (xo ) [ −B^{-1} ]
                 [    0    ] .

Then

Lx (xo , λo ) = fx (xo ) + λo^T A
             = fx (xo ) + fx (xo ) [ −B^{-1} ] [ B C ]
                                   [    0    ]
             = fx (xo ) + fx (xo ) [ −I  −B^{-1} C ]
                                   [  0      0     ]
             = fx (xo ) [ 0  −B^{-1} C ]
                        [ 0      I     ] ,

which is zero by (3.1). Also

Lλ = g(x),

which vanishes at xo since g(xo ) = 0.
3.5 Application: Sensor Placement
Lloyd’s algorithm in 1D
This algorithm was originally developed for the problem of quantizing data. Let r be a real number
that could take any value in the interval [0, 1]. We want to partition [0, 1] into a finite number,
n, of subintervals, {Vi }i=1,...,n , and then, for each i, designate one point pi in Vi as the codeword.
Then the quantization function would be to map r to pi if r ∈ Vi . The partition and code book are
optimal in a certain sense.
There’s a minor but annoying difficulty with the boundaries of {Vi }i=1,...,n . Strictly speaking
these intervals should not overlap; that is, every point should be in one and only one Vi . But this
complicates the derivation to the point where it obscures the ideas. So we’ll take the subintervals
to be closed and ignore the case where a point lies on the boundaries of two subintervals.
The algorithm is illustrated by an example.
Example (n = 3) Let p1 < p2 < p3 be three arbitrary points in [0, 1]. Construct a partition
{V1 , V2 , V3 } as shown here:
[Figure: the interval [0, 1] divided into V1 , V2 , V3 , with p1 , p2 , p3 inside.]
So V1 is from 0 to the midpoint between p1 and p2 , (p1 + p2 )/2; V2 is from (p1 + p2 )/2 to (p2 + p3 )/2;
and V3 is from (p2 + p3 )/2 to 1. This is called the Voronoi partition¹ V generated by {pi }; the
intervals are uniquely defined by this property: Vi is the set of all points q whose distance from pi
is less than or equal to the distances from all other pj :

Vi = {q ∈ [0, 1] : |q − pi | ≤ |q − pj | for all j}.
Continuing with the algorithm, update pi to be the centre ci of Vi :
[Figures: the partition and points before and after one update, each pi moved to the centre of its interval Vi .]
¹Named after the Ukrainian mathematician Georgy Voronoi (1868–1908).
And so on. Does this procedure converge? Let p be the vector (p1 , p2 , p3 ). Then the update law is
p(k + 1) = Ap(k) + b,
[Figure: the limiting configuration, equal-width intervals with each pi at the centre of its interval.]
Thus the intervals have equal width and the points are their centres, just what you’d like for a
quantizer.
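The 1D iteration is only a few lines. A Python sketch for n = 3 on [0, 1]:

```python
# Lloyd's algorithm on [0, 1] with n = 3 codewords:
# alternately (i) form the Voronoi partition generated by the points and
# (ii) move each point to the centre of its interval.

def lloyd_step(p):
    # Voronoi interval boundaries: 0, the midpoints between neighbours, 1.
    bounds = [0.0] + [(p[i] + p[i + 1]) / 2 for i in range(len(p) - 1)] + [1.0]
    # New point i = centre of the interval [bounds[i], bounds[i+1]].
    return [(bounds[i] + bounds[i + 1]) / 2 for i in range(len(p))]

p = [0.1, 0.2, 0.3]          # arbitrary sorted initial codewords
for _ in range(200):
    p = lloyd_step(p)
# p converges to (1/6, 1/2, 5/6): equal-width intervals with the points
# at their centres, just as described above.
```

The update is affine, p(k + 1) = Ap(k) + b, and for this example the eigenvalues of A are 0, 1/4, 3/4, so the iteration converges linearly.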
Continuous time
There’s a natural continuous-time version of the algorithm.
Example (continued) Think now of pi , ci , and Vi as evolving in continuous time (ci is the centre
of Vi ), with each point moving continuously toward the centre of its cell:

ṗi = ci − pi .

This leads to

ṗ = Ap + b,
Lloyd’s algorithm in 2D
Consider a convex polytope W in R2 of area AW . Its centroid (point of balance) cW satisfies

∫_W (q − cW ) dq = 0,

and therefore

cW = (1/AW ) ∫_W q dq.
[Figure: a convex polytope W with its centroid cW marked.]
Fixed partition
Let’s extend to n sensors. Consider a convex polytope Q in R2 . Suppose W = {Wi }i=1,...,n is a
given partition:
[Figure: a convex polytope Q partitioned into cells W1 , W2 , W3 .]
Now suppose there are n sensors that are to be placed at locations {pi }, one in each cell: pi ∈ Wi :
[Figure: the same partition with a sensor at position pi in each cell Wi .]
The cost function for cell i is H(pi , Wi ) and the total cost function is

H(p, W) = Σ_{i=1}^{n} H(pi , Wi ),

where p = (p1 , . . . , pn ) denotes the vector of sensor positions. Since W1 , . . . , Wn are all disjoint,
[Figure: the Voronoi partition V1 , V2 , V3 of Q generated by p1 , p2 , p3 .]
In mathematical terms,

Vi = {q ∈ Q : ‖q − pi ‖ ≤ ‖q − pj ‖ for all j}.

Each Vi is the intersection of half planes. The picture just shown is called a Voronoi diagram, and
the partition is uniquely determined by p.
Lemma 3.4 For a given p, the unique partition that minimizes H(p, W) is the Voronoi partition,
W = V.
Proof Let's do the case n = 2 for simplicity of explanation. Here's the picture:

[Figure: Q split by the solid Voronoi bisector into V1 , V2 , and by a dashed line into another partition W1 , W2 , with p1 and p2 on opposite sides.]

The solid line through Q defines the Voronoi partition; it bisects the line joining p1 and p2 . Let W
be any other partition, shown by the dashed line. We'll show that H(p, V) ≤ H(p, W), that is,

∫_{V1} ‖q − p1 ‖^2 dq + ∫_{V2} ‖q − p2 ‖^2 dq ≤ ∫_{W1} ‖q − p1 ‖^2 dq + ∫_{W2} ‖q − p2 ‖^2 dq.  (3.2)
The procedure converges asymptotically to a Voronoi partition with pi being the centroid of Vi .
However, the limit may be only a local optimum for the function H. For example:
If the algorithm is initialized in either of the two ways shown, it terminates immediately. But the
right-hand value of H is larger than the left-hand value.
References
1. J. Cortes, S. Martinez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,” IEEE Transactions on Robotics and Automation, 20(2): 243–255, 2004.
3.6 Problems
1. Let f : R −→ R be defined by

f (x) = x^2 sin(1/x) for x ≠ 0, f (0) = 0.
2. Define f : R2 −→ R by

f (x1 , x2 ) = x1 x2 (x1^2 − x2^2 )/(x1^2 + x2^2 ) for x ≠ 0, f (0, 0) = 0.
3. Consider the discrete-time system x(k + 1) = Ax(k) + Bu(k).

Suppose, given x(0), we want to drive the state x(k) to the origin. There are at least two
ways to do this. The feedback method is to choose F so that A + BF is nilpotent, if possible,
and then set u = F x. We'll look at the other method, open-loop control. We have

x(1) = Ax(0) + Bu(0), x(2) = A^2 x(0) + ABu(0) + Bu(1),

and so on until

x(n) = A^n x(0) + W ũ,

where

W = [ B AB · · · A^{n−1} B ], ũ = (u(n − 1), . . . , u(0)).

We assume (A, B) is controllable. Therefore W is surjective and there exists ũ such that
x(n) = 0. We propose to choose ũ such that x(n) = 0 and ‖ũ‖^2 is minimum. The idea
is to drive the state to the origin using minimum energy. This is precisely our constrained
optimization problem with
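For a concrete instance (n = 2, with assumed A, B, x(0) not from the text), the control ũ and the resulting trajectory can be computed directly; here W happens to be square and invertible, so ũ = W^{-1}(−A^2 x(0)) is the unique choice, and hence automatically the minimum-energy one:

```python
# Open-loop transfer to the origin for x(k+1) = A x(k) + B u(k), n = 2:
#   x(2) = A^2 x(0) + B u(1) + A B u(0) = A^2 x(0) + W utilde,
# with W = [B  AB], utilde = (u(1), u(0)).  A, B, x0 are assumed data.

def apply(M, v):
    # 2x2 matrix times 2-vector
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

A = [[1, 1],
     [0, 1]]
B = [0, 1]
x0 = [1, 0]

AB = apply(A, B)                 # (1, 1)
W = [[B[0], AB[0]],
     [B[1], AB[1]]]              # W = [B  AB] = [[0, 1], [1, 1]]

t = [-c for c in apply(A, apply(A, x0))]   # -A^2 x0

# Invert the 2x2 W by the cofactor formula: utilde = W^{-1} t.
det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
u1 = ( W[1][1] * t[0] - W[0][1] * t[1]) / det   # u(1)
u0 = (-W[1][0] * t[0] + W[0][0] * t[1]) / det   # u(0)

# Simulate x(k+1) = A x(k) + B u(k) and confirm x(2) = 0.
x = x0
x = [apply(A, x)[0] + B[0] * u0, apply(A, x)[1] + B[1] * u0]
x = [apply(A, x)[0] + B[0] * u1, apply(A, x)[1] + B[1] * u1]
```

When W is wide rather than square, the minimum-norm choice would instead be ũ = W^T (W W^T )^{-1} (−A^n x(0)), which is exactly the constrained optimization problem above.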
ẋ = −x(‖x‖^2 − 1),
where x is a vector. Find the equilibrium points. Linearize the equation at every equilibrium
and see if you can conclude anything about local stability.
7. Let f : Rn −→ Rn be given by

f (x) = x/‖x‖.

The norm is the Euclidean norm, ‖x‖ = (x^T x)^{1/2} . The definition of f makes sense as long as
x ≠ 0; for completeness you can set f (0) = 0 (the value is irrelevant). Compute fx (x), the
Jacobian of f at x ≠ 0.
8. Solve the problem

min_x ‖c − Ax‖.

Show that there always exists a solution. When is the solution unique?
9. Find the vector in A Ker B that is closest to c, where

A = [ 1  1 ]
    [ 2 −2 ] ,  B = [ 1 1 ],  c = (1, 1, 1).
    [ 3  0 ]
11. This is a problem on sensor placement. Suppose we want to place a sensor to detect, say,
temperature.
(a) Suppose the workspace is the unit interval [0, 1]. We want to place a sensor at a location
p ∈ [0, 1] to get “optimal coverage.” To make this precise, we have to define a measure
of coverage error, denoted H(p). Here are some options:
If q is another point, how well the sensor can measure the temperature at location q
depends on the distance |q − p|. Suppose the temperature error is in fact proportional
to |q − p|. Then the average error is proportional to
H1 (p) = ∫_0^1 |q − p| dq.
Suppose the temperature error is in fact proportional to |q − p|2 . Then the average error
is proportional to
H2 (p) = ∫_0^1 |q − p|^2 dq.
Find the optimal p (the one that minimizes H) in each of the three cases. Is the optimal
p the centre of the interval in all three cases?
(b) Now consider the extension to a convex polygon region in R2 . As in the course notes,
for H2 (p), the optimal p is the centroid of the region. Give an example where this isn’t
true for H1 (p) or H∞ (p).
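For part (a), the costs have simple closed forms that a short Python scan confirms; H∞ is assumed here to be the worst-case distance max_q |q − p| (the text's definition of the third case is not shown above):

```python
# Coverage costs on [0, 1] for a single sensor at p.  The closed forms
# below come from evaluating the integrals by hand; H_infinity (the
# worst-case distance) is an assumed reading of the third case.

def H1(p):    # average of |q - p| over [0, 1]
    return (p**2 + (1 - p)**2) / 2

def H2(p):    # average of |q - p|^2 over [0, 1]
    return p**2 - p + 1/3

def Hinf(p):  # worst-case distance
    return max(p, 1 - p)

grid = [i / 1000 for i in range(1001)]
argmins = [min(grid, key=H) for H in (H1, H2, Hinf)]
# On the interval, all three costs are minimized at the centre p = 1/2.
```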
Part I
Classical Theories
Chapter 4
Calculus of Variations
“300 Years of Optimal Control: From the Brachistochrone to the Maximum Principle,”
H.J. Sussman and J.C. Willems, IEEE Control Systems Magazine, 1997.
Here we study the brachistochrone problem and in a later chapter, the maximum principle. Brachis-
tochrone means “shortest time.”
A tiny spherical wooden bead with a hole drilled through it slides from rest without friction
along a rigid wire:
The starting point A is higher in elevation than the end point B, and the curve of the wire lies in
a vertical plane like this:
CHAPTER 4. CALCULUS OF VARIATIONS 59
[Figure: A at the upper left with the x-axis horizontal, the y-axis pointing down, and B at the lower right.]
We want the bead to slide under the force of gravity from A to B. The two points A and B are fixed
in space, but the wire curve is free for us to design. For what curve does the bead slide from A to
B in minimum time? It’s not the straight line from A to B.
This is the brachistochrone problem. It was worked on by Newton, the Bernoulli brothers,
and other great scientists. The problem is harder than a simple calculus problem because we’re
looking for an optimal curve instead of an optimal number or vector. The space of curves is infinite
dimensional. Let a candidate curve be y(x), 0 ≤ x ≤ x1 .
Claim v^2 = 2gy, where v = ṡ is the speed of the bead.

This follows from the conservation of energy:

(1/2) mv^2 = mgy.
But here’s another derivation:
Proof of claim The force vertically down on the bead is mg and therefore the force tangent to
the curve is

mg (dy/ds).

Newton's second law gives

mg (dy/ds) = m s̈,

i.e.,

g (dy/ds) = s̈.
Multiply by 2ṡ:

2g ẏ = 2ṡ s̈.

Integrate:

∫_0^t 2g ẏ dτ = ∫_0^t 2ṡ s̈ dτ.

Thus

2gy + c = ṡ^2 = v^2 .

Since the bead starts from rest at y = 0, the constant c is zero.
Next, we have

ds^2 = dx^2 + dy^2 ,

and therefore

ṡ^2 = (1 + y'^2 ) ẋ^2 ,

where prime denotes derivative with respect to x. On the left replace ṡ^2 by v^2 = 2gy:

2gy = (1 + y'^2 ) ẋ^2 .

Thus

dt = √[ (1 + y'^2 ) / (2gy) ] dx.

Integrate:

t1 = ∫_0^{x1} √[ (1 + y'(x)^2 ) / (2gy(x)) ] dx.
Changing notation, we have arrived at the problem of finding a curve x(t) to minimize

J(x) = ∫_{t1}^{t2} [ (1 + ẋ^2 ) / (2gx) ]^{1/2} dt.

More generally, the basic problem of the calculus of variations is to minimize

J(x) = ∫_{t1}^{t2} f (t, x(t), ẋ(t)) dt,

where x(t1 ) = x1 , x(t2 ) = x2 are fixed. The function f maps R × Rn × Rn to R and is assumed to
be of class C^2 .
Let us denote by X the vector space of C 1 functions x : R −→ Rn and by Xa the subset such
that x(t1 ) = x1 and x(t2 ) = x2 . This latter is the set of admissible curves. The problem is to find
x ∈ Xa to minimize J(x).
Thus
J : X −→ R.
A function like this whose domain is a function space and whose co-domain is the reals is usually
called a functional. Other interesting examples are the length of a curve, the area surrounded by
a closed curve, etc.
Thus

fx = −(1/(2x)) [ (1 + ẋ^2 ) / (2gx) ]^{1/2} ,

fẋ = (1/√(2gx)) · ẋ / (1 + ẋ^2 )^{1/2} ,

(d/dt) fẋ = −(ẋ/(2x)) · (1/√(2gx)) · ẋ / (1 + ẋ^2 )^{1/2} + (1/√(2gx)) · ẍ / (1 + ẋ^2 )^{3/2} .
Thus the Euler-Lagrange equation is

−(1/(2x)) [ (1 + ẋ^2 ) / (2gx) ]^{1/2} = −(ẋ/(2x)) · (1/√(2gx)) · ẋ / (1 + ẋ^2 )^{1/2} + (1/√(2gx)) · ẍ / (1 + ẋ^2 )^{3/2} .

This simplifies to

2xẍ + ẋ^2 + 1 = 0.

In the original coordinates, with y a function of x, this reads

2yy'' + y'^2 + 1 = 0. (4.1)
Instead of solving this equation, it’s easier to propose a solution and then verify it. The path is a
cycloid, that is, a curve generated by a fixed point moving on a rolling wheel:

[Figure: a wheel of radius r rolling along the x-axis; a marked point on the rim, at angle θ, traces the cycloid, the contact point at distance rθ from the origin.]

The wheel rolls along the x-axis as shown. The black dot traces out a cycloid. Thus r is constant,
θ(t) is a function of time, θ(0) = 0, and

x = rθ − r sin θ, y = r − r cos θ.

Then y' = sin θ/(1 − cos θ) and

y'' = (1/ẋ) (d/dt) y' = (1/(r θ̇ (1 − cos θ))) (d/dt) [ sin θ/(1 − cos θ) ] = −1 / (r (1 − cos θ)^2 ).

Substitute these into (4.1).
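The substitution can be checked numerically. A Python sketch evaluating the residual of (4.1) along the cycloid:

```python
# Verify numerically that the cycloid
#   x = r(theta - sin theta),  y = r(1 - cos theta)
# satisfies the Euler-Lagrange equation 2 y y'' + (y')^2 + 1 = 0,
# using y' = sin(theta)/(1 - cos(theta)) and
# y'' = -1/(r (1 - cos(theta))^2) from the text.

import math

r = 0.7   # any radius works
residuals = []
for theta in (0.5, 1.0, 2.0, 3.0):
    y = r * (1 - math.cos(theta))
    yp = math.sin(theta) / (1 - math.cos(theta))
    ypp = -1 / (r * (1 - math.cos(theta)) ** 2)
    residuals.append(2 * y * ypp + yp ** 2 + 1)
# Each residual is zero up to rounding.
```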
For the terminal point (2, −1), the graph is this: [Figure: the minimizing cycloid from the origin to (2, −1).]
The proof of Theorem 4.1 requires a lemma. Recall the spaces X and Xa . Define X0 to be the
subspace of X of functions h(t) that equal zero at the two times t1 , t2 .
Let c denote the average value of y over [t1 , t2 ],

c = (1/(t2 − t1 )) ∫_{t1}^{t2} y(τ ) dτ,

and let

h(t) = ∫_{t1}^t [y(τ ) − c] dτ.

Then h ∈ X0 and
∫_{t1}^{t2} ‖y(t) − c‖^2 dt = ∫_{t1}^{t2} [y(t) − c]^T [y(t) − c] dt
                          = ∫_{t1}^{t2} [y(t) − c]^T ḣ(t) dt
                          = ∫_{t1}^{t2} y(t)^T ḣ(t) dt − c^T [h(t2 ) − h(t1 )]
                          = 0.
Thus y(t) = c.
Proof of Theorem 4.1 Let h ∈ X0 . Then for every ε, xo + εh is in Xa and so J(xo + εh) has a
minimum at ε = 0. Thus

(d/dε) J(xo + εh)|_{ε=0} = 0.

We have

(d/dε) J(xo + εh)|_{ε=0} = ∫_{t1}^{t2} (d/dε) f [t, xo + εh, ẋo + εḣ]|_{ε=0} dt
                         = ∫_{t1}^{t2} [ fx (t, xo , ẋo )h + fẋ (t, xo , ẋo )ḣ ] dt.

Therefore we have

∫_{t1}^{t2} [ fx (t, xo , ẋo )h + fẋ (t, xo , ẋo )ḣ ] dt = 0.

Define g(t) := ∫_{t1}^t fx (τ, xo , ẋo ) dτ . Integrating the first term by parts and using h(t1 ) = h(t2 ) = 0,
we get

(d/dε) J(xo + εh)|_{ε=0} = ∫_{t1}^{t2} [ −g + fẋ (t, xo , ẋo ) ] ḣ dt = 0.
Example

ẋ = Ax + u.

Given x(0), find the minimum energy u such that x(1) = 0. To set this up, we have

∫_0^1 ‖u(t)‖^2 dt = ∫_0^1 ‖ẋ(t) − Ax(t)‖^2 dt.

So we define

f (t, x, ẋ) = ‖ẋ − Ax‖^2 .

Then

fx = −2ẋ^T A + 2x^T A^T A, fẋ = 2ẋ^T − 2x^T A^T .

The Euler-Lagrange equation reduces to

ẍ + (A^T − A)ẋ − A^T Ax = 0.
This is a two-point boundary-value problem. (See one of the exercises.) The corresponding optimal
control is uo = ẋo − Axo .
4.4 Problems
1. In the (t, x)-plane, the problem is to find the curve x(t) from (t1 , x1 ) to (t2 , x2 ) of minimum
length (a straight line). Formulate the problem by writing the length of a curve as an integral
with respect to t, and solve by the calculus of variations.
2. Consider (x, y, t)-space and consider a curve x(t) in the (t, x)-plane from (t1 , x1 ) to (t2 , x2 ).
Rotate this curve about the t-axis. The surface area is
2π ∫_{t1}^{t2} x √(1 + ẋ^2 ) dt.
[Figure: the curve from (t1 , x1 ) to (t2 , x2 ) and the surface obtained by rotating it about the t-axis.]
where x(t) ∈ R. The values t1 and t2 are fixed. Also, x(t1 ) is fixed, but (unlike in Section 3.2)
x(t2 ) is unconstrained. Let h(t) be an arbitrary C 1 function such that h(t1 ) = 0. It can be
derived (don’t you do it) that
(d/dε) J(x + εh)|_{ε=0} = ∫_{t1}^{t2} [ fx − (d/dt) fẋ ] h dt + fẋ (t2 , x(t2 ), ẋ(t2 ))h(t2 ).
Assume xo is locally optimal. Then the first term on the right-hand side equals zero, by
Theorem 4.1. It follows that

fẋ (t2 , xo (t2 ), ẋo (t2 ))h(t2 ) = 0.

This must hold for every h(t2 ). It then follows that fẋ (t2 , xo (t2 ), ẋo (t2 )) must equal zero.
The Problem Using the theory just presented, solve the brachistochrone problem (minimum-
time path) where A is at the origin but the point B can be any point on the vertical line shown:
[Figure: A at the origin, the x-axis horizontal, the y-axis pointing down, and B free to lie anywhere on the vertical line x = 1.]
5. For the brachistochrone problem, derive the formula t1 = θ1 /√g. (I think the formula is
correct.)
6. Solve the problem of moving a point p(t) in the plane from p(0) = (1, 1) to p(2) = (0, 0) while
minimizing the average velocity squared
(1/2) ∫_0^2 ‖ṗ(t)‖^2 dt.
Chapter 5

The Maximum Principle
The Mathematical Theory of Optimal Processes, L.S. Pontryagin, V.G. Boltyanskii, R.V.
Gamkrelidze, and E.F. Mischenko, Wiley, New York, 1962
We’ll do only a very brief introduction to this approach. You should see Athans and Falb for
the wealth of types of problems that can be formulated. Here we do just two special problems to
illustrate. Proofs are omitted.
M ÿ = u.
We say this system is a double integrator, because y is proportional to the double integral of u. In
fact, we might as well take M = 1 (by redefining u to be u/M ). The problem is to drive the cart
from any (y(0), ẏ(0)) to (y = 0, ẏ = 0) in minimum time. This makes sense only if u is bounded,
say |u(t)| ≤ 1.
Why can’t we use the calculus of variations on this problem?
We shall see that the optimal control signal u° is always either +1 or −1. That is, for every t
we have either u°(t) = 1 or u°(t) = −1. Such a control is said to be a bang-bang control law—it
jumps from one extreme value to the other. Let's use that knowledge and continue.
Take the natural state model x = (y, ẏ):
ẋ = Ax + Bu,  A = [0 1; 0 0],  B = [0; 1].
For u = +1, ÿ = 1 and ẏ is therefore increasing. This graph shows the vector field and two
trajectories—one starting at the origin and the other starting at (1, −1):
[Figure: the vector field in the (y, ẏ)-plane for u = +1, with two parabolic trajectories.]
This leads to the switching curve: The part in the second quadrant is the trajectory backward in
time from the origin with u = −1; the part in the fourth quadrant is the trajectory backward in
time from the origin with u = 1:
[Figure: the switching curve in the (y, ẏ)-plane.]
Every optimal trajectory has at most one switch, to get onto this switching curve. Here's the
optimal trajectory starting in the first quadrant:
[Figure: an optimal trajectory from the first quadrant, reaching the switching curve and following it to the origin.]
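The switching-curve logic can be implemented as a state-feedback law and checked numerically. A minimal sketch: the two parabolic branches through the origin combine into the formula y = −ẏ|ẏ|/2 for the switching curve (our derivation from the trajectories above); function names and tolerances are ours.

```python
import math

def bang_bang(y, ydot):
    """Time-optimal control for the double integrator ÿ = u, |u| <= 1.
    The switching curve is y = -ydot*|ydot|/2; for states on one side
    use u = -1, on the other u = +1."""
    s = y + ydot * abs(ydot) / 2.0
    return -1.0 if s > 0 else 1.0

def simulate(y0, ydot0, dt=1e-3, t_max=10.0):
    """Euler-integrate the closed loop; return the closest approach to the origin."""
    y, ydot = y0, ydot0
    best = math.hypot(y, ydot)
    t = 0.0
    while t < t_max:
        u = bang_bang(y, ydot)
        y += dt * ydot
        ydot += dt * u
        t += dt
        best = min(best, math.hypot(y, ydot))
    return best

print(simulate(1.0, 1.0))  # closest approach to the origin (small)
```

Starting from (1, 1) in the first quadrant, the simulated trajectory decelerates with u = −1, meets the switching curve, and rides it to a small neighborhood of the origin.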
ẋ = f (x, u).
Theorem 5.1 Assume a local optimum uo exists and let xo be the corresponding state. Then the
following conditions hold:
1. There exists λo such that xo , λo satisfy
The form of the theorem is actually a minimum principle. This is because the problem studied
is one of minimization. The name “Maximum Principle” is used nevertheless. Notice that the
theorem gives a necessary condition for a local optimum.
ẋ = u.
This represents a point, in the plane, under velocity control. We want to steer the state from
x(0) = 0 to x(1) = v, a given target vector, while minimizing the control energy
J(u) = ∫₀¹ ‖u(t)‖² dt,
and subject to the constraint ku(t)k ≤ 1. We expect the optimal state trajectory to be a straight
line.
The Hamiltonian is
H(x, λ, u) = ‖u‖² + λᵀu.
Thus λ° is a constant. The third condition in the theorem leads to 2u°(t) + λ°(t) = 0, and hence
the optimal u is a constant vector too: u°(t) = c₁. Thus the problem is solvable only if ‖c₁‖ ≤ 1.
Assuming this inequality, we have from
ẋ = Ax + Bu.
Controllability of (A, B) is assumed. The goal is to drive the state from x(0) = x0 to x(t1 ) = 0 in
minimum time t1 . The control signal u is required to be piecewise continuous and to satisfy the
constraint u ∈ Ω, defined as the unit cube, i.e.,
u ∈ Ω iff (∀i)|ui | ≤ 1.
Theorem 5.2 Assume a local optimum u° and t₁° exist and let x° be the corresponding state. Then
the following conditions hold:
This implies that uoi (t), the ith component of the optimal control, equals −1 if the ith component
of B T λo (t) is positive and equals +1 if the ith component of B T λo (t) is negative. Thus the optimal
control is confined to the set of vertices of Ω. This is called bang-bang control.
λ̇°₁ = 0,  λ̇°₂ = −λ°₁.
The solution is
λ°₁(t) = c₁,  λ°₂(t) = c₂ − c₁t
for constants c₁, c₂. Since Bᵀλ = λ₂, we get that u°(t) equals −1 if λ°₂(t) is positive and equals +1 if λ°₂(t) is negative.
We conclude that the optimal control signal equals ±1 and switches at most once. That’s all the
theorem gives us. More details, such as the equation of the switching curve, have to be derived as
in the first section of the chapter.
5.3 Problems
1. Think of the bang-bang optimal control signal for the double integrator. Discuss how it
could be implemented. What sensors would be required? Is it a feedback controller? Is this
controller robust to sensor noise and modeling errors?
Chapter 6
Dynamic Programming
Dynamic programming (DP) is a clever approach to certain types of optimization problems. It was
developed by Richard Bellman and made popular in his book.
6.1 Examples
Example Let {x₁, . . . , xₙ} be a finite sequence of real numbers and consider the problem minᵢ xᵢ
of finding the minimum. If asked to write a program to solve this, you would undoubtedly write
this to compute the minimum, a:
a = x1 ;
for i = 2 : n
a = min(a, xi );
end
The DP method is exactly the same except in reverse order. Define the value function
V(i) = min {xᵢ, . . . , xₙ},
that is, V(i) is the minimum “cost-to-go” starting at xᵢ. The value V(1) is what we seek.
Of course, V(n) = xₙ. Suppose we know V(i) for some i, 1 < i ≤ n. Then
V(i − 1) = min {xᵢ₋₁, . . . , xₙ} = min {xᵢ₋₁, V(i)}.
The recursion is therefore
V(n) = xₙ,  V(i − 1) = min {xᵢ₋₁, V(i)}.
Thus the minimization problem is a recursion of small minimization problems over just pairs of
numbers. There are n − 1 compare operations, and so the complexity is linear.
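The forward loop and the backward value-function recursion compute the same number; a quick sketch (function names ours):

```python
def forward_min(x):
    """The obvious forward pass: a = x_1, then a = min(a, x_i)."""
    a = x[0]
    for xi in x[1:]:
        a = min(a, xi)
    return a

def dp_min(x):
    """Backward recursion on the value function:
    V(n) = x_n, V(i-1) = min(x_{i-1}, V(i)); V(1) is the answer."""
    V = x[-1]
    for xi in reversed(x[:-1]):
        V = min(xi, V)
    return V

data = [3.0, -1.5, 4.0, 0.25, 2.0]
print(forward_min(data), dp_min(data))  # both give -1.5
```

Both versions use n − 1 comparisons, so the complexity is linear either way.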
[Figure: a staged graph; among the node labels are n₁₂, n₁₃ at stage 1 and n₂₂, n₂₃ at stage 2.]
The nodes are labeled nij , where i is interpreted as the stage and j as the node number at that
stage. Thus there’s one node at stage 0, three nodes at stage 1, etc. One wants to travel from the
start node n01 to the end node n31 with minimum cost. Each link has a cost, labeled like this (not
all are shown):
[Figure: some link costs shown, e.g. c⁰₁₁, c⁰₁₂, c⁰₁₃ and c¹₁₁, c¹₁₂.]
Thus, cᵏᵢⱼ is the cost of the link from node i at stage k to node j at stage k + 1. The cost of a path is defined
to be the sum of the costs of the links.
We define the value function, a real-valued function of the nodes, as follows: V (nij ) is the
minimum cost to go from node nij to the end node. The value function at stage 3 is obviously 0.
Thus V(n₃₁) = 0. The value function at stage 2 is obviously just the cost of the last link:
V(n₂ⱼ) = c²ⱼ₁,  j = 1, 2, 3.
Now to the value function at stage 1. We will invoke the so-called principle of optimality:
Consider an optimal path from n01 to n31 ; if this path goes through node n1j at stage 1, then the
subpath from node n1j to n31 is optimal too. That is, for every optimal path, the cost-to-go is
minimum at each point along the path. Note that we’re not saying the initial subpath is optimal,
but rather the cost-to-go is. Thus at node n₁₁, since there are just three links out, we have
V(n₁₁) = min {c¹₁₁ + V(n₂₁), c¹₁₂ + V(n₂₂), c¹₁₃ + V(n₂₃)}.
After the other values are computed at stage 1, one computes V(n₀₁), which equals the minimum
cost path from start to end. After the value function is computed at every node, it’s easy to find
optimal paths by moving left to right.
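The backward recursion for a staged graph can be sketched as follows; the graph data here is a small made-up example, not the one in the figure, and the data layout is ours:

```python
def shortest_path_value(costs):
    """Backward DP over a staged graph.

    costs[k][i][j] = cost of the link from node i at stage k to node j
    at stage k+1 (0-based indices).  Returns V, a list of dicts with
    V[k][i] = minimum cost-to-go from node i at stage k to the end.
    """
    n_stages = len(costs)
    V = [dict() for _ in range(n_stages + 1)]
    # Value at the final stage is zero.
    for links in costs[-1].values():
        for j in links:
            V[n_stages][j] = 0.0
    # Backward recursion: V[k][i] = min_j (costs[k][i][j] + V[k+1][j]).
    for k in reversed(range(n_stages)):
        for i, links in costs[k].items():
            V[k][i] = min(c + V[k + 1][j] for j, c in links.items())
    return V

# A small illustrative graph: 1 start node, 2 middle nodes, 1 end node.
costs = [
    {0: {0: 1.0, 1: 4.0}},          # stage 0 -> stage 1
    {0: {0: 5.0}, 1: {0: 1.0}},     # stage 1 -> stage 2 (end)
]
V = shortest_path_value(costs)
print(V[0][0])  # minimum cost start-to-end: min(1+5, 4+1) = 5.0
```

Once V is known at every node, an optimal path is recovered left to right by always moving to a node achieving the minimum.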
ẋ = Ax + Bu,  x(0) = x₀,  J(x₀, u) = ∫₀^∞ (xᵀQx + uᵀRu) dt.
The arguments of J are the initial state x₀ and the input signal u. Implicitly, u is such that J is
finite. This problem is a special case of minimizing ∫₀^∞ L(x, u) dt subject to ẋ = f(x, u); for the
general problem define the value function
V(τ, ξ) = min_u ∫_τ^∞ L(x, u) dt,  x(τ) = ξ.
The argument t in the integrand has been dropped for convenience. Because A, B, Q, R are constant
matrices and the upper limit on the integral is ∞, you can check that V (τ, ξ) is independent of τ ,
that is, V is a function of only ξ in this instance. Nevertheless, we’ll keep the two arguments in
order to get the general HJB equation.
For any δτ > 0 we have
∫_τ^∞ L(x, u) dt = ∫_τ^{τ+δτ} L(x, u) dt + ∫_{τ+δτ}^∞ L(x, u) dt.
Let u_τ denote the piece of u defined over (τ, τ + δτ) and u^τ the piece of u defined over (τ + δτ, ∞).
Then the term ∫_τ^{τ+δτ} L(x, u) dt is a function of x(τ) and u_τ, while ∫_{τ+δτ}^∞ L(x, u) dt is a function of
x(τ + δτ) and u^τ; but x(τ + δτ) is a function of x(τ) = ξ and u_τ. Minimizing over u we get
V(τ, ξ) = min_u ∫_τ^∞ L(x, u) dt
 = min_{u_τ} min_{u^τ} [ ∫_τ^{τ+δτ} L(x, u) dt + ∫_{τ+δτ}^∞ L(x, u) dt ]
 = min_{u_τ} [ ∫_τ^{τ+δτ} L(x, u) dt + min_{u^τ} ∫_{τ+δτ}^∞ L(x, u) dt ]
 = min_{u_τ} [ ∫_τ^{τ+δτ} L(x, u) dt + V(τ + δτ, x(τ + δτ)) ].
and therefore
V(τ + δτ, x(τ + δτ)) = V(τ, ξ) + δτ (∂V/∂τ)(τ, ξ) + δτ (∂V/∂x)(τ, ξ) f(ξ, u(τ)).
Also,
∫_τ^{τ+δτ} L(x, u) dt = δτ L[ξ, u(τ)].
Thus we have
V(τ, ξ) = min_{u(τ)} { δτ L[ξ, u(τ)] + V(τ, ξ) + δτ (∂V/∂τ)(τ, ξ) + δτ (∂V/∂x)(τ, ξ) f(ξ, u(τ)) }
and hence
0 = min_{u(τ)} { L[ξ, u(τ)] + (∂V/∂τ)(τ, ξ) + (∂V/∂x)(τ, ξ) f(ξ, u(τ)) }.
In this equation, ξ and u(τ ) are dummy variables. Let’s replace them by x ∈ Rn and u ∈ Rm :
min_u { L(x, u) + (∂V/∂τ)(τ, x) + (∂V/∂x)(τ, x) f(x, u) } = 0.
As mentioned at the start of the derivation, V (τ, x) is a function only of x: V (x). So in this
time-invariant case the HJB equation is
min_u { L(x, u) + (dV/dx)(x) f(x, u) } = 0.
The derivation of this equation wasn’t rigorous. What we have is an equation satisfied by an
optimal control law, u as a function of x, and the value function, V (x), under certain conditions, if
an optimal control exists. What to do with the equation? Solve the minimization problem on the
left-hand side for u as a function of x and dV /dx; then equate the left-hand side to zero.
As an example, take the scalar system and cost given by
f(x, u) = x + u,  L(x, u) = x² + u².
The HJB equation is
min_u { x² + u² + Vₓ(x + u) } = 0.
To do the minimization, differentiate with respect to u and set the derivative to zero:
2u + Vₓ = 0, that is, u = −Vₓ/2. Substituting this u into
x² + u² + Vₓ(x + u) = 0
gives x² + Vₓx − Vₓ²/4 = 0, and hence Vₓ = 2(1 ± √2)x. Thus
u = −(1/2)Vₓ = −(1 ± √2)x.
Only the solution
u = −(1 + √2)x
yields a finite J: with it the closed-loop system is ẋ = −√2 x, which is stable, whereas the other
solution gives the unstable closed-loop system ẋ = √2 x.
For the general LQR problem, L(x, u) = xᵀQx + uᵀRu and f(x, u) = Ax + Bu, and setting to zero
the derivative with respect to u of the quantity being minimized in the HJB equation gives
2uᵀR + VₓB = 0,
and so
u = −(1/2) R⁻¹ Bᵀ Vₓᵀ.
Substituting into
xᵀQx + uᵀRu + Vₓ(Ax + Bu) = 0
gives
xᵀQx + VₓAx − (1/4) Vₓ B R⁻¹ Bᵀ Vₓᵀ = 0.
Now we somehow have to find a V(x) satisfying this equation and such that u makes J finite. It
turns out that a quadratic function will work: V(x) = xᵀPx with P symmetric. Substituting
Vₓ = 2xᵀP into the equation gives
xᵀQx + 2xᵀPAx − xᵀP B R⁻¹ Bᵀ Px = 0,
or equivalently, to get a symmetric matrix,
xᵀQx + xᵀPAx + xᵀAᵀPx − xᵀP B R⁻¹ Bᵀ Px = 0.
This can be written as
xT (Q + P A + AT P − P BR−1 B T P )x = 0
and this leads to the Riccati equation:
Q + P A + AT P − P BR−1 B T P = 0.
It remains to study when this equation has a solution P such that J is finite for
u = −R−1 B T P x.
We’ll do this in a later chapter.
Let’s summarize what we’ve done. We assumed an optimal control law exists and we derived a
formula for it, but we don’t know when it is valid, that is, we don’t know if the Riccati equation
has a solution, and, if it does, we don’t know if J is finite. This is typical of DP. It provides an
existence condition but you still have to do a lot of work.
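As a sanity check, for the scalar example above (A = B = Q = R = 1) the Riccati equation reduces to 1 + 2p − p² = 0, and its positive root reproduces the control law found from the HJB equation:

```python
import math

# Scalar Riccati equation Q + PA + A^T P - P B R^{-1} B^T P = 0
# for the example f(x,u) = x + u, L(x,u) = x^2 + u^2 (A = B = Q = R = 1):
#     1 + 2p - p^2 = 0  =>  p = 1 + sqrt(2)  (positive root).
p = 1 + math.sqrt(2)
residual = 1 + 2 * p - p ** 2
print(abs(residual))  # ~0 up to rounding

# Feedback u = -R^{-1} B^T P x = -(1 + sqrt(2)) x, so the closed loop
# is xdot = x + u = -sqrt(2) x, which is stable.
print(-p)  # -(1 + sqrt(2)) ≈ -2.4142
```

Picking the positive root of the Riccati equation here corresponds exactly to the choice u = −(1 + √2)x made earlier on stability grounds.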
6.3 Problems
1. Find the minimum cost path.
[Figure: a staged graph from start to finish with link costs labeled.]
2. Discrete-time LQR:
Preamble to Part II
In the next three chapters we formulate optimal control problems in terms of signal norms. This
leads to function spaces and operators on them. This is a branch of mathematics called functional
analysis. Here we introduce and motivate the main ideas.
The performance of a system should be measured by norms, for example, how large a tracking
error is, or how large a control signal is. So let us begin with some familiar terms for deterministic
signals. For a sinusoidal signal x(t) = A cos(ωt + φ), the zero-to-peak value is |A|. We write this as
‖x‖∞ and call it the infinity norm: ‖x‖∞ = maxₜ |x(t)|. In an electric circuit, the power dissipated
in a resistor at time t is i(t)²R and the energy dissipated is the integral of this over time. Extending
this, we shall think of the energy of a signal x(t) as ∫ x(t)² dt and we shall write this as ‖x‖₂²,
the square of the 2-norm. Thus for deterministic signals we have two norms to measure signal size:
‖x‖∞, the zero-to-peak value, and ‖x‖₂, the square root of the energy.
Then there are random signals. Let x be a zero-mean random variable. Its root-mean-square
(rms) value qualifies as a norm: ‖x‖ = (E x²)^{1/2}. This extends to a random vector: The norm is
‖x‖ = (E xᵀx)^{1/2}, which can also be written ‖x‖ = [Tr(E xxᵀ)]^{1/2}, where Tr denotes trace. This
extends to zero-mean stationary random signals.
Now we turn to norms of systems. To get a glimpse of this concept, consider the equation
y = Au, where u, y are vectors and A is a matrix. We shall think of this equation as defining a
system—input u, output y. We are going to define two norms for this system: The first is for a
specific input, and the second is what is called an induced norm.
For the first system norm, suppose u is a zero-mean white random vector, that is, its covariance
matrix equals the identity matrix: E uuᵀ = I. The term “white” refers to the fact that two different
components of u are uncorrelated. The covariance matrix of y equals AAᵀ and therefore the norm
of y equals (Tr AAᵀ)^{1/2}, or equivalently, (Tr AᵀA)^{1/2}. This motivates introducing the Frobenius
norm of a matrix:
‖A‖_F = (Tr AᵀA)^{1/2}.
You can check that this equals (Σᵢ,ⱼ aᵢⱼ²)^{1/2}. Thus, the Frobenius norm of A equals the rms output
when the input is the standard white vector.
The second system norm is defined by saying that its square equals the maximum output energy
when the input energy equals 1:
‖A‖ = max_{‖u‖=1} ‖Au‖.
It is a fact that this induced norm equals σ_max(A), the largest singular value of A.
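Both system norms are easy to check numerically; a sketch with NumPy (the random search only approaches the induced norm from below):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))

# Frobenius norm: (Tr A^T A)^{1/2}, which equals (sum of squared entries)^{1/2}.
fro1 = np.sqrt(np.trace(A.T @ A))
fro2 = np.sqrt((A ** 2).sum())
assert np.isclose(fro1, fro2)

# Induced 2-norm: max ||Au|| over ||u|| = 1, which equals the largest
# singular value of A.
sigma_max = np.linalg.svd(A, compute_uv=False)[0]
best = max(
    np.linalg.norm(A @ (u / np.linalg.norm(u)))
    for u in rng.standard_normal((2000, 4))
)
print(fro1, sigma_max, best)  # best approaches sigma_max from below
assert best <= sigma_max + 1e-9
```

The random search never exceeds σ_max, illustrating that the induced norm is the worst-case gain over all unit-energy inputs.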
Summary of Spaces

Chapter 7
Introduction to Function Spaces
A norm ‖·‖ on a vector space must satisfy ‖x‖ > 0 for all x ≠ 0,
‖cx‖ = |c| ‖x‖,
‖x + y‖ ≤ ‖x‖ + ‖y‖.
A vector space with a norm is a normed space.
Inner products
The space Rⁿ has the inner product ⟨x, y⟩, also written as a dot product. If x, y are regarded as
column vectors, then the inner product can also be written xᵀy. Finally, the Euclidean norm can
be defined in terms of the inner product: ‖x‖ = ⟨x, x⟩^{1/2}.
In a general vector space X, an inner product ⟨x, y⟩ must have three properties: ⟨x, y⟩ = ⟨y, x⟩;
for every y the map x ↦ ⟨x, y⟩ is linear; and ⟨x, x⟩ is positive for all nonzero x. A vector space with
an inner product is an inner product space. The space C[t₁, t₂] with the norm ‖x‖₂ is an inner
product space, the inner product being
⟨x, y⟩ = ∫_{t₁}^{t₂} x(t) y(t) dt.
Completeness
Consider the set R and a sequence {aₙ} in it. It is called a Cauchy sequence if
(∀ε > 0)(∃N)(∀n, m > N) |aₙ − aₘ| < ε.
Evidently this sequence is “trying to converge” in the sense that the elements in the sequence are
getting closer and closer together. That it does converge in R is a feature of that set, a feature
called completeness: Every Cauchy sequence in R has a limit in R. For example, the interval
(0, 1] is not complete, because 1/n is a Cauchy sequence that converges to 0 6∈ (0, 1].
In a normed space, every convergent sequence is a Cauchy sequence (a good exercise). If,
conversely, every Cauchy sequence converges in the space, the space is said to be complete. A
complete normed space is called a Banach space; a complete inner-product space is called a
Hilbert space. The advantage of completeness is that in principle one doesn’t have to know a
limit to test if a sequence converges.
The space C[t1 , t2 ] with the norm kxk∞ is a Banach space, while C[t1 , t2 ] with the norm kxk2 is
not. To see this latter fact, note that the sequence
[Figure: continuous functions x₁, x₂, x₃, . . . on [t₁, t₂], steepening toward a step.]
is a Cauchy sequence in the norm ‖x‖₂, but it converges to a step function, which is not continuous
and therefore not in C[t₁, t₂]. It is not a Cauchy sequence in the norm ‖x‖∞.
Consider again C[t1 , t2 ] with the norm kxk∞ . The set of polynomial functions of t is a subset,
P[t1 , t2 ], of C[t1 , t2 ]. It is a subspace, that is, it is a vector space itself, but it is not complete. For
example, the function x(t) = sin t belongs to C[t₁, t₂]; if xₙ(t) denotes the truncation at the nth
term of the Taylor series of x(t), then xₙ ∈ P[t₁, t₂]; also, xₙ converges to x in the sense that
lim_{n→∞} ‖x − xₙ‖∞ = 0.
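This uniform convergence is easy to observe numerically; a sketch (the interval and truncation orders are arbitrary choices of ours):

```python
import math
import numpy as np

def taylor_sin(t, n_terms):
    """Truncated Taylor series of sin t with n_terms nonzero terms."""
    return sum((-1) ** k * t ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))

t = np.linspace(0.0, 2 * np.pi, 1001)   # the interval [t1, t2] = [0, 2*pi]
sup_errors = [np.max(np.abs(np.sin(t) - taylor_sin(t, n)))
              for n in (2, 4, 6, 8, 10)]
print(sup_errors)  # decreasing toward 0: convergence in the sup norm
```

The printed sup-norm errors shrink rapidly as terms are added, which is exactly convergence in the norm ‖x‖∞.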
The closure of P[t₁, t₂], denoted P̄[t₁, t₂], is defined to be the set of limits of all sequences in P[t₁, t₂]
that converge in C[t₁, t₂]. The Weierstrass approximation theorem says that
P̄[t₁, t₂] = C[t₁, t₂],
that is, any continuous function on a closed interval can be approximated uniformly by a polynomial.
Now let's enlarge C[t₁, t₂]. There are certainly discontinuous functions that are bounded and
therefore for which the norm ‖x‖∞ is finite. We are going to call this class of functions L∞[t₁, t₂].¹
With appropriate consideration, L∞[t₁, t₂] is a Banach space.
On the other hand, consider C[t1 , t2 ] with the norm kxk2 . As we saw, it’s not complete. But
it can be embedded in a complete normed space, which is denoted L2 [t1 , t2 ]. The construction of
this completion is somewhat involved, hence we omit it. For us, it’s good enough to accept that
L2 [t1 , t2 ] contains functions x for which there is a sequence {xn } of continuous functions, or even
polynomials, such that
lim_{n→∞} ‖x − xₙ‖₂ = 0.
Consider the subset c_dn of ℓ² consisting of signals of duration n, that is, x(k) = 0 for k > n. It's routine to
prove that c_dn is a subspace of the vector space ℓ², and that furthermore it is a closed set within
ℓ². Being closed, c_dn is complete, and therefore is a Hilbert space. Now consider the subset c_fd of
signals of finite duration, that is, signals x for which x(k) = 0 beyond some finite time.
Again, c_fd is a subspace of the vector space ℓ². However c_fd is not a closed set within ℓ². To see
this, note that the sequence x_i,
x_i(k) = 1/2ᵏ for k ≤ i,  x_i(k) = 0 for k > i,
converges in ℓ² to the signal x(k) = 1/2ᵏ, which is not of finite duration.
Vector-valued functions
Now we turn to vector-valued signals. Let x(t) denote a function where t is a real number and x(t)
is an n-dimensional real vector. The L2 -norm of x is defined to be
‖x‖₂ = ( ∫_{−∞}^{∞} ‖x(t)‖² dt )^{1/2}.
¹Actually, we're glossing over a subtle point. Let x(t) be the function defined on the interval [0, 1] by saying that
x(t) = n for t = 1/n, n ≥ 1, and x(t) = 0 otherwise. Sketch the graph of this function. It is unbounded but is zero
except at a countable number of points. We say that x equals zero almost everywhere and we set ‖x‖∞ = 0.
The norm kx(t)k is the Euclidean norm of the vector x(t). Usually it is irrelevant what the dimension
is so we continue to write the space as L2 (R).
For a bounded signal the norm is
‖x‖∞ = sup_t ‖x(t)‖.
Again, the right-hand norm is the Euclidean norm. We write L∞(R) for the class of such functions.
We have just seen the definition of Hilbert space: a complete inner product space. Thus Hilbert
space is an abstract concept for which there are many instances, L2 [t1 , t2 ] being one. This idea of
defining an abstract concept is very common in mathematics because one can get a general result
that applies in many instances.
In a classification of spaces, Hilbert and Banach spaces sit here:
[Diagram: topological space ⊃ normed space ⊃ inner product space; complete normed spaces are Banach spaces, complete inner product spaces are Hilbert spaces.]
Lemma (Cauchy–Schwarz inequality) In an inner product space, |⟨x, y⟩| ≤ ‖x‖ ‖y‖.
Proof If y = 0 then both sides of the inequality equal 0. So now assume y ≠ 0. Define
c = ⟨x, y⟩ / ‖y‖².
Then
0 ≤ ‖x − cy‖² = ⟨x − cy, x − cy⟩ = ‖x‖² − c⟨y, x⟩ − c⟨x, y⟩ + c²‖y‖².
Since c²‖y‖² = c⟨x, y⟩, this gives
c⟨x, y⟩ ≤ ‖x‖²,
that is, ⟨x, y⟩² ≤ ‖x‖² ‖y‖², and so |⟨x, y⟩| ≤ ‖x‖ ‖y‖.
The proof is left as an exercise. The picture going with the lemma is this:
[Figure: a parallelogram with sides x, y and diagonals x + y, x − y.]
In an inner product space, we say x, y are orthogonal and write x ⊥ y if ⟨x, y⟩ = 0. For
example, eʲⁿᵗ ⊥ eʲᵐᵗ in L2[0, 2π] for n ≠ m. Here, the field of scalars is C and the inner product is
⟨x, y⟩ = ∫₀^{2π} x(t) ȳ(t) dt.
A set V in a vector space is convex if, whenever x, y ∈ V, all points on the line from x to y are
in V, i.e.,
λx + (1 − λ)y ∈ V for all 0 ≤ λ ≤ 1.
Theorem 7.1 Let X be an inner product space, V a complete convex subset, and x ∈ X . There is
a unique vector in V that is closest to x.
Proof Define
δ = inf_{v∈V} ‖x − v‖.
Then there is a sequence {vn } in V such that kx − vn k converges to δ. If we can show that {vn } is
a Cauchy sequence, then, because V is complete, v o = limn vn exists, belongs to V, and is closest to
x.
To prove the sequence is a Cauchy sequence, by the parallelogram equality we have
‖v_n − v_m‖² = 2‖v_n − x‖² + 2‖v_m − x‖² − 4‖(v_n + v_m)/2 − x‖².
By convexity (v_n + v_m)/2 belongs to V, and therefore
‖(v_n + v_m)/2 − x‖ ≥ δ.
Therefore
‖v_n − v_m‖² ≤ 2‖v_n − x‖² + 2‖v_m − x‖² − 4δ².
The right-hand side is arbitrarily small for n, m sufficiently large. This proves the Cauchy property.
Finally, uniqueness is proved like this. Suppose
‖x − v°‖ = ‖x − v‖ = δ.
By the parallelogram equality again, ‖v° − v‖² = 4δ² − 4‖(v° + v)/2 − x‖² ≤ 0. Thus v° = v.
The preceding result is not true in general in a normed space, as shown in an exercise.
A subspace V in a Hilbert space X may not be closed, as we saw. Its orthogonal complement,
denoted V ⊥ , is the set of vectors orthogonal to every vector in V. The set V ⊥ is a subspace and it
is closed. If V is closed, then
X = V ⊕ V⊥,
that is, every x in X can be written uniquely as x = v + w with v ∈ V and w ∈ V⊥.
Theorem 7.2 Let X be a Hilbert space and V a closed subspace. Let x ∈ X and let v o be the vector
in V that is closest to x. Then x − v o ⊥ V.
Proof Suppose not. Then there is a vector v in V of unit norm and such that
⟨x − v°, v⟩ = c ≠ 0.
Then ‖x − (v° + cv)‖² = ‖x − v°‖² − c² < ‖x − v°‖², contradicting the fact that v° is the vector in
V closest to x.
As an example, consider the system
θ̈ + θ̇ = u.
The state is x = (θ, θ̇). Suppose the control objective is to drive the state from x(0) = (0, 0) to
x(1) = (1, 0) using minimum energy, that is, minimizing
‖u‖² = ∫₀¹ u(t)² dt.
The two endpoint constraints can be written, via the variation-of-parameters formula, as inner-product
constraints
⟨v₁, u⟩ = 1,  ⟨v₂, u⟩ = 0
for suitable functions v₁, v₂ in L2[0, 1]. The vectors v₁, v₂ are linearly independent. Let V denote their span. Since V is finite-dimensional,
it is closed. Let W denote the set of control signals driving the state to the desired point:
W = {u : ⟨v₁, u⟩ = 1, ⟨v₂, u⟩ = 0}.
If u ∈ W, then every vector of the form u + p, p ⊥ V, belongs to W. Thus the picture looks like
this:
[Figure: W drawn as an affine set parallel to V⊥.]
It follows that the optimal u lies at the intersection of V and W. So write u = c₁v₁ + c₂v₂. Substitute
this into the constraint equations and solve for c₁, c₂; the result is
u°(t) = (1/(3 − e)) (1 + e − 2eᵗ).
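One can verify this u° by simulating the plant and checking the endpoint; a sketch (Euler integration; the step count is an arbitrary choice of ours):

```python
import math

def simulate(u, T=1.0, n=100_000):
    """Integrate thetaddot + thetadot = u(t) from rest by Euler's method;
    return (theta(T), thetadot(T))."""
    dt = T / n
    theta, omega = 0.0, 0.0
    for k in range(n):
        t = k * dt
        theta += dt * omega
        omega += dt * (u(t) - omega)
    return theta, omega

e = math.e

def u_opt(t):
    # The minimum-energy control derived in the text.
    return (1 + e - 2 * math.exp(t)) / (3 - e)

theta1, omega1 = simulate(u_opt)
print(theta1, omega1)  # close to (1, 0)
```

Up to discretization error, the state indeed arrives at (θ, θ̇) = (1, 0) at time 1.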
7.3 Operators
Let X, Y be normed spaces and let T : X → Y be a linear function; function, mapping, transformation
are synonymous. We say T is bounded if
(∃b)(∀x) ‖Tx‖ ≤ b‖x‖.
The least bound b for which this inequality holds is called the norm of T , denoted kT k. It is a fact
that boundedness and continuity are equivalent.
A good example is a BIBO stable system. For example, consider the LTI system with transfer
function
G(s) = 1/(s + 1)
and let T denote the time-domain mapping from input to output. If we take X, Y both to be the
space of bounded continuous functions on the time interval [0, ∞), with the norm ‖x‖ = sup_{t≥0} |x(t)|,
then boundedness of T is precisely BIBO stability.
As a second example, consider
ẋ = Ax + Bu,  x(0) = 0.
Suppose dim x = n, dim u = m. Fix a time, say t = 1, and consider the map T from u to x(1). Let
us take the domain of T , denoted U, to be the Hilbert space L2 [0, 1], that is, m-dimensional vectors
u(t) each of whose components lives in L2 [0, 1]. The inner product on U is
⟨u, v⟩ = ∫₀¹ u(t)ᵀ v(t) dt.
The co-domain of T is Rn . It’s not hard to show that T is bounded. That it is bounded, even
though A may not be stable, is because the time interval is finite. Nothing very bad can happen in
finite time. We’ll see later what the norm of T is.
The adjoint of T, denoted T*, is the map from the co-domain to the domain defined by
⟨Tx, y⟩ = ⟨x, T*y⟩.
For the example just described, in the equation
⟨Tu, x⟩ = ⟨u, T*x⟩
the left-hand inner product is in Rⁿ and the right-hand one is in L2[0, 1]. We have
⟨Tu, x⟩ = ( ∫₀¹ e^{A(1−t)} B u(t) dt )ᵀ x = ∫₀¹ u(t)ᵀ Bᵀ e^{Aᵀ(1−t)} x dt.
Denoting T*x by v, we have
⟨u, T*x⟩ = ∫₀¹ u(t)ᵀ v(t) dt.
Comparing the two expressions, v(t) = Bᵀ e^{Aᵀ(1−t)} x; this is the adjoint.
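The adjoint formula can be checked numerically by discretizing both inner products; a sketch for the double integrator, for which e^{At} = I + At since A is nilpotent (the grid size and random seed are ours):

```python
import numpy as np

# The double integrator: e^{At} = [[1, t], [0, 1]] since A^2 = 0.
B = np.array([[0.0], [1.0]])

def expA(t):
    return np.array([[1.0, t], [0.0, 1.0]])

n = 4000
ts = (np.arange(n) + 0.5) / n      # midpoint rule on [0, 1]
dt = 1.0 / n
rng = np.random.default_rng(1)
u = rng.standard_normal(n)          # a random scalar input signal
x = rng.standard_normal(2)          # a random vector in R^2

# T u = integral of e^{A(1-t)} B u(t) dt  (a vector in R^2)
Tu = sum(expA(1 - t) @ B * u_k * dt for t, u_k in zip(ts, u)).ravel()

# (T* x)(t) = B^T e^{A^T (1-t)} x  (a function of t)
Tstar_x = np.array([(B.T @ expA(1 - t).T @ x).item() for t in ts])

lhs = Tu @ x                        # <Tu, x> in R^2
rhs = np.sum(u * Tstar_x) * dt      # <u, T*x> in L2[0, 1]
print(lhs, rhs)                     # equal up to quadrature error
```

The two inner products agree, as the defining property of the adjoint requires.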
The image of T , denoted Im T , is the set of all vectors T x as x ranges over all X . The image
is a subspace of Y, though it may not be closed. The kernel of T , denoted Ker T , is the set of all
vectors x such that T x = 0. The kernel is a closed subspace of X .
Theorem 7.3 The vector x° minimizes ‖Tx − y‖ over x ∈ X iff T*Tx° = T*y.
Proof (Necessity) Suppose x° is optimal:
(∀x) ‖Tx° − y‖ ≤ ‖Tx − y‖.
Define y° = Tx° ∈ Im T.
It is claimed that y o − y ⊥ Im T . To prove this, following the proof of the projection theorem
suppose to the contrary that there exists y1 ∈ Im T such that
ky1 k = 1, hy − y o , y1 i = c 6= 0.
Then
‖y − (y° + cy₁)‖² = ‖(y − y°) − cy₁‖² = ‖y − y°‖² − c² < ‖y − y°‖².
Since y° + cy₁ ∈ Im T, there exists x such that Tx = y° + cy₁. Thus
‖y − Tx‖ < ‖y − Tx°‖,
contradicting the optimality of x°. This proves the claim, that is,
y° − y ∈ (Im T)⊥ = Ker T*.
Hence T*y° = T*y, i.e., T*Tx° = T*y.
(Sufficiency) Assume T*Tx° = T*y, that is,
Tx° − y ∈ Ker T* = (Im T)⊥.
Then for every x, the vector Tx − Tx° lies in Im T and so is orthogonal to Tx° − y; by Pythagoras
‖Tx − y‖² = ‖Tx − Tx°‖² + ‖Tx° − y‖² ≥ ‖Tx° − y‖².
Thus x° is optimal.
Continuing with the same setup, suppose the equation T x = y is solvable. If it has more than
one solution, then it has infinitely many. Suppose we’d like a solution x of minimum norm.
Theorem 7.4 Assume the equation Tx = y has a solution and that T* has closed image. The
vector x° minimizes ‖x‖ subject to Tx = y iff x° = T*z where z is any vector such that TT*z = y.
Proof Fix one solution x̄ of Tx = y. Then any other solution has the form x̄ − x̃ where x̃ ∈ Ker T.
Thus the problem
min_{Tx=y} ‖x‖
is equivalent to min_{x̃ ∈ Ker T} ‖x̄ − x̃‖.
By the projection theorem, and since Ker T is closed, the latter minimum exists and is unique: Let
it be achieved by x̃o . Define xo = x̄ − x̃o . Also by the projection theorem, xo belongs to (Ker T )⊥ ,
and thus to Im T ∗ , since it is closed. Thus xo = T ∗ z for some z. Multiplying this equation by T
gives y = T T ∗ z.
ẋ = Ax + Bu, x(0) = 0.
As before, let T denote the mapping from u to x(1). The domain of T , denoted U, is L2 [0, 1] and
the co-domain of T is Rn . The adjoint T ∗ is the mapping Rn −→ U given by
(T*x)(t) = Bᵀ e^{Aᵀ(1−t)} x.
Let us pose the problem of finding the minimum norm u such that x(1) equals a prescribed target
vector. You’re asked to solve this in an exercise.
7.5 Problems
1. Prove that in a normed space every convergent sequence is a Cauchy sequence.
Show that the norm kxk∞ in C[0, 1] does not satisfy the parallelogram equality, and therefore
C[0, 1] with this norm is not an inner product space.
5. Consider the system with transfer function G(s) = 1/(s + 1). If the input is in L2 [0, ∞), does
that mean the output tends to zero as t −→ ∞?
6. Consider C[0, 1] with the norm kxk∞ . Define V to be the set of functions satisfying
∫₀^{1/2} x(t) dt − ∫_{1/2}^{1} x(t) dt = 1.
Show that V is closed (hence complete) and convex, but it does not have an element of
minimum norm, that is, a vector closest to 0.
ẋ = Ax + Bu, x(0) = 0.
Assume (A, B) is controllable. Find u in L2 [0, 1] of minimum norm such that x(1) = v, a
given vector.
8. Let p(t) be a polynomial with real coefficients and of degree at most 1; that is, it has the form
p(t) = c0 + c1 t with c0 , c1 real numbers. The task is to approximate the quadratic t2 over the
range 0 ≤ t ≤ 1. This is expressed by saying that
∫₀¹ [p(t) − t²]² dt
should be made as small as possible.
Then the problem is to find p in V that is closest to t2 . Solve this problem via the projection
theorem.
9. Consider the space Rn×m of n × m matrices. There is a natural inner product, namely,
hX, Y i = trace X T Y.
The trace of a square matrix is the sum of its diagonal elements. This inner product reduces
to the usual one for vectors when m = 1. Consider the optimization problem
minimize_X ‖A − BXC‖.
This can be written as
minimize_X ‖A − T(X)‖,
where T is the linear transformation given by T (X) = BXC. Derive the adjoint of T and
write down the equation for an optimal X.
10. (a) Let H2 denote the space of rational transfer functions that have real coefficients and are
strictly proper. (It’s not a Hilbert space because it is not complete, but it’s a perfectly
good inner-product space.) Examples are
1/(s + 1),  s²/(s³ + 3s² + s + 1)
but not
1,  1/(s − 1),  1/(s + j),  sin s.
Show that this is an inner product space with inner product
⟨F, G⟩ = (1/2π) ∫_{−∞}^{∞} F(−jω) G(jω) dω.
What is the corresponding norm of f (t), the inverse Laplace transform of F (s)?
(c) If you haven’t had a course in complex variables, you’ll have to look up the residue
theorem. For F, G ∈ H2 , show that hF, Gi equals the sum of the residues of F (−s)G(s)
at its poles in the left half-plane. For example, if F(s) = 1/(s + 1), then ‖F‖₂² equals the
residue of
F(−s)F(s) = 1/((−s + 1)(s + 1))
at its left half-plane pole, s = −1.
(g) Let T be the operator H2 −→ H2 that maps V to U V , U as in the preceding part. Find
the adjoint operator T ∗ . What are T T ∗ and T ∗ T ? Be careful: The adjoint of T is not the
mapping V (s) 7→ U (−s)V (s); this is because U (−s) has a pole in the right half-plane,
and therefore U (−s)V (s) is not in H2 in general, even though V is in H2 .
11. Let H∞ denote the set of functions G(s) that are analytic and bounded in the open right
half-plane. For example, if G(s) is rational, then it has no poles in the closed right half-plane
and it is proper (its numerator degree is not greater than its denominator degree). On this
space define the norm
‖G‖∞ = sup {|G(s)| : Re s > 0}.
Now let
F(s) = (s − 1)/(s + 1).
Prove that F H∞ , the set of all products F G, where G ranges over H∞ , is closed in H∞ .
minimize_x ‖c − Ax‖.
Take
A = [1 1; 2 0; 3 1],  c = (1, 1, −1).
⟨X, Y⟩ = Trace XᵀY.
Let S denote the subspace of symmetric matrices. Find S ⊥ . Given A, find B, C such that
A = B + C, B ∈ S, C ∈ S ⊥.
14. Find the distance in the Frobenius norm from a square matrix to the nearest lower triangular
matrix.
15. Einstein's summation convention is that an expression like Σₖ aᵢₖbₖⱼ is abbreviated to aᵢₖbₖⱼ.
That is, in any product expression, a repeated index (k in this case) means summation. Thus
if A and B are matrices such that the product C = AB is defined, then cᵢⱼ = aᵢₖbₖⱼ. Using
this convention, prove that the trace of AB equals the trace of BA (assuming AB and BA
are square).
16. Consider the vector space Rⁿˣⁿ with the inner product ⟨X, Y⟩ = trace XᵀY. The set of
symmetric matrices is a subspace, S. So the problem
min_{B∈S} ‖A − B‖
18. Consider an imaginary one-dimensional straight road going off to infinity in both directions.
Model the road as R. Consider a countably infinite number of cars on the road; assume each
car has been labeled with an integer number. At any particular time, the cars could be at
any particular points on the road. Let xk (t) denote the location on the road of car k at time
t and let x(t) be the infinite vector
x(t) = (. . . , x₋₁(t), x₀(t), x₁(t), . . .).
Let's fix t and drop the argument t in x(t). Thus x is a vector with an infinite number of
components, each of which could be any real number. Finally, let ℓ denote the set of all such
vectors x.
The set ℓ is a vector space. But it is not a normed space, and hence it cannot be a Hilbert
or Banach space. Make ℓ a topological vector space (you'll have to look up the definition of
this). Modify the classification-of-spaces diagram given earlier and show where ℓ fits.
Chapter 8
H2 Optimal Control
The symbol H2 stands for the space of all stable, strictly proper transfer functions, such as
1/(s + 1),  (2s − 1)/(s² + 5s + 2)
but not
1/s,  s/(s + 1).
There’s a natural inner product (dot product) and norm:
Z ∞ Z ∞ 1/2
1 1 2
< P, Q >= P (jω)Q(jω) dω, kP k2 = |P (jω)| dω .
2π −∞ 2π −∞
Here the bar denotes complex conjugate. The space extends to matrices, as we’ll see.
In this chapter we study optimal control design in this space.
8.1 Overview
This section gives an overview of the standard H2 problem. An example illustrates how to set
problems up.
Let L(R, Rn ) denote the space of all signals from R to Rn . We review the L2 -norm of a signal
u in L(R, Rn ). For each time t, u(t) is a vector in Rn ; denote its Euclidean norm by ku(t)k. The
L2-norm of u is then defined to be
‖u‖₂ = ( ∫_{−∞}^{∞} ‖u(t)‖² dt )^{1/2}.
The space L2 (R, Rn ), or just L2 (R) if convenient, consists of all signals for which this norm is finite.
For example, the norm is finite if u(t) converges to 0 exponentially as t → ∞. (Caution: kuk2 < ∞
does not imply that u(t) → 0 as t → ∞—think of a counterexample.)
Before defining a norm for a transfer matrix, we have to deal with norms for complex matrices. Let
R be a p × m complex matrix, that is, R ∈ Cp×m . There are many possible definitions for kRk; we
need one. Let R∗ denote the complex-conjugate transpose of R. The matrix R∗ R is Hermitian and
positive semidefinite. Recall that the trace of a square matrix is the sum of the entries on the main
diagonal. It is a fact that the trace also equals the sum of the eigenvalues.
The first definition for ‖R‖ (there will be another in the next chapter) is [trace(R*R)]^{1/2}.
Example
R = [2+j  j; 1−j  3−2j]
R*R = [2−j  1+j; −j  3+2j] [2+j  j; 1−j  3−2j] = [7  6+3j; 6−3j  14]
‖R‖ = (7 + 14)^{1/2} = √21
Observe in this example that if rij denotes the ijth entry in R, then
‖R‖ = ( Σᵢ,ⱼ |rᵢⱼ|² )^{1/2}.
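A quick numerical check of this example (variable names ours):

```python
import numpy as np

R = np.array([[2 + 1j, 1j],
              [1 - 1j, 3 - 2j]])

# (trace R*R)^{1/2}, with R* the conjugate transpose
n1 = np.sqrt(np.trace(R.conj().T @ R).real)
# (sum of |r_ij|^2)^{1/2}
n2 = np.sqrt((np.abs(R) ** 2).sum())
print(n1, n2, np.sqrt(21))  # all three agree
```

Both formulas give √21, confirming the trace and entrywise definitions coincide.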
H2-Norm
‖G‖₂ = ( (1/2π) ∫_{−∞}^{∞} trace [G(jω)* G(jω)] dω )^{1/2}
Note that the integrand equals the square of the first-definition norm of G(jω).
There is an important input-output fact concerning this definition. Let G be a stable, causal, LTI
system with input u of dimension m and output y of dimension p. Let ei , i = 1, . . . , m, denote
the standard basis vectors in Rm . Thus, δei is an impulse applied to the ith input; Gδei is the
corresponding output. Then the H2 -norm of the transfer matrix G is related to the average L2 -
norm of the output when impulses are applied at the input channels.
Theorem 8.1 ‖G‖₂² = Σᵢ₌₁ᵐ ‖G δeᵢ‖₂²
with A stable, that is, all eigenvalues with negative real part. This matrix notation stands for the
transfer matrix:
[A B; C D] := C(sI − A)⁻¹ B + D.
Then \(\|G\|_2 = \infty\) unless D = 0, in which case the following procedure does the job: solve the Lyapunov equation
\[ AL + LA^T + BB^T = 0 \]
for L (the controllability gramian); then \(\|G\|_2 = \left[ \operatorname{trace}(C L C^T) \right]^{1/2}\).
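A scalar illustration of this procedure (an assumed toy example, not from the notes): take G(s) = 1/(s+1), so A = −1, B = C = 1, D = 0.

```python
import math

# G(s) = 1/(s+1): A = -1, B = 1, C = 1, D = 0 (assumed example data).
A, B, C = -1.0, 1.0, 1.0

# Solve A L + L A^T + B B^T = 0 for the scalar gramian L: 2*A*L + B^2 = 0.
L = -B * B / (2 * A)                 # L = 1/2

h2_state_space = math.sqrt(C * L * C)

# Cross-check in the time domain: the impulse response is g(t) = e^{-t},
# so ||G||_2^2 = int_0^inf g(t)^2 dt = 1/2.
print(h2_state_space)                # sqrt(1/2), about 0.7071
```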
[Figure: the standard setup — generalized plant G with exogenous input w, control input u, regulated output z, and measured output y; the controller K is in feedback from y to u.]
We must define the concept of internal stability for this setup. Start with a minimal realization of
G:
\[ G(s) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}. \]
We shall assume that D22 = 0, that is, the transfer matrix from u to y is strictly proper. This is
a condition to guarantee existence of closed-loop transfer matrices. Thus the realization for G has
the form
\[ G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & 0 \end{bmatrix}. \]
Now set w = 0 and write the state equations describing the controlled system:
\begin{align*}
\dot{x} &= Ax + B_2 u \\
y &= C_2 x \\
\dot{x}_K &= A_K x_K + B_K y \\
u &= C_K x_K + D_K y.
\end{align*}
Eliminate u and y:
\[ \begin{bmatrix} \dot{x} \\ \dot{x}_K \end{bmatrix} = \begin{bmatrix} A + B_2 D_K C_2 & B_2 C_K \\ B_K C_2 & A_K \end{bmatrix} \begin{bmatrix} x \\ x_K \end{bmatrix}. \]
We call this latter matrix the closed-loop A-matrix. It can be checked that its eigenvalues do not
depend on the particular minimal realizations chosen for G and K. The closed-loop system is said
to be internally stable if this closed-loop A-matrix is stable, that is, all its eigenvalues have negative
real part. It can be proved that, given G, an internally stabilizing K exists iff (A, B2 ) is stabilizable
and (C2 , A) is detectable.
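The stability check on the closed-loop A-matrix can be carried out numerically. Here is a small Python sketch with assumed data (not from the notes): the plant P(s) = 1/s, so a = 0, b2 = c2 = 1, and a stabilizing first-order controller with state matrices aK = −1, bK = 1, cK = −1, dK = −1.

```python
import cmath

# Assumed toy data: integrator plant and a first-order controller.
a, b2, c2 = 0.0, 1.0, 1.0
aK, bK, cK, dK = -1.0, 1.0, -1.0, -1.0

# The closed-loop A-matrix [[a + b2*dK*c2, b2*cK], [bK*c2, aK]] from the text.
Acl = [[a + b2 * dK * c2, b2 * cK],
       [bK * c2, aK]]

# Eigenvalues of a 2x2 matrix from its characteristic polynomial
# s^2 - (trace)s + det.
tr = Acl[0][0] + Acl[1][1]
det = Acl[0][0] * Acl[1][1] - Acl[0][1] * Acl[1][0]
disc = cmath.sqrt(tr * tr - 4 * det)
eigs = [(tr + disc) / 2, (tr - disc) / 2]

internally_stable = all(e.real < 0 for e in eigs)
print(eigs, internally_stable)   # eigenvalues -1 +/- j, so stable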
Let Tzw denote the system from w to z, with transfer matrix Tzw (s). The H2 -optimal control
problem is to compute an internally stabilizing controller K that minimizes kTzw k2 . The following
conditions guarantee the existence of an optimal K:
(A1) (A, B2) is stabilizable and (C2, A) is detectable;
(A2) the matrices D12 and D21 have full column and row rank, respectively;
(A3) the matrices
\[ \begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}, \qquad \begin{bmatrix} A - j\omega I & B_1 \\ C_2 & D_{21} \end{bmatrix} \]
have full column rank and full row rank, respectively, for every real ω;
(A4) D11 = 0.
The first assumption is, as mentioned above, necessary and sufficient for existence of an internally
stabilizing controller. In (A2) full column rank of D12 means that the control signal u is fully
weighted in the output z. This is a sensible assumption, for if, say, some component of u is not
weighted, there is no a priori reason for the optimal controller not to try to make this component
unbounded. Dually, full row rank of D21 means that the exogenous signal w fully corrupts the mea-
sured signal y; it’s like assuming noise for each sensor. Again, this is sensible, because otherwise the
optimal controller may try to differentiate y, that is, the controller may be improper. Assumption
(A3) is merely technical—an optimal controller may exist without it. In words, the assumption says
there are no imaginary axis zeros in the cross systems from u to z and from w to y. Finally, (A4)
guarantees that kTzw k2 is finite for every internally stabilizing and strictly proper controller (recall
that Tzw must be strictly proper).
The problem is said to be regular if assumptions (A1) to (A4) are satisfied. Sometimes when
we formulate a problem they are not initially satisfied; for example, we may initially not explicitly
model sensor noise. Then we must modify the problem so that the assumptions are satisfied. This
process is called regularization.
Under these assumptions, the MATLAB commands h2syn and h2lqg compute the optimal con-
troller. These functions are part of the Robust Control Toolbox of MATLAB. The following example
illustrates the H2 design technique.
[Figure: teleoperation setup — the master Gm, driven by fh and fm, produces velocity vm; the slave Gs, driven by fe and fs, produces velocity vs; the controller K closes the loop.]
Two robots, a master, Gm , and a slave, Gs , are controlled by one controller, K. A human provides
a force command, fh , to the master, while the environment applies a force, fe , to the slave. The
controller measures the two velocities, vm and vs , together with fe via a force sensor. In turn it
provides two force commands, fm and fs , to the master and slave. Ideally, we want motion following
(vs = vm ), a desired master compliance (vm a desired function of fh ), and force reflection (fm = fe ).
For simplicity of computation we shall take Gm and Gs to be SISO with transfer functions
\[ G_m(s) = \frac{1}{s}, \qquad G_s(s) = \frac{1}{10s}. \]
We shall design K for two test inputs, namely, fe (t) is the finite-width pulse
\[ f_e(t) = \begin{cases} 10, & 0 \le t \le 0.2 \\ 0, & t > 0.2, \end{cases} \tag{8.1} \]
indicating an abrupt encounter between the slave and a stiff environment, and fh (t) is the triangular
pulse
\[ f_h(t) = \begin{cases} 2t, & 0 \le t \le 1 \\ -2t + 4, & 1 \le t \le 2 \\ 0, & t > 2. \end{cases} \tag{8.2} \]
The Laplace transforms of fe and fh are not rational:
\[ F_e(s) = \frac{10}{s}\left( 1 - e^{-0.2s} \right), \qquad F_h(s) = \frac{2}{s^2}\left( 1 - e^{-s} \right)^2. \]
To get a tractable problem, we shall use second- and third-order Padé approximations,
\[ e^{-Ts} \approx \left( 1 - \frac{Ts}{2} + \frac{(Ts)^2}{12} \right) \Big/ \left( 1 + \frac{Ts}{2} + \frac{(Ts)^2}{12} \right) \]
and
\[ e^{-Ts} \approx \left( 1 - \frac{Ts}{2} + \frac{(Ts)^2}{10} - \frac{(Ts)^3}{120} \right) \Big/ \left( 1 + \frac{Ts}{2} + \frac{(Ts)^2}{10} + \frac{(Ts)^3}{120} \right). \]
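A small Python sketch (not part of the notes) evaluating both Padé approximants on the imaginary axis, where they are used here, shows how quickly the accuracy improves with order:

```python
import cmath

def pade2(z):
    """Second-order diagonal Pade approximant of e^{-z}."""
    return (1 - z / 2 + z ** 2 / 12) / (1 + z / 2 + z ** 2 / 12)

def pade3(z):
    """Third-order diagonal Pade approximant of e^{-z}."""
    num = 1 - z / 2 + z ** 2 / 10 - z ** 3 / 120
    den = 1 + z / 2 + z ** 2 / 10 + z ** 3 / 120
    return num / den

# Compare against the exact e^{-jw} at a few sample frequencies.
for w in (0.1, 0.5, 1.0):
    exact = cmath.exp(-1j * w)
    print(w, abs(pade2(1j * w) - exact), abs(pade3(1j * w) - exact))
```

On the imaginary axis both approximants have modulus exactly one (they are allpass), so the error is purely a phase error.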
Using the third-order approximation for Fe(s) and the second-order one for Fh(s), we get
\[ F_e(s) \approx 20\left( \frac{0.2}{2} + \frac{0.2^3 s^2}{120} \right) \Big/ \left( 1 + \frac{0.2s}{2} + \frac{(0.2s)^2}{10} + \frac{(0.2s)^3}{120} \right) =: G_e(s), \]
\[ F_h(s) \approx 2 \Big/ \left( 1 + \frac{s}{2} + \frac{s^2}{12} \right)^2 =: G_h(s). \]
Incorporating these two prefilters into the preceding block diagram leads to this:
[Figure: the same setup with the prefilters incorporated — wh drives Gh to produce fh at the master, we drives Ge to produce fe at the slave; K closes the loop through fm and fs.]
The two exogenous inputs wh and we are unit impulses. The vector of exogenous inputs is therefore
\[ w = \begin{bmatrix} w_h \\ w_e \end{bmatrix}. \]
This figure compares fh(t) (dash) with the impulse response of Gh (solid):
And this figure is for fe (t) (fe (t) dash and the impulse response of Ge solid):
The error in the second plot is larger because fe(t) is not continuous. The control system is shown here in the standard setup:
[Figure: generalized plant G with exogenous input w and control input u, regulated output z and measured output y, with controller K in feedback.]
\[
\left[ \begin{array}{cccc|cccc}
A_m & 0 & 0 & B_m C_h & 0 & 0 & -B_m & 0 \\
0 & A_s & B_s C_e & 0 & 0 & 0 & 0 & -B_s \\
0 & 0 & A_e & 0 & 0 & B_e & 0 & 0 \\
0 & 0 & 0 & A_h & B_h & 0 & 0 & 0 \\
\hline
\alpha_v C_m & -\alpha_v C_s & 0 & 0 & 0 & 0 & 0 & 0 \\
-\alpha_c C_m & 0 & 0 & \alpha_c C_h & 0 & 0 & 0 & 0 \\
0 & 0 & -\alpha_f C_e & 0 & 0 & 0 & \alpha_f I & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha_s I \\
\hline
0 & 0 & C_e & 0 & 0 & 0 & 0 & 0 \\
0 & C_s & 0 & 0 & 0 & 0 & 0 & 0 \\
C_m & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array} \right]. \tag{8.3}
\]
For the data at hand, D21 = 0, so (A2) fails. Evidently, the condition D21 = 0 reflects the fact that no sensor noise was modelled, that is, perfect measurements of vm, vs, fe were assumed. Let us add sensor noises, say of magnitude ε. Then w is augmented to a 5-vector and the state matrices of G change appropriately so that the realization becomes
\[ \begin{bmatrix} A & 0 & B_1 & B_2 \\ C_1 & 0 & 0 & D_{12} \\ C_2 & \varepsilon I & 0 & 0 \end{bmatrix}. \]
Some trial-and-error is required to get suitable values for the weights; the values used in the program below (αv = 10, αc = 5, αf = 10, αs = 0.01) give reasonable responses.
The MATLAB function h2syn can be used to compute the optimal controller. The next figure
shows plots of vs (t) and vm (t) when the system is commanded by fh (t) (also shown) (vs (solid), vm
(dash), and fh (dot)):
%
% Program by Dan Davison
%
% Program summary:
%
% (1) - find state-space model of G and regularize it
% (2) - use H2SYN (in mu-tools) to find optimal K
% - controller is stored in AK,BK,CK,DK
% (3) - simulate response to two types of inputs
clear
%
% (1) - SETUP STATE-SPACE MODEL FOR G
%
numG_m = [1];
denG_m = [1 0];
[A_m,B_m,C_m,D_m] = tf2ss(numG_m,denG_m);
numG_s = [1];
denG_s = [10 0];
[A_s,B_s,C_s,D_s] = tf2ss(numG_s,denG_s);
numG_h = [2];
temp = [1/12 1/2 1];
denG_h = conv(temp,temp);
[A_h,B_h,C_h,D_h] = tf2ss(numG_h,denG_h);
[n_m,m_m]=size(B_m);
[n_s,m_s]=size(B_s);
[n_e,m_e]=size(B_e);
[n_h,m_h]=size(B_h);
[p_m,n_m]=size(C_m);
[p_s,n_s]=size(C_s);
[p_e,n_e]=size(C_e);
[p_h,n_h]=size(C_h);
B2 = [ -B_m zeros(n_m,m_s)
zeros(n_s,m_m) -B_s
zeros(n_e,m_m) zeros(n_e,m_s)
zeros(n_h,m_m) zeros(n_h,m_s)];
B = [B1 B2];
% weights on, resp, v_m - v_s, f_h - v_m, f_m - f_e, f_s
w_v = 10;
w_z = 5;
w_f = 10;
w_s = .01;
weight = diag([w_v w_z w_f w_s]);
C1 = weight*C1;
C = [C1;C2];
D11 = zeros(4,5);
D12 = [0 0
0 0
1 0
0 1];
D12 = weight*D12;
D21= [epsilon 0 0 0 0
0 epsilon 0 0 0
0 0 epsilon 0 0];
D22= zeros(3,2);
% run h2syn
plant=pck(A,B,C,D);
[kk,gg,kfi,gfi,hamx,hamy]=h2syn(plant,3,2,2);
[AK,BK,CK,DK]=unpck(kk);
[nk,mk]=size(BK);
[pk,nk]=size(CK);
CK1 = CK(1,1:nk);
CK2 = CK(2,1:nk);
BK1 = BK(1:nk,1);
BK2 = BK(1:nk,2);
BK3 = BK(1:nk,3);
%
DM = [0 0; 0 0; 1 0; 0 0];
Tmax = 10;
delT = .01;
T1=0:delT:Tmax;
fe = zeros(length(T1),1);
fh = zeros(length(T1),1);
for i=1:length(T1)*1/Tmax
t = (i-1)*Tmax/length(T1);
fh(i) = 2*t;
end
for i=length(T1)*1/Tmax+1:length(T1)*2/Tmax
t = (i-1)*Tmax/length(T1);
fh(i) = -2*t+4;
end
Tmax = 10;
delT = .01;
T2=0:delT:Tmax;
fe = zeros(length(T2),1);
fh = zeros(length(T2),1);
for i=1:length(T2)*.2/Tmax
t = (i-1)*Tmax/length(T2);
fe(i) = 10;
end
After that tutorial on the use of H2 optimal control, we return to the theory.
The equation
\[ A^T X + XA + M = 0 \]
is called a Lyapunov equation. Here A, M, X are all square matrices, say n × n, with M symmetric.
One situation is where A and M are given and the equation is to be solved for X. Existence and uniqueness are easy to establish in principle. Define the linear map
\[ L(X) := A^T X + XA \]
on n × n matrices, so that the Lyapunov equation reads L(X) = −M. Then the Lyapunov equation has a solution X iff M ∈ Im L; if this condition holds, the solution is unique iff L is one-to-one, hence invertible. Let σ(·) denote the set of eigenvalues—the spectrum—of a matrix or linear transformation. It can be shown that
\[ \sigma(L) = \{ \lambda_i + \lambda_j : \lambda_i, \lambda_j \in \sigma(A) \}. \]
So the Lyapunov equation has a unique solution iff A has the property that no two of its eigenvalues
add to zero. For example, if A is stable, the unique solution is
\[ X = \int_0^\infty e^{A^T t} M e^{At} \, dt. \]
This can be proved as follows. Let \( P(t) = e^{A^T t} M e^{At} \). Then
\[ \dot{P}(t) = A^T P(t) + P(t) A. \]
Integrate from t = 0 to ∞: since A is stable, P(t) → 0 as t → ∞, so the left side integrates to −M and we get −M = A^T X + XA, that is, X satisfies the Lyapunov equation.
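The integral formula is easy to verify numerically. A scalar Python sketch with assumed data (A = −1, M = 2, neither from the notes):

```python
import math

# Assumed scalar data: A = -1 (stable), M = 2.
A, M = -1.0, 2.0

# X = int_0^inf e^{A^T t} M e^{A t} dt, by a crude left Riemann sum.
dt, T = 1e-4, 30.0
X = sum(math.exp(A * t) * M * math.exp(A * t) * dt
        for t in (k * dt for k in range(int(T / dt))))

residual = A * X + X * A + M   # should vanish: A^T X + X A + M = 0
print(X, residual)             # X near 1, residual near 0
```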
We’ll be more interested in another situation—where we want to infer stability of A.
Theorem 8.2 Suppose A, M , X satisfy the Lyapunov equation, (M, A) is detectable, and M and
X are positive semi-definite. Then A is stable.
Proof For a proof by contradiction, suppose A has some eigenvalue λ with Re λ ≥ 0. Let x be
a corresponding eigenvector. Pre-multiply the Lyapunov equation by x∗ , the complex-conjugate
transpose, and post-multiply by x to get
\[ (2\operatorname{Re}\lambda)\, x^* X x + x^* M x = 0. \]
Both terms on the left are ≥ 0. Hence \( x^* M x = 0 \), which implies that Mx = 0 since M ≥ 0. Thus
\[ \begin{bmatrix} A - \lambda I \\ M \end{bmatrix} x = 0. \]
By detectability we must have x = 0, a contradiction.
\[ J^{-1} H J = -JHJ = -H^T. \]
Notice that X is then uniquely determined by H, i.e. H 7→ X is a function. We shall denote this
function by Ric and write X = Ric(H).
To recap, Ric is a (nonlinear) function \( \mathbb{R}^{2n \times 2n} \to \mathbb{R}^{n \times n} \) which maps H to X, where
\[ \mathcal{X}_-(H) = \operatorname{Im} \begin{bmatrix} I \\ X \end{bmatrix}. \]
The domain of Ric, denoted dom Ric, consists of Hamiltonian matrices H with two properties,
namely, H has no eigenvalues on the imaginary axis and the two subspaces
\[ \mathcal{X}_-(H), \qquad \operatorname{Im} \begin{bmatrix} 0 \\ I \end{bmatrix} \]
are complementary.
Some properties of X are given below.
(i) X is symmetric;
(ii) X satisfies the algebraic Riccati equation
\[ A^T X + XA - XPX + Q = 0; \]
(iii) A − PX is stable.
To prove this, note that there exists a stable matrix \( H_- \) in \( \mathbb{R}^{n \times n} \) such that
\[ H \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} H_-. \]
Pre-multiply by \( \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^T J \) to get
\[ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^T J H \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^T J \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} H_-. \tag{8.5} \]
Now JH is symmetric; hence so is the left-hand side of (8.5); hence so is the right:
\[ (-X_1^T X_2 + X_2^T X_1) H_- = H_-^T (-X_1^T X_2 + X_2^T X_1)^T = -H_-^T (-X_1^T X_2 + X_2^T X_1). \]
This is a Lyapunov equation in \( -X_1^T X_2 + X_2^T X_1 \); since \( H_- \) is stable, its unique solution is zero:
\[ -X_1^T X_2 + X_2^T X_1 = 0. \]
Post-multiply by \( X_1^{-1} \):
\[ H \begin{bmatrix} I \\ X \end{bmatrix} = \begin{bmatrix} I \\ X \end{bmatrix} X_1 H_- X_1^{-1}. \tag{8.6} \]
Now pre-multiply by \( \begin{bmatrix} X & -I \end{bmatrix} \):
\[ \begin{bmatrix} X & -I \end{bmatrix} H \begin{bmatrix} I \\ X \end{bmatrix} = 0. \]
\[ A - PX = X_1 H_- X_1^{-1}. \]
The following result gives verifiable conditions under which H belongs to dom Ric.
Theorem 8.3 Suppose
\[ H = \begin{bmatrix} A & -BR^{-1}B^T \\ -Q & -A^T \end{bmatrix} \]
with Q ≥ 0, R > 0, (A, B) stabilizable, and (Q, A) detectable. Then H ∈ dom Ric. Let X = Ric(H) and \( F = -R^{-1}B^T X \). Then X ≥ 0 and A + BF is stable. Finally, if (Q, A) is observable, then X > 0.
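Before the proof, here is a scalar Python sketch of Ric in action (assumed data A = B = Q = R = 1, not from the theorem statement): X is read off from the stable eigenvector of H.

```python
import math

# Assumed scalar data: A = 1, B = 1, Q = 1, R = 1, so H = [[1, -1], [-1, -1]].
A, B, Q, R = 1.0, 1.0, 1.0, 1.0
H = [[A, -B * B / R], [-Q, -A]]

# Eigenvalues of H from s^2 - tr*s + det; here tr = 0, det = -2, s = +/- sqrt(2).
tr = H[0][0] + H[1][1]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
lam = (tr - math.sqrt(tr * tr - 4 * det)) / 2    # stable eigenvalue, -sqrt(2)

# Stable eigenvector [x1, x2] with x1 = 1; row 1 of (H - lam I) v = 0 gives x2.
x1 = 1.0
x2 = (H[0][0] - lam) * x1 / -H[0][1]
X = x2 / x1                                      # X = Ric(H) = X2 X1^{-1}

F = -B * X / R
print(X, A + B * F)    # X = 1 + sqrt(2); A + BF = -sqrt(2), stable
```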
Re-arrange and take inner products; it follows that \( B^T z = 0 \) and \( Qx = 0 \). Thus
\[ z^* \begin{bmatrix} A - j\omega I & B \end{bmatrix} = 0 \]
and
\[ \begin{bmatrix} A - j\omega I \\ Q \end{bmatrix} x = 0. \]
By stabilizability and detectability it follows that x = z = 0, a contradiction.
Next, we’ll show that
\[ \mathcal{X}_-(H), \qquad \operatorname{Im} \begin{bmatrix} 0 \\ I \end{bmatrix} \]
are complementary. This requires a preliminary step. As in the proof of Lemma 8.1 bring in \( X_1, X_2, H_- \) so that
\[ \mathcal{X}_-(H) = \operatorname{Im} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \]
\[ H \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} H_-. \tag{8.11} \]
We want to show that \( X_1 \) is nonsingular, i.e. Ker \( X_1 \) = 0. First, it is claimed that Ker \( X_1 \) is \( H_- \)-invariant. To prove this, let \( x \in \operatorname{Ker} X_1 \). Pre-multiply (8.11) by \( \begin{bmatrix} I & 0 \end{bmatrix} \) to get
\[ A X_1 - B R^{-1} B^T X_2 = X_1 H_-. \tag{8.12} \]
Pre-multiply by \( x^T X_2^T \), post-multiply by x, and use the fact that \( X_2^T X_1 \) is symmetric (see (8.4)) to get
\[ x^T X_2^T B R^{-1} B^T X_2 x = 0. \]
Hence \( B^T X_2 x = 0 \), and then (8.12) gives \( X_1 H_- x = 0 \), that is, \( H_- x \in \operatorname{Ker} X_1 \); this proves the claim. Now suppose Ker \( X_1 \neq 0 \). Then \( H_- \) restricted to Ker \( X_1 \) has an eigenvalue λ and a corresponding eigenvector x:
\[ H_- x = \lambda x, \tag{8.13} \]
\[ \operatorname{Re} \lambda < 0, \quad 0 \neq x \in \operatorname{Ker} X_1. \]
Pre-multiply (8.11) by \( \begin{bmatrix} 0 & I \end{bmatrix} \):
\[ -Q X_1 - A^T X_2 = X_2 H_-. \tag{8.14} \]
Apply this to x and use (8.13):
\[ (A^T + \lambda I) X_2 x = 0. \]
Together with \( B^T X_2 x = 0 \) this gives
\[ x^* X_2^T \begin{bmatrix} A + \lambda I & B \end{bmatrix} = 0. \]
Since Re(−λ) > 0 and (A, B) is stabilizable, it follows that \( X_2 x = 0 \). But then x ≠ 0 lies in the kernel of both \( X_1 \) and \( X_2 \), contradicting the fact that \( [X_1;\, X_2] \) has full column rank. Therefore \( X_1 \) is nonsingular.
\[ A^T X + XA - XBR^{-1}B^T X + Q = 0, \]
or equivalently
\[ (A + BF)^T X + X(A + BF) + XBR^{-1}B^T X + Q = 0. \]
Thus
\[ X = \int_0^\infty e^{(A+BF)^T t} \left( XBR^{-1}B^T X + Q \right) e^{(A+BF)t} \, dt. \tag{8.15} \]
\[ Q e^{(A+BF)t} x = 0, \quad \forall t \ge 0. \]
But this implies that x belongs to the unobservable subspace of (Q, A) and so x = 0.
We turn now to the LQR problem: consider the plant
\[ \dot{x} = Ax + Bu, \quad x(0) = x_0, \]
with cost
\[ J = \int_0^\infty \left[ x(t)^T Q x(t) + u(t)^T R u(t) \right] dt, \]
where Q ≥ 0, R > 0, (A, B) is stabilizable, and (Q, A) is detectable. Define
\[ H = \begin{bmatrix} A & -BR^{-1}B^T \\ -Q & -A^T \end{bmatrix}, \qquad X = \operatorname{Ric}(H), \qquad F = -R^{-1}B^T X. \]
By Theorem 8.3, X is well-defined, X ≥ 0, and A + BF is stable. The associated Riccati equation is
\[ A^T X + XA - XBR^{-1}B^T X + Q = 0. \]
Theorem 8.4 The control signal that minimizes J is u = Fx; it is the unique optimal control, and for this control signal \( J = x_0^T X x_0 \).
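The theorem is easy to test numerically in the scalar case. A Python sketch with the assumed data A = B = Q = R = 1 (so X = 1 + √2 and A + BF = −√2) simulates the closed loop and accumulates the cost:

```python
import math

# Assumed scalar data: A = B = Q = R = 1, hence X = 1 + sqrt(2), F = -X.
X = 1 + math.sqrt(2)
F = -X
x0 = 3.0

# With u = F x the closed loop is xdot = (1 + F) x = -sqrt(2) x, so
# x(t) = x0 e^{-sqrt(2) t}; accumulate J = int (x^2 + u^2) dt by Riemann sum.
dt, T = 1e-4, 20.0
J = 0.0
for k in range(int(T / dt)):
    x = x0 * math.exp((1 + F) * k * dt)
    J += (x * x + (F * x) ** 2) * dt

print(J, X * x0 ** 2)   # the two numbers agree: J = x0^T X x0
```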
The proof needs a lemma. Let us denote by L2 the class of signals that are square-integrable on the time interval [0, ∞).
Lemma If J < ∞, then x(t) → 0 as t → ∞.
Proof Assume J < ∞. Then u ∈ L2 because R > 0. Let \( C := Q^{1/2} \) and y := Cx. Then y ∈ L2
too. By detectability, there exists K such that A + KC is stable. A standard observer to estimate
x is
\[ \dot{\hat{x}} = A\hat{x} + Bu + K(C\hat{x} - y) = (A + KC)\hat{x} + Bu - Ky. \]
Since A + KC is stable and u, y ∈ L2, it follows that x̂(t) → 0. By observer theory, x̂(t) − x(t) → 0. Thus x(t) → 0.
Proof of Theorem The proof is a trick using the completion of a square. Let u be an arbitrary
control input for which J is finite. We shall differentiate the quadratic form x(t)T Xx(t) along the
solution of the plant equation. To simplify notation, we suppress dependence on t. We have
\begin{align*}
\frac{d}{dt}\left( x^T X x \right) &= \dot{x}^T X x + x^T X \dot{x} \\
&= (Ax + Bu)^T X x + x^T X (Ax + Bu) \\
&= x^T (A^T X + XA) x + 2u^T B^T X x \\
&= x^T (XBR^{-1}B^T X - Q) x + 2u^T B^T X x \quad \text{(from the Riccati equation)} \\
&= -x^T Q x + x^T XBR^{-1}B^T X x + 2u^T B^T X x + (u^T R u - u^T R u) \quad \text{(the completion-of-squares trick)} \\
&= -x^T Q x - u^T R u + \| R^{-1/2} B^T X x + R^{1/2} u \|^2.
\end{align*}
Rearranging terms we have
\[ x^T Q x + u^T R u = -\frac{d}{dt}\left( x^T X x \right) + \| R^{-1/2} B^T X x + R^{1/2} u \|^2. \]
Now integrate from t = 0 to t = ∞ and use the lemma:
\[ J = x_0^T X x_0 + \int_0^\infty \| R^{-1/2} B^T X x + R^{1/2} u \|^2 \, dt. \]
Thus J is minimized iff \( R^{-1/2} B^T X x + R^{1/2} u \equiv 0 \), i.e., u = Fx. The other conclusion follows.
The LQR solution provides a very convenient way to stabilize an LTI plant. Given A, B, select
Q, R with Q ≥ 0, (Q, A) detectable, and R > 0. Then the optimal F stabilizes A + BF . This is the
preferred method over pole assignment.
The LQR solution is rarely implementable as it stands, because it requires that there be a sensor
for each state variable; that is, x must be fully sensed. We look next at the generalization of the
LQR problem to the more general case where x is not fully sensed.
Moreover, the minimum cost is
\[ \|G_c B_1\|_2^2 + \|F_2 G_f\|_2^2. \]
The first term in the minimum cost, \( \|G_c B_1\|_2^2 \), is associated with optimal control with state feedback and the second, \( \|F_2 G_f\|_2^2 \), with optimal filtering. These two norms can easily be computed
as follows:
u = F2 x̂.
The matrix F2 is the optimal feedback gain were x directly measured; L2 is the optimal filter gain;
x̂ is the optimal estimate of x.
The proof involves optimality in an inner-product space; so projection theory applies. Let H2
denote the (Hardy) space of transfer matrices P (s) that are stable and strictly proper. This has a
natural inner product,
\[ \langle P, Q \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{trace}\left[ P(j\omega)^* Q(j\omega) \right] d\omega, \]
which is consistent with our norm definition: \( \|P\|_2^2 = \langle P, P \rangle \). Likewise, let \( H_2^\perp \) denote the space of transfer matrices that are antistable (all poles in Re s > 0) and strictly proper. Same inner product, same norm. Then \( H_2 \) and \( H_2^\perp \) are orthogonal spaces:
\[ P \in H_2, \; Q \in H_2^\perp \implies \langle P, Q \rangle = 0. \]
The sum \( H_2 \oplus H_2^\perp \) consists of all transfer matrices that are strictly proper and have no poles on the imaginary axis.
Finally, let us introduce the notation \( P^\sim(s) := P(-s)^T \). A stable matrix P(s) is said to be allpass if \( P^\sim P = I \). For example,
\[ \frac{s-1}{s+1} \]
is an allpass function.
Proof of Theorem 8.5 Let K be any proper, stabilizing controller. Start with the system
equations
\begin{align*}
\dot{x} &= Ax + B_1 w + B_2 u \\
z &= C_1 x + D_{12} u
\end{align*}
and define a new control variable, \( v := u - F_2 x \). The equations become
\begin{align*}
\dot{x} &= A_{F_2} x + B_1 w + B_2 v \\
z &= C_{1F_2} x + D_{12} v
\end{align*}
or in the frequency domain
\[ Z = G_c B_1 W + U V. \]
Hence
\[ T_{zw} = G_c B_1 + U T_{vw}. \]
You will prove in Problem 5 the following fact: U is allpass and \( U^\sim G_c \) belongs to \( H_2^\perp \). This implies that \( G_c B_1 \) and \( U T_{vw} \) are orthogonal in \( H_2 \) (\( T_{vw} \) belongs to \( H_2 \) by internal stability). So from the previous equation
\[ \|T_{zw}\|_2^2 = \|G_c B_1\|_2^2 + \|T_{vw}\|_2^2. \]
[Figure: the modified setup with v as the control input and K in feedback.]
Note that K stabilizes G iff K stabilizes the above system (the two closed-loop systems have identical
A-matrices). So
and therefore the theorem will be proved once we show the following: For the setup in the previous
block diagram, the unique optimal controller is
\[ \begin{bmatrix} A + B_2 F_2 + L_2 C_2 & -L_2 \\ F_2 & 0 \end{bmatrix} \]
and the minimum value of kTvw k2 equals kF2 Gf k2 . Notice in this setup that A + B2 F2 is stable.
By the assignment C1 ← −F2 , the previous statement becomes this: For
\[ G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}. \]
\begin{align*}
\dot{x} &= Ax + B_1 w + B_2 u \\
z &= C_1 x + D_{12} u \\
y &= C_2 x + w \\
\dot{\hat{x}} &= (A + B_2 F_2 - B_1 C_2)\hat{x} + B_1 y \\
u &= F_2 \hat{x},
\end{align*}
so, with \( e := \hat{x} - x \),
\[ \dot{e} = (A - B_1 C_2) e. \]
It’s now easy to infer internal stability from stability of A + B2 F2 and A − B1 C2 . For zero initial
conditions on x, x̂, we have e(t) ≡ 0. Hence
\[ u = F_2 \hat{x} = F_2 x. \tag{8.16} \]
Recall from the orthogonality argument that, for every internally stabilizing controller,
\[ \|T_{zw}\|_2 \ge \|G_c B_1\|_2. \]
But for the present controller, (8.16) implies that v ≡ 0, i.e., Tvw = 0. Thus the present controller
is optimal and the minimum cost is kGc B1 k2 . Finally, for uniqueness it can be shown (an exercise)
that the unique solution of Tvw = 0 is the controller above.
8.7 Problems
1. Take G(s) = 1/(s + 1) and compute the H2-norm \( \|G\|_2 \) by the three methods: time-domain, state-space, and the residue theorem.
3. Consider
ẋ = Ax + Bu, x(0) = x0
with A stable. True or false: For every u in L2 [0, ∞), x(t) tends to 0 as t tends to ∞.
4. Suppose u and y are scalar-valued signals and the transfer function from u to y is 1/s2 . For
the standard canonical realization (A, B, C) consider the optimization problem
\[ \min_{u = Fx} \int_0^\infty \left[ \rho\, y(t)^2 + u(t)^2 \right] dt, \]
7. You know that right half-plane zeros place definite performance limitations on the control of
a system. This exercise illustrates this fact in the present context.
Consider the system
\[ \dot{x} = Ax + Bu, \quad x(0) = x_0, \qquad z = Cx. \]
Then
If A is stable, we might like to see how small we can make kZk2 by suitable choice of stable
U (s). In particular, we might like to know if kZk2 can be made arbitrarily small.
Let
\[ C(sI - A)^{-1} B = \frac{s-1}{(s+2)(s+3)}. \]
Compute
8. This problem concerns optimization in the space R2 with respect to three norms:
9. [Figure: unity feedback loop — fh enters a summing junction, the plant P(s) produces the output v, and K(s) is in the feedback path.]
Both P(s) and K(s) are SISO transfer functions. The plant is P(s) = 1/s and the human force input fh is as follows:
[Figure: fh(t) is a triangular pulse that rises to 2 and returns to zero; the time axis is marked at t = 1, 2, 3.]
The output is a velocity v. It is desired to design a proper transfer function K(s) to achieve
internal stability of the feedback system and minimize the compliance error kfh − vk2 . Set
this up as a problem in H2 optimal control.
10. This problem relates to the LQR problem and whether or not J < ∞ implies u ∈ L2 . Show
that it is true if R is positive definite. Hint: You have to show that if R1/2 u is in L2 , then u
is too.
11. Consider
\[ A = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}. \]
Regard A as the linear map \( \mathbb{R}^2 \to \mathbb{R}^2 \) defined by \( x \mapsto Ax \). Let \( \mathcal{V} \) be one of the two 1-dimensional invariant subspaces and let V be the linear map \( \mathcal{V} \to \mathbb{R}^2 \) given by \( x \mapsto x \). Find the linear map \( A_1 : \mathcal{V} \to \mathcal{V} \) that satisfies the equation \( VA_1 = AV \). This map is called the restriction of A to \( \mathcal{V} \).
12. Take the LQR problem with A = B = Q = R = 1. Form the Hamiltonian matrix H given in Theorem 8.3. Find its invariant subspaces. Are there any of the form
\[ \operatorname{Im} \begin{bmatrix} I \\ P \end{bmatrix}? \]
(We saw that the LQR problem reduces to looking for an invariant subspace of this form.)
Let λ1 , λ2 denote the eigenvalues of A + BF for the optimal F . Of course, the two eigenvalues
have to be in the left half-plane and be complex conjugates if not real. Are they otherwise
freely assignable by choice of q1 , q2 , r?
Chapter 9
H∞ Optimal Control
The symbol H∞ stands for the space of all stable, proper transfer functions, such as
\[ \frac{1}{s+1}, \qquad \frac{2s-1}{s^2+5s+2}, \qquad \frac{s}{s+1}, \]
but not
\[ \frac{1}{s}. \]
There’s a natural norm, namely,
\[ \|G\|_\infty = \sup_\omega |G(j\omega)|, \]
but no inner product. Thus H∞ is a Banach space. The space extends to matrices, as we’ll see.
In this chapter we study optimal control design in this space. This chapter begins with a tutorial
overview, followed by some of the underlying theory.
9.1 Overview
Let R be a complex p × m matrix. The singular values of R are defined as the square roots of
the eigenvalues of R∗ R. The maximum singular value of R, denoted σmax (R), has the properties
required of a norm and is our second definition for kRk.
Example
The singular values of
\[ R = \begin{bmatrix} 2+j & j \\ 1-j & 3-2j \end{bmatrix} \]
equal 4.2505 and 1.7128. These are computed via the function svd in MATLAB. Thus \( \|R\| = 4.2505 \).
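For a 2×2 matrix the singular values can also be found by hand, as the square roots of the eigenvalues of R*R. A Python sketch (the notes use MATLAB's svd; this hand-rolled version is only for checking):

```python
import math

# The example matrix again.
R = [[2 + 1j, 1j],
     [1 - 1j, 3 - 2j]]

# S = R* R (conjugate transpose times R): Hermitian, positive semidefinite.
Rs = [[R[0][0].conjugate(), R[1][0].conjugate()],
      [R[0][1].conjugate(), R[1][1].conjugate()]]
S = [[sum(Rs[i][k] * R[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Eigenvalues of the 2x2 Hermitian S from its (real) trace and determinant.
tr = (S[0][0] + S[1][1]).real
det = (S[0][0] * S[1][1] - S[0][1] * S[1][0]).real
disc = math.sqrt(tr * tr - 4 * det)
sigmas = sorted(math.sqrt(l) for l in ((tr + disc) / 2, (tr - disc) / 2))

print(sigmas)   # approximately [1.7128, 4.2505]
```

Note that the squares of the singular values sum to trace(R*R) = 21, tying this norm to the first definition.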
The importance of this second definition of matrix norm is derived from the following fact. Let
u ∈ Cm and let y = Ru, so y ∈ Cp . The fact is that
\[ \sigma_{\max}(R) = \max\{ \|y\| : \|u\| = 1 \}. \]
This has the interpretation that if we think of R as a system with input u and output y, then
σmax (R) equals the system’s gain, that is, maximum output norm over all inputs of unit norm.
Now we can define the H∞-norm of a stable p × m transfer matrix G(s):
\[ \|G\|_\infty = \sup_\omega \sigma_{\max}\left[ G(j\omega) \right]. \]
So here we used the second-definition norm of G(jω). If G(s) is scalar-valued, its norm equals the peak magnitude on the Bode plot.
Concerning this definition is an important input-output fact. Let G be a stable, causal, LTI system with input u of dimension m and output y of dimension p. The H∞-norm of the transfer matrix G is related to the maximum L2-norm of the output over all inputs of unit norm.
Thus the major distinction between kGk2 and kGk∞ is that the former is an average system gain
for known inputs, while the latter is a worst-case system gain for unknown inputs.
It is useful to be able to compute kGk∞ by state-space methods. Let
\[ G(s) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}, \]
with A stable, that is, all eigenvalues with negative real part. The computation of \( \|G\|_\infty \) using state-space methods involves a Hamiltonian matrix H built from A, B, C, D and a positive number γ. The matrices \( \gamma^2 I - DD^T \) and \( \gamma^2 I - D^T D \) that enter H are invertible provided they are positive definite, equivalently, γ² is greater than the largest eigenvalue of \( DD^T \) (or \( D^T D \)), equivalently, γ > σmax(D).
Theorem 9.2 Let γmax denote the maximum γ such that H has an eigenvalue on the imaginary
axis. Then kGk∞ = max{σmax (D), γmax }.
The theorem suggests the following procedure: Plot, versus γ, the distance from the imaginary
axis to the nearest eigenvalue of H; then γmax equals the maximum γ for which the distance equals
zero; then kGk∞ = max{σmax (D), γmax }. A more efficient procedure is to compute γmax by a
bisection search.
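The bisection idea can be sketched in Python. The particular Hamiltonian below is an assumption (the text's H is not reproduced here); for D = 0 a commonly used form is H(γ) = [[A, BB^T/γ], [−C^TC/γ, −A^T]], and the assumed example G(s) = 1/(s+1) has ‖G‖∞ = 1:

```python
import cmath

# Assumed scalar data: G(s) = 1/(s+1), so A = -1, B = C = 1, D = 0,
# and the Bode plot peaks at 1 (at w = 0).
A, B, C = -1.0, 1.0, 1.0

def has_imag_axis_eig(gamma, tol=1e-9):
    """Does [[A, BB'/g], [-C'C/g, -A']] have an imaginary-axis eigenvalue?"""
    H = [[A, B * B / gamma], [-C * C / gamma, -A]]
    tr = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    d = cmath.sqrt(tr * tr - 4 * det)
    return any(abs(((tr + s * d) / 2).real) < tol for s in (1, -1))

# Bisect for gamma_max, the largest gamma giving an imaginary-axis eigenvalue.
lo, hi = 0.1, 10.0       # assumes ||G||_inf lies in this bracket
for _ in range(60):
    mid = (lo + hi) / 2
    if has_imag_axis_eig(mid):
        lo = mid
    else:
        hi = mid

print(lo)    # converges to 1.0 = ||G||_inf
```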
The H∞ -optimal control problem is to compute an internally stabilizing controller K that mini-
mizes kTzw k∞ for the standard setup. This problem is much harder than the H2 problem. Instead of
seeking a controller that actually minimizes kTzw k∞ , a simpler problem is to search for a controller
that gives kTzw k∞ < γ, where γ is a pre-specified parameter. If γ is too small, a controller will not
exist, so we need a test for existence. With this, the following procedure leads to a controller that
is close to optimal:
1. Start with a large enough γ so that a controller exists.
2. Test existence for smaller and smaller values of γ until eventually γ is close to the minimum
γ for existence.
Example
The next figure shows a single-loop analog feedback system.
[Figure: single-loop analog feedback system — w1 enters a summing junction and the loop contains the antialiasing filter F, the controller K, and the plant P; w2 is a second exogenous input injected into the loop; z1 is the tracking error e weighted by W, and z2 is the control signal weighted by ε2.]
The plant is P and the controller K; F is an antialiasing filter for future digital implementation of
the controller (it is a good idea to include F at the start of the analog design so that there are no
surprises later due to additional phase lag). The basic control specification is to get good tracking
over a certain frequency range, say [0, ω1 ]; that is, to make the magnitude of the transfer function
from w1 to e small over this frequency range. The weighted tracking error is z1 in the figure, where
the weight W is selected to be a lowpass filter with bandwidth ω1 . We could attempt to minimize
the H∞ -norm from w1 to z1 , but this problem is not regular. To regularize it, another input, w2 ,
is added and another signal, z2, is penalized. The two weights ε1 and ε2 are small positive scalars.
The design problem is to minimize the H∞-norm from
\[ w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} \quad \text{to} \quad z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}. \]
The preceding figure can then be converted to the standard block diagram by stacking the states
of P , F , and W to form the state of G.
The plant transfer function is taken to be
\[ P(s) = \frac{20 - s}{(s + 0.01)(20 + s)}. \]
This can be regarded as an approximation of the time-delay system \( \frac{1}{s}e^{-0.1s} \), an integrator cascaded with a time delay of 0.1 time units: the allpass factor (20 − s)/(20 + s) is the first-order Padé approximation of \( e^{-0.1s} \). With a view toward subsequent digital control with sampling
period h = 0.5, the filter F is taken to have bandwidth π/0.5, the Nyquist frequency ωN :
\[ F(s) = \frac{1}{(0.5/\pi)s + 1}. \]
The weight W is then taken to have bandwidth one-fifth the Nyquist frequency:
\[ W(s) = \left( \frac{1}{(2.5/\pi)s + 1} \right)^2. \]
[Figure: Bode magnitude plots on log-log axes, magnitudes from 10^{-4} to 10^{1} against frequencies from 10^{-3} to 10^{2} rad/s.]
The solid curve is the Bode magnitude plot of the sensitivity function, that is, the transfer function
from w1 to e, namely, 1/(1 + P KF ). Also shown are the magnitude plots for W (dash) and F (dot).
Evidently, the design has achieved some tracking error attenuation over the bandwidth of W. A greater degree of attenuation could be achieved by tuning the weights W, ε1, and ε2.
% input data
clear
% parameters
h=0.021;
z=20;
[AP,BP,CP,DP]=tf2ss([-1 z],conv([1 .01],[1 z]));
[AF,BF,CF,DF]=tf2ss(1,[0.5/pi 1]);
numW=1;
eps1=0.01;
eps2=0.01;
% build G
% design
[nK,mK]=size(BK);
AS=[AP BP*CK zeros(nP,nF);zeros(nK,nP) AK BK*CF;-BF*CP zeros(nF,nK) AF];
BS=[0*BP;0*BK;BF];
CS=[-CP 0*CK 0*CF];
DS=1;
% discretize K
[AK,BK]=c2d(AK,BK,h);
% stability check
Btmp=[0*BP;BF];
Ctmp=[0*CP CF];
[Atmp,Btmp]=c2d(Atmp,Btmp,h);
Abar=[Atmp Btmp*CK;-BK*Ctmp AK];
max(abs(eig(Abar)));
% analysis
w=logspace(-3,2,200);
j=sqrt(-1);
p=freqrc(AP,BP,CP,DP,w);
f=freqrc(AF,BF,CF,DF,w);
k=dfreqrc(AK,BK,CK,DK,w,h);
r=(1-exp(-j*h*w))./(j*h*w);
tmp=ones(1,length(w))./(1+p.*r.*k.*f);
magS2=abs(tmp);
[magS1,ph]=bode(AS,BS,CS,DS,1,w);
[magW,ph]=bode(AW,BW,CW,DW,1,w);
[magF,ph]=bode(AF,BF,CF,DF,1,w);
loglog(w,magS1,w,magS2)
%loglog(w,magS1,w,magW,w,magF)
We want to design a controller C(s) so that the feedback loop is stable and also has some measure
of stability robustness. A good way to do this is to require the Nyquist plot of P C to stay outside
the circle centred at −1 and radius, say, 0.2. Let S denote the transfer function from r to e (known
as the sensitivity function). It turns out that \( \|S\|_\infty^{-1} \) equals the distance in the complex plane from the critical point −1 to the closest point on the Nyquist plot of PC. (Reference: ECE356 course notes.) Thus stability robustness is equivalent to the inequality
\[ |S(j\omega)| \le 5, \quad \forall \omega. \]
Suppose that, in addition to stability robustness, we want to design the controller C(s) so that
the system tracks signals r(t) up to, say, 1 rad/s. Thus we want, say,
\[ |S(j\omega)| \le 0.1, \quad \forall \omega \le 1. \]
This allows at most 10 percent tracking error for sinusoidal reference signals. Therefore we want
the magnitude Bode plot of S to lie under the dashed line:
[Figure: the spec boundary for |S(jω)| — the dashed line sits at 0.1 for ω up to 1 rad/s and at 5 beyond.]
To handle these two specs together it is convenient to construct a weighting function W (s) such
that |W (jω)| ≈ 10 over the frequency range [0, 1] and |W (jω)| ≈ 0.2 over the frequency range
[1, ∞). Then the two specs become one: kW Sk∞ ≤ 1. For computational reasons, we want W (s) to
be rational. So its magnitude can’t be discontinuous, and we need some transition from magnitude
10 to magnitude 0.2. To keep things simple, let’s try the weighting function
\[ W(s) = \gamma\, \frac{\alpha s + 1}{\beta s + 1}. \]
To recap, we have arrived at the problem of designing a controller C(s) that stabilizes P (s) and
achieves the inequality kW Sk∞ ≤ 1, where
\[ W(s) = 10\, \frac{0.574 s + 1}{28.7 s + 1}. \]
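A quick Python check (not from the notes) confirms that this first-order weight matches the two target levels asymptotically: gain 10 at low frequency and 10 × 0.574/28.7 = 0.2 at high frequency, with a gradual transition between the corner frequencies in between.

```python
# W(s) = 10(0.574s + 1)/(28.7s + 1), evaluated on the imaginary axis.
def W(s):
    return 10 * (0.574 * s + 1) / (28.7 * s + 1)

print(abs(W(0j)))      # low-frequency gain: 10
print(abs(W(1e6j)))    # high-frequency gain: 10 * 0.574/28.7 = 0.2
print(abs(W(1j)))      # partway through the transition at w = 1 rad/s
```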
The problem may not be solvable. That is, there may not be any stabilizing controller such that
kW Sk∞ ≤ 1. If so, we have to compromise somehow; relax either the tracking error or the stability
margin or both. This is a great feature of this way of doing control design: We can sensibly make
tradeoffs.
The obvious problem at hand is to minimize kW Sk∞ over all C(s) that stabilize P (s). If this
minimum is less than or equal to 1, our specifications are feasible. However,
W
WS =
1 + PC
is a nonlinear function of C(s), and moreover C(s) is constrained to stabilize. We need to change
the optimization parameter. Notice that the P (s) in our example is strictly proper and belongs to
H∞ .
Lemma 9.1 A proper rational controller C(s) stabilizes a strictly proper P ∈ H∞ iff it has the
form
\[ C = \frac{Q}{1 - PQ}, \qquad Q \in H_\infty. \]
Proof (Necessity) Suppose C stabilizes. Let Q equal the transfer function from r to u:
\[ Q = \frac{C}{1 + PC}. \]
Solve for C to get C = Q/(1 − P Q).
(Sufficiency) Suppose C is given by the formula in the lemma. Then all closed-loop transfer
functions belong to H∞ . For example, the transfer function from r to y equals
\[ \frac{PC}{1 + PC} = PQ. \]
Also, the sensitivity function S equals 1−P Q. And so on for all other closed-loop transfer functions.
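The algebra of the lemma is easy to verify numerically. A Python sketch with assumed data (P(s) = 1/(s+1), which is strictly proper and stable, and the constant Q(s) = 0.5 in H∞) checks that the loop sensitivity 1/(1 + PC) indeed equals 1 − PQ pointwise:

```python
# Assumed example: P(s) = 1/(s+1), Q(s) = 0.5 (both in H_inf).
def P(s):
    return 1 / (s + 1)

def Q(s):
    return 0.5

def C(s):
    """The controller C = Q/(1 - PQ) from the lemma."""
    return Q(s) / (1 - P(s) * Q(s))

# The sensitivity S = 1/(1 + PC) should equal 1 - PQ at every point.
for s in (1j, 0.3 + 2j, 5j):
    print(abs(1 / (1 + P(s) * C(s)) - (1 - P(s) * Q(s))))   # essentially 0
```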
The lemma changes the problem of minimizing kW Sk∞ over C to the problem
\[ \min_{Q \in H_\infty} \| W(1 - PQ) \|_\infty. \]
Let L∞ (jR) denote the space of proper transfer functions that have no poles on the imaginary axis.
Then F := W P1−1 belongs to L∞ (jR), while W P2 Q belongs to H∞ . Thus the minimum of kW Sk∞
over all stabilizing controllers seems to be very close to the distance from F to H∞ . The gap arises
from the fact that the set
{W P2 Q : Q ∈ H∞ }
is a proper subset of H∞ because W P2 is strictly proper. That is, if X is the function in H∞ that
is closest to F and if we get Q from W P2 Q = X, then Q may not be proper. This can be rectified
by a high-frequency correction.
That is, we want to find an X in H∞ that is closest to R in the infinity norm. There are two
ways to view this problem: 1) R is unstable and X has to be stable, while both are causal; 2) R is
noncausal and X has to be causal, while both are stable.
The second way turns out to be more useful. To view the problem in this way, we have to
suppose R and X are two-sided Laplace transforms, e.g.,
\[ R(s) = \int_{-\infty}^{\infty} r(t) e^{-st} \, dt. \]
The region of convergence is taken to include the imaginary axis, so that the underlying system is
stable. Thus for R(s) the ROC must be Re s < 1. Therefore the time-domain equation giving rise
to R(s) must be
\[ y(t) = \int_{-\infty}^{\infty} r(t - \tau) u(\tau) \, d\tau. \]
\[ \Lambda_r : u \mapsto r * u, \quad L_2(\mathbb{R}) \longrightarrow L_2(\mathbb{R}), \]
called the Laurent operator derived from r, is equivalent to the frequency-domain operator
\[ U \mapsto RU : L_2(j\mathbb{R}) \longrightarrow L_2(j\mathbb{R}) \]
in the sense that they have equal induced norms, since the Fourier transform is norm-preserving, by Theorem ??. The norm of the latter operator equals \( \|R\|_\infty = 1 \).
Likewise for X: For X(s) the ROC must include the imaginary axis. The time-domain equation
giving rise to X(s) must be
\[ y(t) = \int_{-\infty}^{\infty} x(t - \tau) u(\tau) \, d\tau, \]
and the Laurent operator
\[ \Lambda_x : u \mapsto x * u, \quad L_2(\mathbb{R}) \longrightarrow L_2(\mathbb{R}), \]
is equivalent to
\[ U \mapsto XU : L_2(j\mathbb{R}) \longrightarrow L_2(j\mathbb{R}). \]
Now take an input u supported on [0, ∞), apply Λr, and keep only the part of the output supported on (−∞, 0]. This operator, L2[0, ∞) → L2(−∞, 0], is called the Hankel operator derived from r, denoted Γr. It maps the future into the past. On the other hand, since the system with transfer function X is causal, its Hankel operator Γx equals 0.
Notice that a Hankel operator is a piece of a Laurent operator. Thus \( \|\Lambda_r\| \ge \|\Gamma_r\| \). Notice also that
\[ \|R - X\|_\infty = \|\Lambda_r - \Lambda_x\|. \]
Our original problem was to minimize \( \|R - X\|_\infty \). We’ve seen that a lower bound for this norm is \( \|\Gamma_r\| \). However, Nehari’s theorem says the lower bound is tight:
Theorem 9.3 The distance from R in L∞ (jR) to H∞ equals kΓr k. Moreover, the distance is
achieved (there is an optimal X).
So it remains to compute the norm kΓr k and then to compute the optimal X.
Suppose
\[ R(s) = C(sI - A)^{-1} B, \]
where A is antistable (all eigenvalues in Re s > 0) and A is n × n. Such R belongs to L∞(jR). The inverse two-sided Laplace transform of R(s) is
\[ r(t) = \begin{cases} -C e^{At} B, & t < 0 \\ 0, & t \ge 0. \end{cases} \]
The Hankel operator Γr maps a function u in L2[0, ∞) to the function y in L2(−∞, 0] defined by
\[ y(t) = \int_0^\infty r(t - \tau) u(\tau) \, d\tau, \quad t < 0, \]
that is,
\[ y(t) = -C e^{At} \int_0^\infty e^{-A\tau} B u(\tau) \, d\tau, \quad t < 0. \]
Define \( \Psi_c : L_2[0, \infty) \to \mathbb{C}^n \) and \( \Psi_o : \mathbb{C}^n \to L_2(-\infty, 0] \) by
\[ \Psi_c u = \int_0^\infty e^{-A\tau} B u(\tau) \, d\tau, \qquad (\Psi_o x)(t) = -C e^{At} x, \quad t < 0. \]
Then
\[ \Gamma_r = \Psi_o \Psi_c. \]
[Diagram: the Hankel operator Γr : L2[0, ∞) → L2(−∞, 0] factors through \( \mathbb{C}^n \), via Ψc followed by Ψo.]
Since kΓr k = kΨo Ψc k, it remains to compute the latter norm.
The self-adjoint operators Ψc Ψc* and Ψo* Ψo map C^n to itself. Thus they have matrix representations with respect to the standard basis on C^n. Define the controllability and observability gramians

Lc := ∫_0^∞ e^{−At} B B^T e^{−A^T t} dt    (9.1)

Lo := ∫_0^∞ e^{−A^T t} C^T C e^{−At} dt    (9.2)
It is routine to show that Lc and Lo are the unique solutions of the Lyapunov equations

A Lc + Lc A^T = B B^T    (9.3)

A^T Lo + Lo A = C^T C    (9.4)
We state without proof this fact: the norm of Ψo Ψc equals the square root of the norm of (Ψo Ψc)*(Ψo Ψc), and this latter norm in turn equals the largest eigenvalue of the matrix Lc Lo. Thus ‖Γr‖ = √(λmax(Lc Lo)).
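This computation is easy to carry out numerically. The sketch below (assuming only NumPy; solving the Lyapunov equations by vectorization with Kronecker products is one standard approach) uses the scalar data A = 2, B = 1, C = −1.47 from the example in this chapter:

```python
import numpy as np

def lyap(A, Q):
    # Solve A X + X A^T = Q via vectorization:
    # vec(A X + X A^T) = (I kron A + A kron I) vec(X), column-stacked vec.
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(M, Q.flatten('F')).reshape(n, n, order='F')

def hankel_norm(A, B, C):
    Lc = lyap(A, B @ B.T)        # controllability gramian, eq. (9.3)
    Lo = lyap(A.T, C.T @ C)      # observability gramian, eq. (9.4)
    return np.sqrt(np.linalg.eigvals(Lc @ Lo).real.max())

# Scalar data from the example in this chapter: A = 2, B = 1, C = -1.47.
A = np.array([[2.0]]); B = np.array([[1.0]]); C = np.array([[-1.47]])
g = hankel_norm(A, B, C)
```

For these data the solver returns Lc = 1/4 and Lo ≈ 0.540, so g ≈ 0.368; the same function works unchanged for matrix-valued antistable A.
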
Example Let’s complete the example from the first section. We have

F(s) = 10 (0.574s + 1)/(28.7s + 1) · (1 + 0.5s)/(1 − 0.5s) = (a function in H∞) + 0.736/(1 − 0.5s),
and so F equals a function in H∞ plus R, where

R(s) = −1.47/(s − 2).
A state model for R(s) is

A = 2, B = 1, C = −1.47.

From the Lyapunov equations (9.3) and (9.4),

Lc = 1/4, Lo = 0.541.
Thus

‖Γr‖ = √(Lc Lo) = 0.368.
Thus the distance from F to H∞ equals 0.368. Our design specs are therefore easily feasible.
We omit the construction of X and a controller that meets the specs. MATLAB has tools to
design controllers based on the approach in this chapter.
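One way to corroborate this number is to discretize Γr directly and take the largest singular value of the resulting matrix. A sketch (the grid spacing h and truncation length T are arbitrary choices):

```python
import numpy as np

# R(s) = -1.47/(s - 2): A = 2, B = 1, C = -1.47, so r(t) = 1.47 e^{2t} for t < 0.
h, T = 0.01, 8.0
tau = np.arange(0.0, T, h) + h / 2        # input grid on [0, T)   (the future)
t = -tau                                  # output grid on (-T, 0] (the past)

# y(t_i) ~ sum_j r(t_i - tau_j) u(tau_j) h, so the discretized operator is:
K = 1.47 * np.exp(2.0 * (t[:, None] - tau[None, :])) * h

gamma = np.linalg.svd(K, compute_uv=False)[0]
```

Because the kernel factors as e^{2t} · e^{−2τ}, the operator is rank one, and the computed largest singular value is 1.47/4 ≈ 0.3675, agreeing with √(Lc Lo) up to rounding.
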
9.5 Problems
1. Let U(s) = s/(s + 1). Suppose G ∈ H∞ and we want to approximate it by UV for some V ∈ H∞, that is, we want to minimize ‖G − UV‖∞. In general we can’t take V = G/U because G/U has a pole at s = 0 unless it’s cancelled by a zero of G, so G/U is not in H∞. Thus in general the error norm ‖G − UV‖∞ can be made only arbitrarily small, and not zero, by suitable choice of V.
(a) Write in proper logic notation (using ∀ and ∃ where appropriate) the mathematical statement of this: “Let G belong to H∞. Then the norm ‖G − UV‖∞ can be made arbitrarily small by suitable choice of V in H∞.” In this statement U is given and fixed and should not be quantified.
(b) Write in proper logic notation the negation of your logic statement in part (a).
(c) Convert the preceding logic statement into a natural sounding sentence or sentences in
words.
(d) Write in proper logic notation the mathematical statement of this: “In general, ‖G − UV‖∞ cannot be made equal to zero.”
2. In the scalar-valued case prove that RL2 equals the set of all real-rational functions that are
strictly proper and have no poles on the imaginary axis.
3. Show that Ψc is surjective if (A, B) is controllable and that Ψo is injective if (C, A) is observable.
4. Show that the adjoints of Ψc and Ψo are given by

Ψc* : C^n → L2[0, ∞),  (Ψc* x)(t) = −B^T e^{−A^T t} x,  t ≥ 0,

Ψo* : L2(−∞, 0] → C^n,  Ψo* y = ∫_{−∞}^0 e^{A^T t} C^T y(t) dt.
5. Prove that the matrix representations of Ψc Ψ∗c and Ψ∗o Ψo are Lc and Lo respectively.
Epilogue
So where do we stand now in 2010? Let’s review and try to draw some conclusions.
1. The three classical topics (the calculus of variations, the maximum principle, and dynamic programming) were included for historical interest.
The brachistochrone problem is beautiful, isn’t it? Find a curve that optimizes a scalar
quantity, the time to slide down.
The maximum principle is very general and can accommodate many kinds of constraints. However, for many problems the necessary condition it provides does not, I think, yield a practical solution. Try a problem harder than time-optimal control of the double integrator. For example, try a cart-pendulum system with the problem of swinging up the pendulum in minimum time. The state space is R^4, and so the switching set is a 3D hypersurface. It’s hard to compute this hypersurface, and then how are you going to implement the controller?
Dynamic programming is indeed very powerful, and the HJB equation has played, and continues to play, an important role in optimal control.
2. I love the function space method. The reason is that it seems perfectly suited to systems
theory. A system is a function that maps an input to an output, that is, a system is a
mapping from one set to another. So right from the get-go one is into block diagrams, spaces
of signals, and operators. This is, in my view, the cleanest and clearest way to formulate a
problem. The subsystems may have differential equation models, but those are just special
ways of modeling maps.