
Filtering and Identification

Lecture 3: Stochastic Least Squares and Applications

Michel Verhaegen and Jan-Willem van Wingerden

Delft Center for Systems and Control, Delft University of Technology
The 3 key problems in sc4040

• Estimate linear static models from data:

  min_x ‖y − Fx‖²_W

  Relevant in itself, and it introduces the numerical foundations of this course!

• Estimate the state of an LTI state-space model:

  x(k+1) = Ax(k) + Bu(k) + Ke(k),   e(k) ∼ (0, R_e)
  y(k)   = Cx(k) + Du(k) + e(k)

  given the model (A, B, C, D) and K, and input-output data.

• "Identify" the state-space model matrices (A, B, C, D, K) from
  input-output data {u(k), y(k)}_{k=1}^{N}.


Overview

• Main point of Lecture 2
• The stochastic least squares problem
• Updating least squares estimates: the RLS scheme
• A constrained least squares problem


Main Point of Lecture 2
The Gauss-Markov Theorem
Given F ∈ R^{N×n}, L ∈ R^{N×N}, y ∈ R^N, consider the WLS problem:

  min_x εᵀε   subject to:  y = Fx + Lε,   ε ∼ (0, I)

with x deterministic and L invertible. With W = (LLᵀ)⁻¹,

  argmin_{x̃ = My} E[(x − x̃)(x − x̃)ᵀ] = (FᵀWF)⁻¹FᵀWy = argmin_{x̃ = My} εᵀε
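The theorem can be checked numerically. A minimal sketch (dimensions, seed and noise level are made-up illustration values) that forms the WLS estimate x̂ = (FᵀWF)⁻¹FᵀWy with the Gauss-Markov optimal weight W = (LLᵀ)⁻¹:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 20, 3                                # hypothetical problem sizes
F = rng.standard_normal((N, n))             # regressor matrix
x_true = np.array([1.0, -1.0, 0.1])         # deterministic unknown
L = 0.1 * np.eye(N)                         # noise shaping: y = F x + L eps
y = F @ x_true + L @ rng.standard_normal(N)

W = np.linalg.inv(L @ L.T)                  # optimal weight per Gauss-Markov
# solve the normal equations F^T W F x = F^T W y
x_hat = np.linalg.solve(F.T @ W @ F, F.T @ W @ y)
```

With L a multiple of the identity, as here, the weighting reduces to ordinary least squares up to a scalar; it only changes the estimate when LLᵀ is not a multiple of I.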




Extension: x a Random Variable (RV)

[Figure: distribution of the room temperature T at two time instants; legend: x = current unknown T, x̄ = mean value, ellipse = 1-σ uncertainty ellipsoid.]

Statistical "prior" information on x:

  x ∼ (x̄, P),   P ≥ 0

[In the Gaussian setting: f_x ∝ e^{−(1/2)(x − x̄)ᵀ P⁻¹ (x − x̄)}]

Statistical meaning:

  E[x] = x̄,   E[(x − x̄)(x − x̄)ᵀ] = P

New measurement about x:

  y = Fx + Lε,   ε ∼ (0, I),   E[(x − x̄)εᵀ] = 0

How to combine these two pieces of information?


Laboratory Demo 1: Adaptive Optics



Fusion of Wavefront Sensor Data

[Figure: frozen wavefront Φ(x, y) at time instant k, with measured slopes ∂Φ(x, y)/∂x and ∂Φ(x, y)/∂y over the (x, y) aperture.]

At time instant k we have the slope measurements (with Δx the grid spacing):

  ∂Φ(x_i, y_j)/∂x = (1/Δx) [1  −1] [Φ(x_{i+1}, y_j); Φ(x_i, y_j)] + L′ε

for i, j = 1, …, N. These can be stored into the (LSQ) form:

  y(k+1) = FΦ + Lε(k+1),   ε(k+1) ∼ (0, I)

Prior info: knowledge of the nature of the turbulence gives models for

  E[ΦΦᵀ] = C_Φ

How to use C_Φ in the solution of the LSQ?


The Stochastic Least Squares (SLS) Problem
Given the prior on the RV x ∼ (x̄, P) with P ≥ 0, and given the observations

  y = Fx + Lε,   ε ∼ (0, I),   E[(x − x̄)εᵀ] = 0

with F, L deterministic and L square and invertible, seek among the linear estimators

  x̃ = [M  N] [y; x̄]

the unbiased minimum-variance estimate, i.e. the one for which

  E[(x − x̃)(x − x̃)ᵀ]  is minimized and  E[x̃] = x̄.


Solution to the SLS Problem
Theorem (SLS): Let the conditions of the SLS problem hold and let W = (LLᵀ)⁻¹. Then the solution x̂ to SLS is

  x̂ = Ky + (I − KF)x̄

with K = PFᵀ(FPFᵀ + W⁻¹)⁻¹ and covariance matrix

  E[(x − x̂)(x − x̂)ᵀ] = P − PFᵀ(FPFᵀ + W⁻¹)⁻¹FP

If P > 0, we can rewrite K as (P⁻¹ + FᵀWF)⁻¹FᵀW and the covariance matrix as

  E[(x − x̂)(x − x̂)ᵀ] = (P⁻¹ + FᵀWF)⁻¹
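Both forms of the theorem can be evaluated numerically as a sanity check. A minimal sketch with made-up dimensions and an arbitrary positive-definite prior (all values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 3, 10
P = np.diag([2.0, 1.0, 0.5])                # prior covariance, P > 0
x_bar = np.zeros(n)                         # prior mean
F = rng.standard_normal((N, n))
W = np.eye(N)                               # W = (L L^T)^{-1} with L = I

# gain and posterior covariance in the first ("covariance") form
S = F @ P @ F.T + np.linalg.inv(W)          # innovation-type matrix F P F^T + W^{-1}
K = P @ F.T @ np.linalg.inv(S)
P_post = P - P @ F.T @ np.linalg.inv(S) @ F @ P

# the estimate for one realization of x ~ (x_bar, P) and y = F x + eps
x = x_bar + np.linalg.cholesky(P) @ rng.standard_normal(n)
y = F @ x + rng.standard_normal(N)
x_hat = K @ y + (np.eye(n) - K @ F) @ x_bar
```

Since P > 0 here, the information form K = (P⁻¹ + FᵀWF)⁻¹FᵀW and posterior covariance (P⁻¹ + FᵀWF)⁻¹ give the same result, which `np.allclose` confirms.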


Sketch of the Proof of the SLS Solution
1. The error of a linear estimator:

   x̃ = MFx + MLε + Nx̄   ⇒   x − x̃ = (I − MF)x − MLε − Nx̄

2. Unbiased estimator:

   E[x − x̃] = (I − MF)x̄ − Nx̄ = 0   ⇒   MF + N = I
   ⇒   x − x̃ = (I − MF)(x − x̄) − MLε

3. Covariance matrix (with W⁻¹ = LLᵀ):

   E[(x − x̃)(x − x̃)ᵀ]
   = (I − MF)E[(x − x̄)(x − x̄)ᵀ](I − MF)ᵀ + MLE[εεᵀ]LᵀMᵀ
   = (I − MF)P(I − MF)ᵀ + MW⁻¹Mᵀ


Minimizing the Covariance Matrix

  E[(x − x̃)(x − x̃)ᵀ] = (I − MF)P(I − MF)ᵀ + MW⁻¹Mᵀ
                      = [I  −M] Q [I; −Mᵀ],   Q = [P  PFᵀ; FP  FPFᵀ + W⁻¹]

Using Lemma 2.3 (p. 19), we can factorize Q as

  Q = [I  PFᵀ(FPFᵀ + W⁻¹)⁻¹; 0  I] · [P − PFᵀ(FPFᵀ + W⁻¹)⁻¹FP  0; 0  FPFᵀ + W⁻¹] · [•]ᵀ

Therefore,

  E[(x − x̃)(x − x̃)ᵀ] = (P − PFᵀ(FPFᵀ + W⁻¹)⁻¹FP)
                        + (PFᵀ(FPFᵀ + W⁻¹)⁻¹ − M)(FPFᵀ + W⁻¹)(•)ᵀ


Solution SLS: First Part of the Theorem
The minimum-variance unbiased estimator is given by

  x̂ = My + (I − MF) x̄,   with N = I − MF,

and the optimal M given by

  M = PFᵀ(FPFᵀ + W⁻¹)⁻¹ = K

The minimal covariance matrix is

  E[(x − x̂)(x − x̂)ᵀ] = P − PFᵀ(FPFᵀ + W⁻¹)⁻¹FP


Solution SLS: Second Part of the Theorem
Using the matrix inversion lemma,

  (A + BCD)⁻¹ = A⁻¹ − A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹,

we can write K as (provided P is invertible!):

  K = PFᵀ(W⁻¹ + FPFᵀ)⁻¹
    = PFᵀ(W − WF(P⁻¹ + FᵀWF)⁻¹FᵀW)
    = P(I_n − FᵀWF(P⁻¹ + FᵀWF)⁻¹)FᵀW
    = P((P⁻¹ + FᵀWF) − FᵀWF)(P⁻¹ + FᵀWF)⁻¹FᵀW
    = (P⁻¹ + FᵀWF)⁻¹FᵀW

The two expressions for K are equivalent on paper. What about inside the computer?


Numerical Example
Consider the stochastic least squares problem with

  x̄ = 0,   P = 10⁷ I₃,   L = 10⁻⁶ I₃,   xᵀ = [1  −1  0.1]

The matrices P and L indicate that the prior information is inaccurate while the measurement y of x is very accurate. The matrix F is generated by the Matlab command:

  F = gallery('randsvd', 3);

The sample average of the relative error ‖x̂ − x‖₂ / ‖x‖₂ over 100 trials is:

  1. K = PFᵀ(FPFᵀ + W⁻¹)⁻¹:   0.0483
  2. K = (P⁻¹ + FᵀWF)⁻¹FᵀW:   0.1163
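Matlab's `gallery('randsvd', 3)` returns a random 3×3 matrix with a large condition number. A rough Python stand-in for the experiment is sketched below; the `randsvd` helper, the condition number, the seed and the noise realization are all assumptions, so the averages will not match the slide's 0.0483 / 0.1163 exactly, but the experiment illustrates the same comparison:

```python
import numpy as np

def randsvd(n, cond=1e7, rng=None):
    """Random n x n matrix with prescribed 2-norm condition number
    (rough stand-in for Matlab's gallery('randsvd', n))."""
    rng = rng or np.random.default_rng()
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = np.logspace(0, -np.log10(cond), n)   # geometrically spaced singular values
    return U @ np.diag(s) @ V.T

rng = np.random.default_rng(0)
x_true = np.array([1.0, -1.0, 0.1])
x_bar = np.zeros(3)
P = 1e7 * np.eye(3)                 # vague prior
Winv = 1e-12 * np.eye(3)            # W^{-1} = L L^T with L = 1e-6 I
W = 1e12 * np.eye(3)

errs = np.zeros(2)
for _ in range(100):
    F = randsvd(3, rng=rng)
    y = F @ x_true + 1e-6 * rng.standard_normal(3)
    K_cov = P @ F.T @ np.linalg.inv(F @ P @ F.T + Winv)
    K_inf = np.linalg.inv(np.linalg.inv(P) + F.T @ W @ F) @ F.T @ W
    for i, K in enumerate((K_cov, K_inf)):
        x_hat = K @ y + (np.eye(3) - K @ F) @ x_bar
        errs[i] += np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
errs /= 100   # average relative error for each gain formula
```

The information form must invert P⁻¹ + FᵀWF, whose conditioning is roughly the square of that of F, which is where its accuracy is lost.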


Recall the Second Part of Theorem SLS

Theorem: Let the conditions of the SLS problem hold, let W = (LLᵀ)⁻¹ and P > 0. Then the solution x̂ to SLS is

  x̂ = Ky + (I − KF)x̄

with K = (P⁻¹ + FᵀWF)⁻¹FᵀW and covariance matrix

  E[(x − x̂)(x − x̂)ᵀ] = (P⁻¹ + FᵀWF)⁻¹


Interpretation of the Results of Theorem SLS

1. E[(x − x̂)(x − x̂)ᵀ] = (P⁻¹ + FᵀWF)⁻¹ is the covariance matrix that results from fusing the two estimates

   x ∼ (x̄, P)   and   x ∼ ((FᵀWF)⁻¹FᵀWy, (FᵀWF)⁻¹)

2. x̂ = Ky + (I − KF)x̄ with K = (P⁻¹ + FᵀWF)⁻¹FᵀW shows that for P⁻¹ → 0 we get x̂ = x̂_WLS, no matter what x̄ is.

3. It lays the foundation for Recursive Least Squares (RLS):

   Prior: x ∼ (x̂_{k−1}, P_{k−1}), with x̂_{k−1} the UMVE
   New data: y_k = F_k x + L_k ε_k,   ε_k ∼ (0, I)
   → SLS →  a posteriori: x ∼ (x̂_k, P_k)




The RLS Algorithm
Given an initial estimate of the unknown RV x:

  x ∼ (x̂₁, P₁)

  For k = 1, 2, …:
      read data (y_k, F_k, W_k)
      K_k     = (P_k⁻¹ + F_kᵀ W_k F_k)⁻¹ F_kᵀ W_k
      x̂_{k+1} = (I − K_k F_k) x̂_k + K_k y_k
      P_{k+1} = (P_k⁻¹ + F_kᵀ W_k F_k)⁻¹
  end

Caution: this (information-matrix) version is not a tractable implementation from a numerical point of view.
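The loop above can be sketched directly in Python. The data model and dimensions are made up for illustration, and the step deliberately mirrors the information-matrix form of the slide (a serious implementation would propagate P_k in covariance or square-root form instead of inverting matrices every step):

```python
import numpy as np

def rls_step(x_hat, P, F_k, y_k, W_k):
    """One RLS update in the information-matrix form of the slide."""
    info = np.linalg.inv(P) + F_k.T @ W_k @ F_k      # P_{k+1}^{-1}
    K = np.linalg.solve(info, F_k.T @ W_k)           # K_k
    x_next = (np.eye(len(x_hat)) - K @ F_k) @ x_hat + K @ y_k
    return x_next, np.linalg.inv(info)

# demo: estimate a constant 2-vector from a stream of scalar measurements
rng = np.random.default_rng(0)
x_true = np.array([1.0, -0.5])
x_hat, P = np.zeros(2), 100.0 * np.eye(2)            # vague initial prior
for k in range(200):
    F_k = rng.standard_normal((1, 2))
    y_k = F_k @ x_true + 0.01 * rng.standard_normal(1)
    x_hat, P = rls_step(x_hat, P, F_k, y_k, np.eye(1))
```

After 200 updates the estimate has converged close to x_true and the posterior covariance P has shrunk accordingly.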


RLS_demo.m






A Constrained Least Squares Problem

[Figure: frozen wavefront Φ(x, y) at time instant k with slope measurements ∂Φ/∂x, ∂Φ/∂y; the fragmented wavefront is reconstructed segment by segment as Φ̂(x, y).]

The reconstruction of Φ was formulated as a least squares problem (LSQ):

  min_Φ εᵀε   s.t.   y = FΦ + Lε

with prior knowledge E[ΦΦᵀ] = C_Φ.

The LSQ is split into independent, small least squares problems that are solved in a distributed way!

Key question: how to connect these independent solutions together?



Example: Two-Segment LSQ
Consider the two-segment least squares problem (data y(x) sampled on [0, 5]):

For x ∈ [0, 2):   y(x) = a₁ + b₁x + ε₁
For x ∈ [2, 5):   y(x) = a₂ + b₂(x − 2) + ε₂

Independent LSQ problems:

For segment 1:

  min_{a₁,b₁} ‖ [y(0); y(0.4); …; y(1.6)] − [1  0; 1  0.4; …; 1  1.6] [a₁; b₁] ‖₂²

For segment 2:

  min_{a₂,b₂} ‖ y₂ − F₂ [a₂; b₂] ‖₂²

Problem: the segments are no longer connected!


Connecting the Two Segments
The segments are connected provided

  a₁ + 2b₁ = a₂   ⇒   0 = a₁ + 2b₁ − a₂

This is an LSQ with equality constraints:

  min_x ‖ [y₁; y₂] − [F₁  0; 0  F₂] x ‖₂²   s.t.   Hx = 0



Least Squares Problems with Equality Constraints

  min εᵀε   s.t.   y = Fx + Lε,   ε ∼ (0, I),   0 = Hx

How to tackle this problem? (Engineering approach:) Treat the equality constraints as very accurate measurements (!):

  0 = Hx + L′µ,   µ ∼ (0, I)

for L′ "very small", e.g. L′ = (1/α)I with α large:

  0 = Hx + (1/α)µ,   equivalently   0 = αHx + µ

We end up with the augmented LSQ:

  min_x ‖ε‖₂² + ‖µ‖₂²   s.t.   [y; 0] = [F; αH] x + [ε; µ]
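The two-segment example can then be solved with a single least squares call. A minimal sketch (data, noise level and α are made-up illustration values) for a piecewise-linear fit with continuity enforced at x = 2:

```python
import numpy as np

rng = np.random.default_rng(0)
# noisy samples of a piecewise-linear function that is continuous at x = 2
x1 = np.arange(0.0, 2.0, 0.4)
x2 = np.arange(2.0, 5.0, 0.5)
y1 = 1.0 + 0.5 * x1 + 0.05 * rng.standard_normal(len(x1))
y2 = 2.0 + 1.0 * (x2 - 2.0) + 0.05 * rng.standard_normal(len(x2))

# block-diagonal regressor for theta = [a1, b1, a2, b2]
F1 = np.column_stack([np.ones_like(x1), x1])
F2 = np.column_stack([np.ones_like(x2), x2 - 2.0])
F = np.block([[F1, np.zeros_like(F1)], [np.zeros_like(F2), F2]])
y = np.concatenate([y1, y2])

# continuity constraint a1 + 2 b1 - a2 = 0 as a heavily weighted extra row
H = np.array([[1.0, 2.0, -1.0, 0.0]])
alpha = 1e6
F_aug = np.vstack([F, alpha * H])
y_aug = np.concatenate([y, [0.0]])
theta, *_ = np.linalg.lstsq(F_aug, y_aug, rcond=None)   # [a1, b1, a2, b2]
```

As α grows, the solution approaches the exactly constrained one; α should not be made so large that F_aug becomes numerically rank-deficient (here its condition number is about 10⁶, which double precision handles comfortably).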


splinedemo.m and splinedemo2.m



Preparation and Next Lecture
Preparation:
  Study Chapter 4 (Sections 4.5.3-4.5.5)
  Download Homework 2

Next lecture:
  Lecture 4: Kalman Filtering
  Wednesday November 30, 2016, 15:30-17:30

Keep your eyes on "Guidelines/Rules/Schedule sc4040: 2016-2017" for the correct deadlines!
