Background
Environmental models are increasing in size and complexity.
In many nonlinear problems (e.g. climate, atmospheric, oceanographic analysis,
subsurface transport, etc.) small-scale variability can have large-scale consequences.
This creates a need to resolve a large range of time and space scales (fine grids, extensive
coverage).
Data sets are also increasing in size and diversity (new in situ and remote sensing instruments,
better communications, etc.).
Need for automated methods to merge model predictions and measurements: data
assimilation.
Goal is to provide accurate descriptions of environmental conditions -- past, present, and future.
Important example: numerical weather prediction
Data Assimilation as an Optimization Problem
Basic objective is to obtain a physically consistent estimate of uncertain environmental variables
-- fit model predictions to data.
Similar to a least-squares problem solved with Gauss-Newton, except that the problem size
(perhaps 10^6-10^7 unknowns) is very large.
Minimize:

$$
F(\alpha) \;=\; \underbrace{\frac{1}{2}\sum_{\tau=1}^{M}\,[z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,[z_\tau - h_\tau(x_{t(\tau)})]_m}_{\text{measurement error term}} \;+\; \underbrace{\frac{1}{2}\,[\alpha-\bar\alpha]_l\,[W_\alpha]_{lm}\,[\alpha-\bar\alpha]_m}_{\text{prior information (regularization) term}}
$$

Such that:

$$
x_{t+1} = g_t(x_t,\alpha) \qquad t = 0,\ldots,T-1
$$
$$
x_0 = \varphi(\alpha)
$$
Indicial notation is used for matrix and vector products (repeated indices are summed).
This generalized version of the least-squares objective includes a regularization term that
penalizes deviations of the parameter $\alpha$ from a specified first guess $\bar\alpha$.
State equation is a differential constraint similar to those considered in Lecture 11. However,
imbedding or response matrix methods described in Lecture 11 are not feasible for very large
problems.
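The objective above can be sketched in code. A minimal Python sketch, assuming a user-supplied state equation `g`, initial-condition map `x0_of_alpha`, measurement operator `h`, and weight matrices `W_z`, `W_a` (all names here are illustrative, not from the lecture):

```python
import numpy as np

# Minimal sketch of the generalized least-squares objective, assuming a
# user-supplied state equation g(t, x, alpha), measurement operator
# h(tau, x), and weight matrices W_z, W_a (all names illustrative).

def simulate(alpha, x0_of_alpha, g, T):
    """Run the state equation x_{t+1} = g_t(x_t, alpha) forward in time."""
    x = [x0_of_alpha(alpha)]          # x_0 = phi(alpha)
    for t in range(T):
        x.append(g(t, x[t], alpha))
    return x                          # trajectory x_0, ..., x_T

def objective(alpha, alpha_prior, W_a, obs, h, W_z, x0_of_alpha, g, T):
    """F(alpha): measurement-error term plus prior (regularization) term."""
    x = simulate(alpha, x0_of_alpha, g, T)
    F = 0.0
    for tau, (t_obs, z) in enumerate(obs):   # obs: list of (t(tau), z_tau)
        r = z - h(tau, x[t_obs])             # measurement residual
        F += 0.5 * r @ W_z @ r
    d = alpha - alpha_prior
    return F + 0.5 * d @ W_a @ d
```

Each evaluation costs one forward model run; at realistic problem sizes the optimizer also needs the gradient of this objective, which motivates the adjoint machinery.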
Variational/Adjoint Solutions
Very large nonlinear least-squares problems (e.g. data assimilation problems) are often solved
with gradient-based quasi-Newton (e.g. BFGS) or conjugate-gradient methods.
Key task in such iterative solution methods is computation of the objective function gradient
vector $dF(\alpha)/d\alpha$ at the current iterate $\alpha = \alpha_k$.
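As a concrete sketch of this workflow, the snippet below runs SciPy's L-BFGS implementation on a toy quadratic standing in for $F(\alpha)$; the weights and prior are illustrative values, and in a real assimilation problem the `jac` callback would return the adjoint-computed gradient:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of gradient-based quasi-Newton minimization with SciPy's L-BFGS-B.
# A toy quadratic stands in for the assimilation objective F(alpha); the
# weights and prior below are illustrative values, not from the lecture.

W = np.diag([1.0, 4.0])
alpha_bar = np.array([1.0, -2.0])

def F(alpha):
    d = alpha - alpha_bar
    return 0.5 * d @ W @ d

def dF(alpha):
    # In 4D-Var this callback would return the adjoint-computed gradient,
    # obtained from one backward (adjoint) model run.
    return W @ (alpha - alpha_bar)

result = minimize(F, np.zeros(2), jac=dF, method="L-BFGS-B")
# result.x converges to alpha_bar, the minimizer of the quadratic
```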
Find the gradient using a variational approach: incorporate the state equation equality constraint
and its initial condition with Lagrange multipliers $\lambda_t$; $t = 0,\ldots,T$.
Minimization of the Lagrange-augmented objective is the same as minimization of $F(\alpha)$, since
the Lagrange multiplier terms are identically zero when the state equation and initial condition are
satisfied.
$$
\begin{aligned}
F'(\alpha) = {}& \frac{1}{2}\sum_{\tau=1}^{M} [z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,[z_\tau - h_\tau(x_{t(\tau)})]_m + \frac{1}{2}\,[\alpha-\bar\alpha]_l\,[W_\alpha]_{lm}\,[\alpha-\bar\alpha]_m \\
& + \sum_{t=0}^{T-1} \lambda_{t+1,l}\,\big[\,x_{t+1,l} - g_{t,l}(x_t,\alpha)\,\big] + \lambda_{0,l}\,\big[\,x_{0,l} - \varphi_l(\alpha)\,\big]
\end{aligned}
$$

Taking the differential:

$$
\begin{aligned}
dF'(\alpha) = {}& -\sum_{\tau=1}^{M} [z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,\frac{\partial h_{\tau,m}(x_{t(\tau)})}{\partial x_{t(\tau),p}}\,dx_{t(\tau),p} + [\alpha-\bar\alpha]_l\,[W_\alpha]_{lm}\,d\alpha_m \\
& + \sum_{t=0}^{T-1} \lambda_{t+1,l}\Big[\,dx_{t+1,l} - \frac{\partial g_{t,l}(x_t,\alpha)}{\partial x_{t,m}}\,dx_{t,m} - \frac{\partial g_{t,l}(x_t,\alpha)}{\partial\alpha_m}\,d\alpha_m\Big] + \lambda_{0,l}\Big[\,dx_{0,l} - \frac{\partial\varphi_l(\alpha)}{\partial\alpha_m}\,d\alpha_m\Big]
\end{aligned}
$$
The differentials of the state as well as the parameter appear since the state depends indirectly on
the parameter through the state equation and its initial condition.
In order to identify the desired gradient collect coefficients of each differential:
$$
\begin{aligned}
dF(\alpha) = {}& \sum_{t=0}^{T-1}\Big[\,\lambda_{t,p} - \lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial x_{t,p}} - \sum_{\tau=1}^{M}\delta_{t,t(\tau)}\,[z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,\frac{\partial h_{\tau,m}(x_{t(\tau)})}{\partial x_{t(\tau),p}}\Big]\,dx_{t,p} \\
& + \lambda_{T,l}\,dx_{T,l} + \Big[\,[\alpha-\bar\alpha]_l\,[W_\alpha]_{lm} - \sum_{t=0}^{T-1}\lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial\alpha_m} - \lambda_{0,l}\,\frac{\partial\varphi_l(\alpha)}{\partial\alpha_m}\Big]\,d\alpha_m
\end{aligned}
$$

Here

$$
\delta_{t,t(\tau)} = \begin{cases} 1 & \text{if } t = t(\tau)\\ 0 & \text{otherwise} \end{cases}
$$

selects the measurement times included in the model time step sum.
We seek the total derivative $dF(\alpha)/d\alpha$ rather than the partial derivative $\partial F(\alpha)/\partial\alpha$ with $x_t$
fixed (since we wish to account for the dependence of $dx_t$ on $d\alpha$).
To isolate the effect of $d\alpha$, select the unknown $\lambda_t$ so the coefficient of $dx_t$ is zero.
This $\lambda_t$ satisfies the following adjoint equation:

$$
\lambda_{t,p} = \lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial x_{t,p}} + \sum_{\tau=1}^{M}\delta_{t,t(\tau)}\,[z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,\frac{\partial h_{\tau,m}(x_{t(\tau)})}{\partial x_{t,p}}
$$
$$
\lambda_{T,p} = 0
$$
This difference equation is solved backward in time ($t = T-1, \ldots, 1, 0$), from the specified
terminal condition $\lambda_T = 0$ to the initial value $\lambda_0$, much like the dynamic programming
backward recursion.
The measurement residual term in brackets acts as a forcing for the adjoint equation.
The equation $\lambda_{t,p} = \lambda_{t+1,l}\,\partial g_{t,l}(x_t,\alpha)/\partial x_{t,p} + \text{forcing}$ is linear in $\lambda$; its system
matrix is the (transposed) Jacobian of the tangent linear model obtained by linearizing $g_t$
about the current trajectory.
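The backward recursion can be sketched directly from the adjoint equation. A minimal Python sketch, assuming a stored forward trajectory `x` and user-supplied Jacobians `dg_dx` and `dh_dx` (names are illustrative):

```python
import numpy as np

# Backward-in-time sketch of the adjoint recursion, assuming a stored forward
# trajectory x[0..T] and user-supplied Jacobians dg_dx(t, x) and
# dh_dx(tau, x) of the state equation and measurement operator.

def adjoint_solve(x, obs, dg_dx, h, dh_dx, W_z, T):
    n = x[0].size
    lam = [np.zeros(n) for _ in range(T + 1)]    # terminal condition: lambda_T = 0
    for t in range(T - 1, -1, -1):               # march backward: t = T-1, ..., 0
        forcing = np.zeros(n)
        for tau, (t_obs, z) in enumerate(obs):
            if t_obs == t:                       # delta_{t, t(tau)} selector
                r = z - h(tau, x[t])             # measurement residual
                forcing += dh_dx(tau, x[t]).T @ (W_z @ r)
        lam[t] = dg_dx(t, x[t]).T @ lam[t + 1] + forcing
    return lam
```

The residual term enters only at measurement times, acting as the forcing; each optimization iteration then costs one forward run (to store `x`) plus one backward run like this.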
When $\lambda_t$ satisfies the adjoint equation the desired objective function gradient is:

$$
\frac{dF(\alpha)}{d\alpha_p} = [\alpha-\bar\alpha]_l\,[W_\alpha]_{lp} - \sum_{t=0}^{T-1}\lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial\alpha_p} - \lambda_{0,l}\,\frac{\partial\varphi_l(\alpha)}{\partial\alpha_p}
$$

When the prior weighting $W_\alpha$ is small or $\alpha$ is near $\bar\alpha$, the prior term is negligible and the
gradient is given by the adjoint terms alone; for initial-condition estimation ($\varphi(\alpha) = \alpha$, $g_t$
independent of $\alpha$) it reduces to $-\lambda_0$.
Example:
Scalar linear state equation (AR(1) process) with uncertain initial condition $\alpha$:

$$
x_{t+1} = g_t(x_t,\alpha) = \rho\,x_t + u_t \qquad t = 0,\ldots,T-1
$$
$$
x_0 = \varphi(\alpha) = \alpha
$$

$\rho$, $\bar\alpha$, and $u_t$ are given.
Measurement equation:

$$
z_\tau = x_{t(\tau)} + v_\tau
$$

Weights:

$$
W_{z,\tau\tau} = W_\alpha = 1
$$

The state solution is:

$$
x_t = \rho^t\,\alpha + \sum_{j=1}^{t}\rho^{\,t-j}\,u_{j-1} \qquad t = 0,\ldots,T
$$

Adjoint terminal condition: $\lambda_T = 0$.
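The scalar example can be carried through numerically. The sketch below (with illustrative values $\rho = 0.8$, $T = 5$, and a single measurement at $t = 3$, none of which are from the lecture) runs the forward model, solves the adjoint recursion backward from $\lambda_T = 0$, forms $dF/d\alpha = (\alpha - \bar\alpha) - \lambda_0$, and checks the result against a finite difference:

```python
import numpy as np

# Worked scalar AR(1) example: x_{t+1} = rho*x_t + u_t, x_0 = alpha, one
# measurement z = x_{t(tau)} + v, unit weights. The numbers below
# (rho = 0.8, T = 5, measurement at t = 3) are illustrative.

rho, T = 0.8, 5
u = 0.1 * np.ones(T)
alpha_bar = 0.0
obs = [(3, 1.5)]                         # (measurement time t(tau), value z)

def forward(alpha):
    x = np.empty(T + 1)
    x[0] = alpha                         # x_0 = phi(alpha) = alpha
    for t in range(T):
        x[t + 1] = rho * x[t] + u[t]
    return x

def F(alpha):
    x = forward(alpha)
    return 0.5 * sum((z - x[t_obs]) ** 2 for t_obs, z in obs) \
         + 0.5 * (alpha - alpha_bar) ** 2

def gradient(alpha):
    x = forward(alpha)
    lam = np.zeros(T + 1)                # terminal condition lambda_T = 0
    for t in range(T - 1, -1, -1):       # backward recursion
        forcing = sum(z - x[t] for t_obs, z in obs if t_obs == t)
        lam[t] = rho * lam[t + 1] + forcing
    # dF/dalpha = W_a (alpha - alpha_bar) - lambda_0, since dphi/dalpha = 1
    return (alpha - alpha_bar) - lam[0]

# Finite-difference check of the adjoint gradient at alpha = 1
eps = 1e-6
fd = (F(1.0 + eps) - F(1.0 - eps)) / (2 * eps)
# gradient(1.0) and fd agree to finite-difference accuracy
```

One backward sweep delivers the exact gradient at the cost of a single extra model integration, whereas a finite-difference gradient would need one forward run per component of $\alpha$.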