Background
Environmental models are increasing in size and complexity.
In many nonlinear problems (e.g. climate, atmospheric, oceanographic analysis,
subsurface transport, etc.) small-scale variability can have large-scale consequences.
This creates a need to resolve a large range of time and space scales (fine grids, extensive
coverage).
Data sets are also increasing in size and diversity (new in situ and remote sensing instruments,
better communications, etc.).
Need for automated methods to merge model predictions and measurements: data
assimilation.
Goal is to provide accurate descriptions of environmental conditions -- past, present, and future.
Important example: numerical weather prediction
Data Assimilation as an Optimization Problem
Basic objective is to obtain a physically consistent estimate of uncertain environmental variables
-- fit model predictions to data.
Similar to a least-squares problem solved with Gauss-Newton, except that the problem size
(perhaps 10^6-10^7 unknowns) is very large.
Minimize:

$$
F(\alpha) \;=\; \underbrace{\frac{1}{2}\sum_{\tau=1}^{M}\,[z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,[z_\tau - h_\tau(x_{t(\tau)})]_m}_{\text{measurement error term}} \;+\; \underbrace{\frac{1}{2}\,[\alpha-\bar\alpha]_l\,[W_\alpha]_{lm}\,[\alpha-\bar\alpha]_m}_{\text{prior information (regularization) term}}
$$

Such that:

$$
x_{t+1} = g_t(x_t,\alpha) \qquad t = 0,\ldots,T-1
$$
$$
x_0 = \varphi(\alpha)
$$
Indicial notation is used for matrix and vector products (repeated indices are summed).
This generalized version of the least-squares objective includes a regularization term that
penalizes deviations of the parameter $\alpha$ from a specified first guess $\bar\alpha$.
State equation is a differential constraint similar to those considered in Lecture 11. However,
imbedding or response matrix methods described in Lecture 11 are not feasible for very large
problems.
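The objective above can be sketched in code. A minimal Python sketch, assuming a user-supplied state equation `g`, initial-condition map `x0_of_alpha`, measurement operator `h`, and weight matrices `W_z`, `W_a` (all names here are illustrative, not from the lecture):

```python
import numpy as np

# Minimal sketch of the generalized least-squares objective, assuming a
# user-supplied state equation g(t, x, alpha), measurement operator
# h(tau, x), and weight matrices W_z, W_a (all names illustrative).

def simulate(alpha, x0_of_alpha, g, T):
    """Run the state equation x_{t+1} = g_t(x_t, alpha) forward in time."""
    x = [x0_of_alpha(alpha)]          # x_0 = phi(alpha)
    for t in range(T):
        x.append(g(t, x[t], alpha))
    return x                          # trajectory x_0, ..., x_T

def objective(alpha, alpha_prior, W_a, obs, h, W_z, x0_of_alpha, g, T):
    """F(alpha): measurement-error term plus prior (regularization) term."""
    x = simulate(alpha, x0_of_alpha, g, T)
    F = 0.0
    for tau, (t_obs, z) in enumerate(obs):   # obs: list of (t(tau), z_tau)
        r = z - h(tau, x[t_obs])             # measurement residual
        F += 0.5 * r @ W_z @ r
    d = alpha - alpha_prior
    return F + 0.5 * d @ W_a @ d
```

Each evaluation costs one forward model run; at realistic problem sizes the optimizer also needs the gradient of this objective, which motivates the adjoint machinery.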
Variational/Adjoint Solutions
Very large nonlinear least-squares problems (e.g. data assimilation problems) are often solved
with gradient-based quasi-Newton (e.g. BFGS) or conjugate-gradient methods.
Key task in such iterative solution methods is computation of the objective function gradient
vector $dF(\alpha)/d\alpha$ at the current iterate $\alpha = \alpha_k$.
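As a concrete sketch of this workflow, the snippet below runs SciPy's L-BFGS implementation on a toy quadratic standing in for $F(\alpha)$; the weights and prior are illustrative values, and in a real assimilation problem the `jac` callback would return the adjoint-computed gradient:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of gradient-based quasi-Newton minimization with SciPy's L-BFGS-B.
# A toy quadratic stands in for the assimilation objective F(alpha); the
# weights and prior below are illustrative values, not from the lecture.

W = np.diag([1.0, 4.0])
alpha_bar = np.array([1.0, -2.0])

def F(alpha):
    d = alpha - alpha_bar
    return 0.5 * d @ W @ d

def dF(alpha):
    # In 4D-Var this callback would return the adjoint-computed gradient,
    # obtained from one backward (adjoint) model run.
    return W @ (alpha - alpha_bar)

result = minimize(F, np.zeros(2), jac=dF, method="L-BFGS-B")
# result.x converges to alpha_bar, the minimizer of the quadratic
```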
Find the gradient using a variational approach: incorporate the state equation equality constraint
and its initial condition with Lagrange multipliers $\lambda_t$; $t = 0,\ldots,T$.
Minimization of the Lagrange-augmented objective is the same as minimization of $F(\alpha)$, since
the Lagrange multiplier terms are identically zero when the state equation and initial condition are
satisfied.
$$
\begin{aligned}
F'(\alpha) = {}& \frac{1}{2}\sum_{\tau=1}^{M} [z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,[z_\tau - h_\tau(x_{t(\tau)})]_m + \frac{1}{2}\,[\alpha-\bar\alpha]_l\,[W_\alpha]_{lm}\,[\alpha-\bar\alpha]_m \\
& + \sum_{t=0}^{T-1} \lambda_{t+1,l}\,\big[\,x_{t+1,l} - g_{t,l}(x_t,\alpha)\,\big] + \lambda_{0,l}\,\big[\,x_{0,l} - \varphi_l(\alpha)\,\big]
\end{aligned}
$$

Taking the differential:

$$
\begin{aligned}
dF'(\alpha) = {}& -\sum_{\tau=1}^{M} [z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,\frac{\partial h_{\tau,m}(x_{t(\tau)})}{\partial x_{t(\tau),p}}\,dx_{t(\tau),p} + [\alpha-\bar\alpha]_l\,[W_\alpha]_{lm}\,d\alpha_m \\
& + \sum_{t=0}^{T-1} \lambda_{t+1,l}\Big[\,dx_{t+1,l} - \frac{\partial g_{t,l}(x_t,\alpha)}{\partial x_{t,m}}\,dx_{t,m} - \frac{\partial g_{t,l}(x_t,\alpha)}{\partial\alpha_m}\,d\alpha_m\Big] + \lambda_{0,l}\Big[\,dx_{0,l} - \frac{\partial\varphi_l(\alpha)}{\partial\alpha_m}\,d\alpha_m\Big]
\end{aligned}
$$
The differentials of the state as well as the parameter appear since the state depends indirectly on
the parameter through the state equation and its initial condition.
In order to identify the desired gradient collect coefficients of each differential:
$$
\begin{aligned}
dF(\alpha) = {}& \sum_{t=0}^{T-1}\Big[\,\lambda_{t,p} - \lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial x_{t,p}} - \sum_{\tau=1}^{M}\delta_{t,t(\tau)}\,[z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,\frac{\partial h_{\tau,m}(x_{t(\tau)})}{\partial x_{t(\tau),p}}\Big]\,dx_{t,p} \\
& + \lambda_{T,l}\,dx_{T,l} + \Big[\,[\alpha-\bar\alpha]_l\,[W_\alpha]_{lm} - \sum_{t=0}^{T-1}\lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial\alpha_m} - \lambda_{0,l}\,\frac{\partial\varphi_l(\alpha)}{\partial\alpha_m}\Big]\,d\alpha_m
\end{aligned}
$$

Here

$$
\delta_{t,t(\tau)} = \begin{cases} 1 & \text{if } t = t(\tau)\\ 0 & \text{otherwise} \end{cases}
$$

selects the measurement times included in the model time step sum.
We seek the total derivative $dF(\alpha)/d\alpha$ rather than the partial derivative $\partial F(\alpha)/\partial\alpha$ with $x_t$
fixed (since we wish to account for the dependence of $dx_t$ on $d\alpha$).
To isolate the effect of $d\alpha$, select the unknown $\lambda_t$ so the coefficient of $dx_t$ is zero.
This $\lambda_t$ satisfies the following adjoint equation:

$$
\lambda_{t,p} = \lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial x_{t,p}} + \sum_{\tau=1}^{M}\delta_{t,t(\tau)}\,[z_\tau - h_\tau(x_{t(\tau)})]_l\,[W_z]_{lm}\,\frac{\partial h_{\tau,m}(x_{t(\tau)})}{\partial x_{t,p}}
$$
$$
\lambda_{T,p} = 0
$$
This difference equation is solved backward in time ($t = T-1, \ldots, 1, 0$), from the specified
terminal condition $\lambda_T = 0$ to the initial value $\lambda_0$, much like the dynamic programming
backward recursion.
The measurement residual term in brackets acts as a forcing for the adjoint equation.
The equation $\lambda_{t,p} = \lambda_{t+1,l}\,\partial g_{t,l}(x_t,\alpha)/\partial x_{t,p} + \text{forcing}$ is linear in $\lambda$; its system
matrix is the (transposed) Jacobian of the tangent linear model obtained by linearizing $g_t$
about the current trajectory.
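The backward recursion can be sketched directly from the adjoint equation. A minimal Python sketch, assuming a stored forward trajectory `x` and user-supplied Jacobians `dg_dx` and `dh_dx` (names are illustrative):

```python
import numpy as np

# Backward-in-time sketch of the adjoint recursion, assuming a stored forward
# trajectory x[0..T] and user-supplied Jacobians dg_dx(t, x) and
# dh_dx(tau, x) of the state equation and measurement operator.

def adjoint_solve(x, obs, dg_dx, h, dh_dx, W_z, T):
    n = x[0].size
    lam = [np.zeros(n) for _ in range(T + 1)]    # terminal condition: lambda_T = 0
    for t in range(T - 1, -1, -1):               # march backward: t = T-1, ..., 0
        forcing = np.zeros(n)
        for tau, (t_obs, z) in enumerate(obs):
            if t_obs == t:                       # delta_{t, t(tau)} selector
                r = z - h(tau, x[t])             # measurement residual
                forcing += dh_dx(tau, x[t]).T @ (W_z @ r)
        lam[t] = dg_dx(t, x[t]).T @ lam[t + 1] + forcing
    return lam
```

The residual term enters only at measurement times, acting as the forcing; each optimization iteration then costs one forward run (to store `x`) plus one backward run like this.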
When $\lambda_t$ satisfies the adjoint equation the desired objective function gradient is:

$$
\frac{dF(\alpha)}{d\alpha_p} = [\alpha-\bar\alpha]_l\,[W_\alpha]_{lp} - \sum_{t=0}^{T-1}\lambda_{t+1,l}\,\frac{\partial g_{t,l}(x_t,\alpha)}{\partial\alpha_p} - \lambda_{0,l}\,\frac{\partial\varphi_l(\alpha)}{\partial\alpha_p}
$$

When the prior weighting $W_\alpha$ is small or $\alpha$ is near $\bar\alpha$, the prior term is negligible and the
gradient is given by the adjoint terms alone; for initial-condition estimation ($\varphi(\alpha) = \alpha$, $g_t$
independent of $\alpha$) it reduces to $-\lambda_0$.
Example:
Scalar linear state equation (AR(1) process) with uncertain initial condition $\alpha$:

$$
x_{t+1} = g_t(x_t,\alpha) = \rho\,x_t + u_t \qquad t = 0,\ldots,T-1
$$
$$
x_0 = \varphi(\alpha) = \alpha
$$

$\rho$, $\bar\alpha$, and $u_t$ are given.
Measurement equation:

$$
z_\tau = x_{t(\tau)} + v_\tau
$$

Weights:

$$
W_{z,\tau\tau} = W_\alpha = 1
$$

The state solution is:

$$
x_t = \rho^t\,\alpha + \sum_{j=1}^{t}\rho^{\,t-j}\,u_{j-1} \qquad t = 0,\ldots,T
$$

Adjoint terminal condition: $\lambda_T = 0$.
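The scalar example can be carried through numerically. The sketch below (with illustrative values $\rho = 0.8$, $T = 5$, and a single measurement at $t = 3$, none of which are from the lecture) runs the forward model, solves the adjoint recursion backward from $\lambda_T = 0$, forms $dF/d\alpha = (\alpha - \bar\alpha) - \lambda_0$, and checks the result against a finite difference:

```python
import numpy as np

# Worked scalar AR(1) example: x_{t+1} = rho*x_t + u_t, x_0 = alpha, one
# measurement z = x_{t(tau)} + v, unit weights. The numbers below
# (rho = 0.8, T = 5, measurement at t = 3) are illustrative.

rho, T = 0.8, 5
u = 0.1 * np.ones(T)
alpha_bar = 0.0
obs = [(3, 1.5)]                         # (measurement time t(tau), value z)

def forward(alpha):
    x = np.empty(T + 1)
    x[0] = alpha                         # x_0 = phi(alpha) = alpha
    for t in range(T):
        x[t + 1] = rho * x[t] + u[t]
    return x

def F(alpha):
    x = forward(alpha)
    return 0.5 * sum((z - x[t_obs]) ** 2 for t_obs, z in obs) \
         + 0.5 * (alpha - alpha_bar) ** 2

def gradient(alpha):
    x = forward(alpha)
    lam = np.zeros(T + 1)                # terminal condition lambda_T = 0
    for t in range(T - 1, -1, -1):       # backward recursion
        forcing = sum(z - x[t] for t_obs, z in obs if t_obs == t)
        lam[t] = rho * lam[t + 1] + forcing
    # dF/dalpha = W_a (alpha - alpha_bar) - lambda_0, since dphi/dalpha = 1
    return (alpha - alpha_bar) - lam[0]

# Finite-difference check of the adjoint gradient at alpha = 1
eps = 1e-6
fd = (F(1.0 + eps) - F(1.0 - eps)) / (2 * eps)
# gradient(1.0) and fd agree to finite-difference accuracy
```

One backward sweep delivers the exact gradient at the cost of a single extra model integration, whereas a finite-difference gradient would need one forward run per component of $\alpha$.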