
State Space Models, Kalman Filter and Smoothing

The idea that any dynamic system can be expressed in a particular representation, called the state space representation, was proposed by Kalman. He also presented an algorithm, a set of rules, to sequentially forecast and update a set of projections of the unknown state vector.
State space representation of a dynamic system: the general case
State space models were originally developed by control engineers to represent a dynamic system or dynamic linear model. Interest normally centres on an (m × 1) vector of variables, called the state vector, which may contain signals from a satellite or the actual position of a missile or rocket. The state vector represents the dynamics of the process. More precisely, it retains all the memory in the process: all dependence between past and future must funnel through the state vector. The elements of the state vector may not have any specific economic meaning, but the state space approach is popular in economic applications involving unobserved or latent variables, such as permanent income, the NAIRU (Non-Accelerating Inflation Rate of Unemployment), expected inflation, the state of the economy in business cycle analysis, etc. In most cases such signals are not observable directly, but the state vector is related to an (n × 1) vector z_t of variables that are actually observed, through an equation called the measurement equation or observation equation, given by

z_t = A_t x_t + Y_t α_t + N_t                  (1)

where Y_t and A_t are parameter matrices of order (n × m) and (n × k) respectively, x_t is a (k × 1) vector of exogenous or predetermined variables, and N_t is an (n × 1) vector of disturbances with zero mean and covariance matrix H_t.
Although the state vector α_t is not directly observable, its movements are assumed to be governed by a well defined process, called the transition equation or state equation, given by

α_t = T_t α_{t-1} + R_t η_t,   t = 1, ..., T,                  (2)

where T_t and R_t are matrices of order (m × m) and (m × g) respectively, and η_t is a (g × 1) vector of disturbances with mean zero and covariance matrix Q_t.
Remarks:
1. Note that in the measurement equation we have an added disturbance term N_t. We need it only if we assume that what we observe is contaminated by additional noise; otherwise we simply have

   z_t = A_t x_t + Y_t α_t.                  (3)

2. In those cases where we allow additional noise to be part of the measurement equation, we assume that the disturbances in the measurement and transition equations are mutually and serially uncorrelated at all time periods. Additionally, they are uncorrelated with the initial state vector α_0. We summarize these assumptions as:

   [ N_t ]          [ H_t   0   ]
   [ η_t ]  ~  WN(0,[ 0     Q_t ]),   t = 0, 1, ..., T,                  (4)

   and

   E(α_0 η_t') = 0,   E(α_0 N_t') = 0,   t = 1, ..., T.                  (5)
3. x_t is a (k × 1) vector of predetermined or exogenous variables. This means it may contain lagged values of z as well as variables that are uncorrelated with η_t and N_t.

4. The equations and the assumptions about the error vectors are generally used to describe a finite series of observations {z_1, z_2, ..., z_T}, for which we need some assumptions about the initial value of the state vector, α_1. We therefore assume that α_1 is uncorrelated with any realizations of η_t and N_t. That is,

   E(η_t α_1') = 0   for t = 1, ..., T,                  (6)
   E(N_t α_1') = 0   for t = 1, ..., T.                  (7)

   Additionally, we assume that η_t is uncorrelated with lagged values of α_t. That is,

   E(η_t α_τ') = 0   for τ = t-1, t-2, ..., 1.                  (8)

   Similarly,

   E(N_t α_τ') = 0   for τ = 1, 2, ..., T,                  (9)
   E(N_t z_τ') = E[N_t (A_τ x_τ + Y_τ α_τ + N_τ)'] = 0   for τ = t-1, t-2, ..., 1,                  (10)
   E(η_t z_τ') = 0   for τ = t-1, t-2, ..., 1.                  (11)
These assumptions have been made to make the system quite flexible. Any assumption can be
relaxed and the results generalized.
5. Notice that we have not said anything about the sizes of the matrices or the dimension of the state vector. Suffice it to say at this point that the dimensions have to be large enough for the dynamics of the system to be captured by the simple first order Markov structure of the state equation. From a technical point of view, the aim of the state space form is to set up α_t with as small a number of elements as possible. Such a state space set-up is called a minimal realization, and it is a basic criterion for a good state space form.
6. In many cases of interest only one observation is available in each time period, that is, z_t is a scalar in the measurement equation. Also, the transition matrix is often much simpler than given before, in the sense that the parameters, including the variance, are in most cases assumed to be time invariant. The transition equation then becomes

   α_t = T α_{t-1} + R η_t,   t = 1, ..., T,                  (12)

   with

   η_t ~ WN(0, σ² Q).                  (13)
7. For many applications using the Kalman filter, the vector of exogenous variables is simply not necessary. One may also assume that the variance of the noise term is time invariant, so that the general system boils down to:

   z_t = y_t' α_t + N_t,   t = 1, ..., T,                  (14)
   α_t = T α_{t-1} + R η_t,   t = 1, ..., T,                  (15)

   where z_t is now a scalar, N_t ~ WN(0, σ² h) and y_t' is a (1 × m) vector. In some state space applications, especially those that use ARMA models, the measurement error in the observation equation, N_t, is assumed to be zero; that is, N_t in such applications will be absent.
8. There are many ways to write a given system in state space form. But, written in any way, if our primary interest is forecasting, we would get identical forecasts no matter which form we use. Note also that we can write any state space form as an ARMA model, so there is an equivalence between the two forms.

Examples of state space representation:
Example 1: First let us consider the general ARMA(p, q) model and see how it can be cast in state space form. Defining m = max(p, q + 1), an ARMA(p, q) model can be written in the form

z_t = φ_1 z_{t-1} + φ_2 z_{t-2} + ... + φ_m z_{t-m} + θ_1 e_{t-1} + θ_2 e_{t-2} + ... + θ_{m-1} e_{t-m+1} + e_t,

where we interpret φ_j = 0 for j > p and θ_j = 0 for j > q.

Then we can write the state and observation equations as follows:
State equation:

α_t = [ φ_1              ]           [ 1       ]
      [ φ_2   I_{m-1}    ] α_{t-1} + [ θ_1     ] e_t
      [  ⋮               ]           [  ⋮      ]
      [ φ_m   0  ⋯  0    ]           [ θ_{m-1} ]

Observation equation:

z_t = (1  0  ...  0) α_t.
The original model can easily be recovered by repeated substitution, starting at the bottom row of the state equation; the first element of the state vector is then seen to be identically equal to the right hand side of the given model for z_t.
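This construction is mechanical enough to automate. The following is a minimal numpy sketch (the helper name arma_to_state_space is ours and purely illustrative, not part of any particular library) that assembles the transition matrix T, the loading vector R and the observation vector y' of Example 1 from given AR and MA coefficients.

```python
import numpy as np

def arma_to_state_space(phi, theta):
    """Build T (m x m), R (m x 1) and y (m,) of Example 1 for an ARMA(p, q)
    model, with m = max(p, q + 1) and coefficients beyond p or q set to zero."""
    p, q = len(phi), len(theta)
    m = max(p, q + 1)
    phi_full = np.zeros(m)
    phi_full[:p] = phi
    theta_full = np.zeros(m - 1)
    theta_full[:q] = theta

    # AR coefficients in the first column, identity block in the upper right.
    T = np.zeros((m, m))
    T[:, 0] = phi_full
    T[:m - 1, 1:] = np.eye(m - 1)

    # R = (1, theta_1, ..., theta_{m-1})'
    R = np.concatenate(([1.0], theta_full)).reshape(m, 1)

    # Observation vector y' = (1, 0, ..., 0)
    y = np.zeros(m)
    y[0] = 1.0
    return T, R, y

# For ARMA(1, 1) this reproduces the matrices shown in Example 3 below.
T, R, y = arma_to_state_space([0.7], [0.4])
```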
Example 2: Let us next consider a univariate AR(p) process:

z_t = φ_1 z_{t-1} + φ_2 z_{t-2} + ... + φ_p z_{t-p} + e_t,

where φ(B) = (1 - φ_1 B - φ_2 B² - ... - φ_p B^p) is the AR operator and e_t is white noise. This can be put in state space form by writing the (m × 1) state vector, α_t, with m = p in the present case, as follows:
State equation:

α_t = [ φ_1              ]           [ 1 ]
      [ φ_2   I_{m-1}    ] α_{t-1} + [ 0 ] e_t
      [  ⋮               ]           [ ⋮ ]
      [ φ_m   0  ⋯  0    ]           [ 0 ]

Observation equation:

z_t = (1  0  ...  0) α_t.
Defining α_t = (α_{1t}  α_{2t}  ...  α_{mt})' and substituting from the bottom row upwards, we recover the original AR model.
Example 3: Let us next consider the following ARMA(1, 1) model, for which m = 2:

z_t = φ_1 z_{t-1} + θ_1 e_{t-1} + e_t.

For this model the state and measurement equations are given below:

  
State equation:

α_t = [ φ_1  1 ] α_{t-1} + [ 1   ] e_t
      [ 0    0 ]           [ θ_1 ]

Observation equation:

z_t = (1  0) α_t.
If we define α_t = (α_{1t}  α_{2t})', then

α_{2t} = θ_1 e_t,
α_{1t} = φ_1 α_{1,t-1} + α_{2,t-1} + e_t
       = φ_1 z_{t-1} + θ_1 e_{t-1} + e_t,

and this is precisely the original model.

Example 4: As a final example of this construction, we shall consider the first order moving average model, assuming that the process has zero mean:

z_t = e_t + θ_1 e_{t-1}.

Here m = 2, so the state and measurement equations are as follows:

   
State equation:

α_t = [ 0  1 ] α_{t-1} + [ 1   ] e_t
      [ 0  0 ]           [ θ_1 ]

Observation equation:

z_t = (1  0) α_t.

If we define α_t = (α_{1t}  α_{2t})', then α_{2t} = θ_1 e_t and α_{1t} = α_{2,t-1} + e_t = e_t + θ_1 e_{t-1}, and this is precisely the original model.

We have seen before that there are many ways of writing a given system in state space form. We
shall here give an example of writing the AR(p) process in a different way.

Example 5: As before, let m = p and define the state vector as α_t = (z_t  z_{t-1}  ...  z_{t-p+1})'. The state and observation equations are:
State equation:

[ z_t       ]   [ φ_1  φ_2  ⋯  φ_{p-1}  φ_p ] [ z_{t-1} ]   [ 1 ]
[ z_{t-1}   ] = [ 1    0    ⋯  0        0   ] [ z_{t-2} ] + [ 0 ] e_t
[  ⋮        ]   [  ⋮            ⋱       ⋮   ] [  ⋮      ]   [ ⋮ ]
[ z_{t-p+1} ]   [ 0    ⋯        1       0   ] [ z_{t-p} ]   [ 0 ]
     α_t                      T                  α_{t-1}       R

Observation equation:

z_t = (1  0  ...  0) α_t,   with y_t' = (1  0  ...  0).
In this case, by carrying out the matrix multiplication on the RHS of the state equation, we notice that the first row gives the original AR model and the remaining rows, as well as the observation equation, are trivial identities.

Example 6: Let us return to the ARMA(p, q) model that we have seen before:

z_t = φ_1 z_{t-1} + φ_2 z_{t-2} + ... + φ_m z_{t-m} + θ_1 e_{t-1} + θ_2 e_{t-2} + ... + θ_{m-1} e_{t-m+1} + e_t,

where we interpret φ_j = 0 for j > p and θ_j = 0 for j > q, and m = max(p, q + 1). We shall re-write it in a way different from what we saw in Example 1. The state and observation equations are now:
State equation:

α_{t+1} = [ φ_1  φ_2  ⋯  φ_{m-1}  φ_m ]       [ e_{t+1} ]
          [ 1    0    ⋯  0        0   ] α_t + [ 0       ]
          [ ⋮              ⋱      ⋮   ]       [  ⋮      ]
          [ 0    0    ⋯  1        0   ]       [ 0       ]

Observation equation:

z_t = μ + (1  θ_1  ...  θ_{m-1}) α_t.
We shall take the ARMA(1, 1) model, write its state space form as given in Example 6, and retrieve the original model. For ARMA(1, 1), m = 2, so the state and observation equations are:

State equation:

α_{t+1} = [ φ_1  0 ] α_t + [ e_{t+1} ]
          [ 1    0 ]       [ 0       ]

Observation equation:

z_t = μ + (1  θ_1) α_t.

Starting from the second row of the state equation, we have

α_{2,t+1} = α_{1,t}.

The first row of the state equation implies that

α_{1,t+1} = φ_1 α_{1,t} + e_{t+1},
or   (1 - φ_1 B) α_{1,t+1} = e_{t+1}.   ... (1)

The observation equation states that

z_t = μ + α_{1,t} + θ_1 α_{2,t}
    = μ + α_{1,t} + θ_1 α_{1,t-1}
    = μ + (1 + θ_1 B) α_{1,t}.   ... (2)

Multiplying (2) by (1 - φ_1 B) gives

(1 - φ_1 B)(z_t - μ) = (1 - φ_1 B)(1 + θ_1 B) α_{1,t}
                     = (1 + θ_1 B) e_t   [from (1)],

which is the given model.
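As a rough companion sketch (helper name again ours and illustrative), the matrices of this alternative representation can be assembled in the same spirit as before; note that the AR coefficients now occupy the first row of the transition matrix and the MA coefficients move into the observation vector.

```python
import numpy as np

def arma_to_alternative_form(phi, theta, mu=0.0):
    """State-space matrices for the Example 6 form:
        alpha_{t+1} = T alpha_t + (e_{t+1}, 0, ..., 0)'
        z_t         = mu + (1, theta_1, ..., theta_{m-1}) alpha_t
    """
    p, q = len(phi), len(theta)
    m = max(p, q + 1)
    phi_full = np.zeros(m)
    phi_full[:p] = phi
    theta_full = np.zeros(m - 1)
    theta_full[:q] = theta

    # Companion matrix: AR coefficients in the first row, ones on the subdiagonal.
    T = np.zeros((m, m))
    T[0, :] = phi_full
    T[1:, :m - 1] = np.eye(m - 1)

    # The shock enters only the first element of the state.
    R = np.zeros((m, 1))
    R[0, 0] = 1.0

    # Observation vector carries the MA coefficients.
    y = np.concatenate(([1.0], theta_full))
    return T, R, y, mu

# ARMA(1, 1): T = [[phi_1, 0], [1, 0]], y' = (1, theta_1), as derived above.
T, R, y, mu = arma_to_alternative_form([0.7], [0.4])
```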

Example 7: Let us take an example of a state space formulation for an economic problem. Fama and Gibbons (Journal of Monetary Economics, 1982, 9, pp. 297-323) use the state space idea to study the behaviour of the ex-ante real interest rate (defined as the nominal interest rate, i_t, minus the expected inflation rate, π_t^e). This is unobservable because we do not have data on the anticipated rate of inflation. Thus, the state variable is

α_t = i_t - π_t^e - μ,

where μ is the mean of the ex-ante real interest rate. Fama and Gibbons assume that the (demeaned) ex-ante real interest rate follows an AR(1) process:

α_{t+1} = φ α_t + e_{t+1}.

But an econometrician has data on the ex-post real interest rate (that is, the nominal interest rate, i_t, minus the actual rate of inflation, π_t). That is,

i_t - π_t = (i_t - π_t^e) + (π_t^e - π_t)
          = μ + α_t + N_t,

where N_t = π_t^e - π_t is the error agents make in forecasting inflation. If people forecast optimally, then N_t should be free of autocorrelation and should be uncorrelated with the ex-ante real interest rate.

Kalman Filter: An Overview


Consider the system given by the following equations:

z_t = y_t' α_t + N_t,   t = 1, ..., T,
α_t = T α_{t-1} + R η_t,   t = 1, ..., T.

Given this, our objective could be either to obtain the values of the unknown parameters or, given the parameter vector, to obtain the linear least squares forecasts of the state vector on the basis of the observed data. The Kalman filter (KF hereafter) has many uses; we utilise it here as an algorithm to evaluate the components of the likelihood function. Kalman filtering follows a two-step procedure. In the first step, the optimal predictor of the next observation is formed, based on all the information currently available; this is done by the prediction equations. In the second step, the moment a new observation becomes available, it is incorporated into the estimator of the state vector using the updating equations. These two sets of equations collectively form the Kalman filter. Applied recursively, the KF provides an optimal solution to the twin problems of prediction and updating. Assuming that the observations are normally distributed and that the current estimator of the state vector is the best available, the prediction and updating estimators are the best, in the sense of having minimum mean squared error (MMSE). The process of predicting the next observation and updating as soon as the actual value becomes available has an interesting by-product: the prediction error. We have seen in the chapter on estimation how a set of dependent observations can be decomposed in terms of prediction errors; the KF gives us a natural mechanism to carry out this decomposition.

Kalman filter recursions: main equations
We shall use a_t to denote the MMSE estimator of α_t based on all information up to and including the current observation z_t. Similarly, a_{t|t-1} is the MMSE estimator of α_t given information up to time t-1; that is, a_{t|t-1} = E(α_t | I_{t-1}).
Prediction:
At time t-1, all available information, including z_{t-1}, is incorporated in a_{t-1}, which is the MMSE estimator of α_{t-1}. The associated estimation error has covariance matrix σ² P_{t-1}; more precisely,

σ² P_{t-1} = E[(α_{t-1} - a_{t-1})(α_{t-1} - a_{t-1})'].

From

α_t = T α_{t-1} + R η_t,

we get that, at time t-1, the MMSE estimator of α_t is given by

a_{t|t-1} = T a_{t-1},

so that the estimation error, or sampling error, is given by

α_t - a_{t|t-1} = T(α_{t-1} - a_{t-1}) + R η_t.

The right hand side of this estimation error has zero expectation. Note that an estimator is unconditionally unbiased (u-unbiased) if its estimation error has zero expectation, and that when an estimator is u-unbiased its MSE matrix, E[(α_t - a_{t|t-1})(α_t - a_{t|t-1})'], is identical to the covariance matrix of the estimation error. Hence we can write the covariance of the estimation error as

E[(α_t - a_{t|t-1})(α_t - a_{t|t-1})']
  = E{[T(α_{t-1} - a_{t-1}) + R η_t][T(α_{t-1} - a_{t-1}) + R η_t]'}
  = T E[(α_{t-1} - a_{t-1})(α_{t-1} - a_{t-1})'] T' + T E[(α_{t-1} - a_{t-1}) η_t'] R'
    + R E[η_t (α_{t-1} - a_{t-1})'] T' + R E[η_t η_t'] R'
  = σ² T P_{t-1} T' + σ² R Q R',

since η_t is uncorrelated with the estimation error at time t-1, so the cross product terms vanish.

Thus,

(α_t - a_{t|t-1}) ~ WS(0, σ² P_{t|t-1}),

where

P_{t|t-1} = T P_{t-1} T' + R Q R'

and WS stands for wide sense (weak stationarity is sometimes referred to as wide sense stationarity).

Given that a_{t|t-1} is the MMSE estimator of α_t at time t-1, the MMSE estimator of z_t at time t-1 is clearly

z_{t|t-1} = y_t' a_{t|t-1}.

The associated prediction error is

z_t - z_{t|t-1} = ν_t = y_t'(α_t - a_{t|t-1}) + N_t,


 

the expectation of which is zero. Hence,

var(ν_t) = E(ν_t²) = E[y_t'(α_t - a_{t|t-1})(α_t - a_{t|t-1})' y_t] + E(N_t²)
         = σ² y_t' P_{t|t-1} y_t + σ² h = σ² f_t

[since the cross product terms have zero expectation].

Deriving the state updating equations is more involved, and hence we state only the main equations below.
Updating equations:

a_t = a_{t|t-1} + P_{t|t-1} y_t (z_t - y_t' a_{t|t-1}) / f_t,

and the estimation error satisfies

(α_t - a_t) ~ WS(0, σ² P_t),

where

P_t = P_{t|t-1} - P_{t|t-1} y_t y_t' P_{t|t-1} / f_t,   with f_t = y_t' P_{t|t-1} y_t + h.
We highlight the following points:
1. Note the role played by the prediction error, ν_t = z_t - y_t' a_{t|t-1}, and the variance associated with it, σ² f_t.
2. Note also the (m × 1) vector P_{t|t-1} y_t / f_t, which is called the Kalman gain.
3. In the discussion so far, we have assumed the presence of an additional noise in the measurement equation, that is, h > 0. But note that, in our examples of state space representations of ARMA models, we assumed that the measurement equation has no additional error: N_t is zero, implying that h, the variance of the measurement error term, is zero. This does not matter, since h has been isolated as an additive scalar (note the expression for f_t), and setting it to zero does not affect the calculations.
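The prediction and updating equations translate almost line by line into code. The sketch below (function name ours; y, T, R, Q and h are assumed time invariant, as in equations (14)-(15)) runs one filtering pass and returns the prediction errors ν_t and their scaled variances f_t; all covariance matrices are stored scaled by σ², exactly as P_t and f_t are defined in the text.

```python
import numpy as np

def kalman_filter(z, y, T, R, Q, h, a0, P0):
    """Prediction/updating recursions for the scalar-observation system
        z_t = y' alpha_t + N_t,              N_t  ~ (0, sigma^2 h)
        alpha_t = T alpha_{t-1} + R eta_t,   eta_t ~ (0, sigma^2 Q).
    Returns nu_t, f_t and the one-step-ahead quantities a_{t|t-1}, P_{t|t-1}."""
    a = np.asarray(a0, dtype=float).copy()
    P = np.asarray(P0, dtype=float).copy()
    nu, f, a_pred, P_pred = [], [], [], []
    for zt in z:
        # Prediction: a_{t|t-1} = T a_{t-1},  P_{t|t-1} = T P_{t-1} T' + R Q R'
        a_p = T @ a
        P_p = T @ P @ T.T + R @ Q @ R.T
        # Prediction error and its (sigma^2-scaled) variance
        nu_t = zt - y @ a_p
        f_t = y @ P_p @ y + h
        # Updating; P_{t|t-1} y / f_t is the Kalman gain
        a = a_p + (P_p @ y) * (nu_t / f_t)
        P = P_p - np.outer(P_p @ y, y @ P_p) / f_t
        nu.append(nu_t); f.append(f_t)
        a_pred.append(a_p); P_pred.append(P_p)
    return np.array(nu), np.array(f), a_pred, P_pred

# MA(1) with theta_1 = 0.5 and no measurement noise (h = 0), as in Example 9 below:
theta = 0.5
T = np.array([[0.0, 1.0], [0.0, 0.0]])
R = np.array([[1.0], [theta]])
Q = np.array([[1.0]])
y = np.array([1.0, 0.0])
P0 = np.array([[1 + theta**2, theta], [theta, theta**2]])
nu, f, _, _ = kalman_filter([0.3, -0.1, 0.4], y, T, R, Q, 0.0, np.zeros(2), P0)
```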
-
ML Estimation of ARMA models
The literature has many algorithms aimed at simplifying the computation of the components of the likelihood. One approach is to use the Kalman filter recursions. Other useful algorithms are those of Newbold (Biometrika, 1974, Vol. 61, 423-26) and the innovations algorithm suggested by Ansley (Biometrika, 1979, Vol. 66, 59-65).
KF recursions are useful for a number of purposes, but our emphasis will be on understanding how these recursions (1) can be used to construct linear least squares forecasts of the state vector on the basis of data observed through time t, and (2) how the resulting prediction errors and their variances can be used to build the components of the likelihood function. In our derivation so far, we have motivated the discussion of the Kalman filter in terms of linear projections of the state vector α_t and the observed time series z_t. These are linear forecasts, and they are optimal among all functions of past observations if we assume that the state vector and the disturbances are multivariate Gaussian. Our main aim is to see how the KF recursions calculate these forecasts recursively, generating a_{1|0}, a_{2|1}, ..., a_{T|T-1} and P_{1|0}, P_{2|1}, ..., P_{T|T-1} in succession.
How do we start the recursions?
To start the recursions, we need a_{1|0}. This means we need a forecast of the first-period state based on some information set. Since we don't have information on the zeroth period, we take the unconditional expectation

a_{1|0} = E(α_1),
where the associated estimation error has zero mean and covariance matrix σ² P_{1|0}.

Let us explain this with the help of an example.

Example 8: Let us take the simplest MA(1) model,

z_t = e_t + θ_1 e_{t-1}.

We have shown before that the state vector is simply

α_t = (z_t   θ_1 e_t)',

and hence

a_{1|0} = E(z_1   θ_1 e_1)' = (0   0)'.

The associated covariance matrix of the estimation error, σ² P_0 or σ² P_{1|0}, is simply E(α_1 α_1'), so that we have

P_{1|0} = σ⁻² E(α_1 α_1') = σ⁻² E[ (z_1   θ_1 e_1)' (z_1   θ_1 e_1) ] = [ 1 + θ_1²   θ_1  ]
                                                                       [ θ_1        θ_1² ]

While one can work out by hand the covariance matrix of the initial state vector for pure MA models, this turns out to be too tedious for higher order mixed models. So we need a closed form solution for this matrix. We get such a solution by generalising the calculation above, which is easy if we make prior assumptions about the distribution of the state vector.
Two categories of state vector can be distinguished, depending on whether or not the state vector is covariance stationary. If it is, the distribution of the state vector is readily available, and with that the problem of starting values is easily resolved. With the assumption that the state vector is covariance stationary, one can check from the state equation that the unconditional mean of the state vector is zero; that is,

E(α_t) = 0,

and the unconditional variance of α_t is

E(α_t α_t') = E[(T α_{t-1} + R η_t)(T α_{t-1} + R η_t)'].

Let us write E(α_t α_t') = σ² Σ. Noting that the state vector depends on shocks only up to time t-1, so that the cross product terms vanish, we get

Σ = T Σ T' + R Q R'.

Though this can be solved in many ways, a direct closed form solution is given by the following matrix result.
We use the vec operator and the following result.
Proposition: Let A, B and C be matrices such that the product ABC exists. Then

vec(ABC) = (C' ⊗ A) vec(B).

Thus, we vectorize both sides of the expression for Σ and rearrange to get the closed form solution

vec(Σ) = [I_{m²} - (T ⊗ T)]⁻¹ vec(R Q R').

What this implies is that, provided the process is covariance stationary, the Kalman filter recursions can be started with a_{1|0} = 0 and with the (m × m) matrix P_{1|0} whose elements, stacked as a column vector, are obtained from

vec(P_{1|0}) = [I_{m²} - (T ⊗ T)]⁻¹ vec(R Q R').
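This starting-value calculation is a one-liner with the vec/Kronecker identity. A short numpy sketch (function name ours) follows; the example values reproduce the P_{1|0} matrix worked out by hand for the MA(1) model in Example 8, with θ_1 = 0.5.

```python
import numpy as np

def initial_state_covariance(T, R, Q):
    """Solve vec(P_{1|0}) = [I_{m^2} - (T kron T)]^{-1} vec(R Q R'), assuming the
    state vector is covariance stationary (eigenvalues of T inside the unit circle)."""
    m = T.shape[0]
    RQR = R @ Q @ R.T
    vecP = np.linalg.solve(np.eye(m * m) - np.kron(T, T),
                           RQR.reshape(-1, order="F"))   # column-stacked vec
    return vecP.reshape(m, m, order="F")

# MA(1) with theta_1 = 0.5: expect [[1 + theta^2, theta], [theta, theta^2]] = [[1.25, 0.5], [0.5, 0.25]]
T = np.array([[0.0, 1.0], [0.0, 0.0]])
R = np.array([[1.0], [0.5]])
Q = np.array([[1.0]])
print(initial_state_covariance(T, R, Q))
```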


The best way to get a grasp of the Kalman recursions is to try them out on a simple model. Let us try them on the simple MA(1) model.

Example 9: Assume for convenience that the process has zero mean, so the MA(1) model can be written as

z_t = e_t + θ_1 e_{t-1}.

Here m = 2, so from Example 4 we have the state and measurement equations:

State equation:

α_t = [ 0  1 ] α_{t-1} + [ 1   ] e_t
      [ 0  0 ]           [ θ_1 ]

Observation equation:

z_t = (1  0) α_t.

Note that the observation equation has no error. How do we start the recursions? Recall from the prediction equations that we first have to get a_{t|t-1}; that is, for the first period we need a_{1|0}, the forecast of the initial state vector. From our discussion of the covariance stationarity of the state vector, it is clear that

a_{1|0} = T a_0 = 0.
Next we have to calculate the covariance matrix of the estimation error, σ² P_{1|0} or σ² P_0. Though we have a formula to calculate such matrices, for the present problem one can find it directly:

P_{1|0} = P_0 = σ⁻² E(α_1 α_1') = σ⁻² E[ (z_1   θ_1 e_1)' (z_1   θ_1 e_1) ] = [ 1 + θ_1²   θ_1  ]
                                                                              [ θ_1        θ_1² ]

Let us calculate the prediction error for z_1. One can easily see that z_{1|0} = 0, and hence the associated prediction error is ν_1 = z_1 itself, and the prediction error variance is given by

var(ν_1) = (1  0) σ² E(α_1 α_1') (1  0)'
         = σ² (1  0) [ 1 + θ_1²   θ_1  ] [ 1 ]
                     [ θ_1        θ_1² ] [ 0 ]
         = σ² (1 + θ_1²),   with f_1 = 1 + θ_1².

10
Application of the updating formulae:

a_1 = a_{1|0} + P_{1|0} y_1 ν_1 / f_1
    = [ 1 + θ_1²   θ_1  ] [ 1 ] z_1 / (1 + θ_1²)
      [ θ_1        θ_1² ] [ 0 ]
    = [ z_1                   ]
      [ θ_1 z_1 / (1 + θ_1²)  ]

Similarly,

P_1 = P_{1|0} - P_{1|0} y_1 y_1' P_{1|0} / f_1
    = [ 1 + θ_1²   θ_1  ] - [ 1 + θ_1²   θ_1               ]
      [ θ_1        θ_1² ]   [ θ_1        θ_1² / (1 + θ_1²) ]
    = [ 0    0                   ]
      [ 0    θ_1⁴ / (1 + θ_1²)   ]
Prediction equations for period 2:

a_{2|1} = T a_1 = [ 0  1 ] [ z_1                  ] = [ θ_1 z_1 / (1 + θ_1²) ]
                  [ 0  0 ] [ θ_1 z_1 / (1 + θ_1²) ]   [ 0                    ]

And

P_{2|1} = T P_1 T' + R Q R'
        = [ θ_1⁴ / (1 + θ_1²)   0 ] + [ 1     θ_1  ]
          [ 0                   0 ]   [ θ_1   θ_1² ]
        = [ (1 + θ_1² + θ_1⁴) / (1 + θ_1²)   θ_1  ]
          [ θ_1                              θ_1² ]
Predicting z_2:

z_{2|1} = (1  0) a_{2|1} = θ_1 z_1 / (1 + θ_1²).

Prediction error ν_2:

ν_2 = z_2 - θ_1 z_1 / (1 + θ_1²),

and

f_2 = (1  0) [ (1 + θ_1² + θ_1⁴) / (1 + θ_1²)   θ_1  ] [ 1 ]
             [ θ_1                              θ_1² ] [ 0 ]
    = (1 + θ_1² + θ_1⁴) / (1 + θ_1²).

These steps show that, for the MA(1) model, one can calculate the prediction error and its variance using the following recursions:

ν_t = z_t - θ_1 ν_{t-1} / f_{t-1},   t = 1, 2, ..., T,   where ν_0 = 0, and

f_t = 1 + θ_1^{2t} / (1 + θ_1² + ... + θ_1^{2(t-1)}).

Note that the expressions for the prediction error ν_t and the prediction error variance f_t are exactly the same as those obtained using triangular factorization for the MA(1) model.
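A small sketch of these MA(1) recursions (function name ours); up to the ν_0 = 0 starting convention, it should reproduce the ν_t and f_t that the general Kalman filter delivers when applied to the MA(1) state space form.

```python
import numpy as np

def ma1_prediction_errors(z, theta):
    """Prediction errors nu_t and scaled variances f_t for an MA(1) model,
    using nu_t = z_t - theta * nu_{t-1} / f_{t-1}   (nu_0 = 0) and
          f_t  = 1 + theta^(2t) / (1 + theta^2 + ... + theta^(2(t-1)))."""
    nu, f = [], []
    nu_prev, f_prev = 0.0, 1.0   # nu_0 = 0 makes the value taken for f_0 irrelevant
    denom = 0.0                  # running sum 1 + theta^2 + ... + theta^(2(t-1))
    for t, zt in enumerate(z, start=1):
        denom += theta ** (2 * (t - 1))
        f_t = 1.0 + theta ** (2 * t) / denom
        nu_t = zt - theta * nu_prev / f_prev
        nu.append(nu_t); f.append(f_t)
        nu_prev, f_prev = nu_t, f_t
    return np.array(nu), np.array(f)
```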
-
As a last step towards finalising the likelihood function, we note the following further simplification. Recall that we decomposed the likelihood for a set of dependent observations into a likelihood for the independent prediction errors, using the concept of the prediction error decomposition:

log L(z) = -(T/2) log 2π - (T/2) log σ² - (1/2) Σ_{t=1}^{T} log f_t - (1/(2σ²)) Σ_{t=1}^{T} ν_t² / f_t.

From our derivation we can see that ν_t and f_t do not depend on σ², and hence we can concentrate σ² out. This means we differentiate the log-likelihood with respect to σ² and obtain an estimator of σ², say σ̂²:

σ̂² = (1/T) Σ_{t=1}^{T} ν_t² / f_t.

Evaluating the log-likelihood at σ² = σ̂² and simplifying, we get the concentrated log-likelihood

log L(z)_c = -(T/2)(log 2π + 1) - (1/2) Σ_{t=1}^{T} log f_t - (T/2) log σ̂².

We either maximize this concentrated log-likelihood or, equivalently, minimize

Σ_{t=1}^{T} log f_t + T log σ̂².

One can make an initial guess for the underlying parameters and either apply numerical procedures to calculate the derivatives or calculate them analytically by differentiating the Kalman recursions. In either case one has to keep in mind the restrictions to be imposed on the parameters, especially the MA parameters, to take care of the identification problem.
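As an illustration of how this fits into a numerical optimisation, the sketch below (function name ours) computes σ̂² and the criterion Σ log f_t + T log σ̂² from the filter output; one would minimise this over the ARMA parameters, regenerating ν_t and f_t at each trial parameter value, subject to the restrictions just mentioned.

```python
import numpy as np

def concentrated_criterion(nu, f):
    """Return (criterion, sigma2_hat), where sigma2_hat = mean(nu_t^2 / f_t)
    and criterion = sum(log f_t) + T * log(sigma2_hat), to be minimised
    over the model parameters that generated nu_t and f_t."""
    nu = np.asarray(nu, dtype=float)
    f = np.asarray(f, dtype=float)
    sigma2_hat = np.mean(nu ** 2 / f)
    criterion = np.sum(np.log(f)) + nu.size * np.log(sigma2_hat)
    return criterion, sigma2_hat

# Typical use: theta -> Kalman filter -> (nu, f) -> concentrated_criterion(nu, f)[0]
```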
-
Kalman Smoothing
We have motivated the discussion of the Kalman filter so far as an algorithm for predicting the state vector, obtaining exact finite sample forecasts as a linear function of past observations. We have also shown how the resulting prediction errors and prediction error variances can be used to evaluate the log-likelihood.
This is sub-optimal if we are interested in estimating the sequence of states. In many cases, the Kalman filter is used to obtain an estimate of the state vector itself. For example, in their model of the business cycle, Stock and Watson show how one may be interested in knowing the state of the economy, or the phase of the business cycle the economy is in, which is unobservable at any given historical point. Stock and Watson suggest that comovements in many macro aggregates have a common element, which may be called the state of the economy, and this common element is unobservable. They motivate the use of the Kalman filter to obtain an estimate of this unobserved state of the economy.

Sometimes elements of the state vector are even interpreted as estimates of missing observations, which could be higher frequency data points interpolated from an observable lower frequency series, or simply an estimate of a missing data point. For example, if we have data on a macro aggregate from 1955 through 2014, we may be interested in obtaining an estimate for 1970, which may be missing. Or we may be interested in extracting monthly data from quarterly data.
Such estimates of the unobserved state of the economy or of missing observations can be obtained from smoothed estimates of the state vector, α_t.
There are basically three forms of smoothing for a linear model. Fixed point smoothing computes smoothed estimates of the state vector at some fixed point in time; that is, we obtain a_{τ|t} for a particular value of τ at all time periods t > τ. Fixed lag smoothing computes smoothed estimates for a fixed delay, that is, a_{t-j|t} for j = 1, ..., M, where M is some maximum lag. Fixed interval smoothing computes smoothed estimates of the entire state vector for a fixed span of data. Of these, the fixed point and fixed interval smoothing techniques are the more often used. All three techniques are closely linked to the Kalman filter.
We shall discuss only the fixed interval smoother, which is the most popular.

Fixed interval smoother

Each step of the Kalman recursions gives an estimate of the state vector, α_t, given all current and past observations. But an econometrician should use all available information to estimate the sequence of states. The Kalman smoother provides these estimates. The smoothed estimator which utilises all the sample observations is

a_{t|T} = E(α_t | I_T),

and the MSE matrix of this smoothed estimate is denoted

P_{t|T} = E[(α_t - a_{t|T})(α_t - a_{t|T})'].

The smoothing equations start from a_{T|T} and P_{T|T} and work backwards. The expressions for a_{t|T} and P_{t|T}, which may be called the smoothing algorithm, are given below without proof:

a_{t|T} = a_t + P_t* (a_{t+1|T} - T_{t+1} a_t),
P_{t|T} = P_t + P_t* (P_{t+1|T} - P_{t+1|t}) P_t*',

where

P_t* = P_t T_{t+1}' P_{t+1|t}⁻¹,   t = T-1, ..., 1,

with a_{T|T} = a_T and P_{T|T} = P_T.

A set of direct residuals can also be obtained from the smoothed estimators:

e_t = z_t - y_t' a_{t|T},   t = 1, ..., T.

These are not to be confused with the prediction residuals, ν_t, defined earlier.
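A minimal sketch of the backward pass (function name and input conventions are ours): it takes the filtered estimates a_t, P_t and the one-step-ahead quantities a_{t|t-1}, P_{t|t-1} stored during the forward Kalman pass, and applies the smoothing algorithm above for a time-invariant transition matrix T.

```python
import numpy as np

def fixed_interval_smoother(a_filt, P_filt, a_pred, P_pred, T):
    """a_filt[t], P_filt[t] : filtered a_t, P_t for t = 0, ..., n-1
    a_pred[t], P_pred[t] : predicted a_{t|t-1}, P_{t|t-1} for the same periods.
    Returns the smoothed a_{t|T} and P_{t|T}."""
    n = len(a_filt)
    a_sm, P_sm = [None] * n, [None] * n
    a_sm[-1], P_sm[-1] = a_filt[-1], P_filt[-1]     # a_{T|T} = a_T, P_{T|T} = P_T
    for t in range(n - 2, -1, -1):
        # P_t* = P_t T' P_{t+1|t}^{-1}
        Pstar = P_filt[t] @ T.T @ np.linalg.inv(P_pred[t + 1])
        a_sm[t] = a_filt[t] + Pstar @ (a_sm[t + 1] - T @ a_filt[t])
        P_sm[t] = P_filt[t] + Pstar @ (P_sm[t + 1] - P_pred[t + 1]) @ Pstar.T
    return a_sm, P_sm
```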

We shall explain the smoothing algorithm with an example. Consider the simple model

z_t = α_t + ε_t,   ε_t ~ WN(0, σ²),
α_t = α_{t-1} + η_t,   η_t ~ WN(0, σ² q),

where the state, α_t, and the observation, z_t, are scalars. The state, which follows a random walk, cannot be observed directly as it is contaminated by noise. This is the simple signal plus noise model. We assume that q is known. Also note that in this example we have allowed the observation z_t to be measured with error, ε_t. For this example, note that T = 1, R = 1 and y_t' = 1.
The prediction equations for this example are

a_{t|t-1} = a_{t-1},   P_{t|t-1} = P_{t-1} + q,

and the updating equations are

a_t = a_{t|t-1} + P_{t|t-1}(z_t - a_{t|t-1})/(P_{t|t-1} + 1)

and

P_t = P_{t|t-1} - P_{t|t-1}²/(P_{t|t-1} + 1).

We shall demonstrate how to predict, update and smooth with 4 observations: z_1 = 4.4, z_2 = 4, z_3 = 3.5 and z_4 = 4.6. The initial state has the property α_0 ~ N(a_0, σ² P_0), and we are given a_0 = 4, P_0 = 12 and q = 4, so that RQR' = 4 and h = 1.

From the prediction equations we have a_{1|0} = 4 and P_{1|0} = 12 + 4 = 16, so that from the updating equations we have

a_1 = 4 + (12 + 4)(4.4 - 4)/(12 + 4 + 1) = 4.376

and

P_1 = 16 - 16²/17 = 0.941.

Since y_t' = 1 in the measurement equation for all t, the MMSE estimator of z_t is always a_{t|t-1}. So z_{2|1} = a_{2|1} = a_1 = 4.376.
Repeating the calculations for t = 2, 3 and 4, we get the following results:

Smoothed estimators and residuals

t          1        2        3        4
z_t        4.4      4.0      3.5      4.6
a_t        4.376    4.063    3.597    4.428
P_t        0.941    0.832    0.829    0.828
ν_t        0.400    -0.376   -0.563   1.003
a_{t|T}    4.306    4.007    3.739    4.428
P_{t|T}    0.785    0.710    0.711    0.828
e_t        0.094    0.007    -0.239   0.172

From the above table we also have: a_{2|1} = 4.376, P_{2|1} = 4.941, a_{3|2} = 4.063, P_{3|2} = 4.832, a_{4|3} = 3.597 and P_{4|3} = 4.829.

From the table, the final filtered estimates are seen to be a_4 = 4.428 and P_4 = 0.828. These values can now be used in the smoothing algorithm, which for the current example reduces to

a_{t|T} = a_t + (P_t/P_{t+1|t})(a_{t+1|T} - a_t),
P_{t|T} = P_t + (P_t/P_{t+1|t})²(P_{t+1|T} - P_{t+1|t}),   t = T-1, ..., 1.

Since a_{4|4} = a_4 and P_{4|4} = P_4, we can apply the smoothing algorithm to obtain the smoothed estimates a_{3|4} and P_{3|4} and then work backwards. So we have

a_{3|4} = 3.597 + (0.829/4.829)(4.428 - 3.597) = 3.739,
P_{3|4} = 0.829 + (0.829/4.829)²(0.828 - 4.829) = 0.711.

The rest of the smoothed estimates are displayed in the table above; the smoothed estimates of the unobserved state are given by the row a_{t|T}.
Both the direct and the prediction error residuals have been calculated using the formulae e_t = z_t - a_{t|T} and ν_t = z_t - a_{t-1} respectively.
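The whole worked example can be reproduced in a few lines. The sketch below (variable names ours) runs the scalar prediction, updating and fixed-interval smoothing recursions on the four observations and prints values matching the table above up to rounding.

```python
import numpy as np

z = np.array([4.4, 4.0, 3.5, 4.6])
a0, P0, q, h = 4.0, 12.0, 4.0, 1.0
n = len(z)

a = np.zeros(n); P = np.zeros(n)              # filtered a_t, P_t
a_pred = np.zeros(n); P_pred = np.zeros(n)    # a_{t|t-1}, P_{t|t-1}
nu = np.zeros(n)

a_prev, P_prev = a0, P0
for t in range(n):
    a_pred[t], P_pred[t] = a_prev, P_prev + q            # prediction
    nu[t] = z[t] - a_pred[t]                              # prediction error
    f_t = P_pred[t] + h                                   # its scaled variance
    a[t] = a_pred[t] + P_pred[t] * nu[t] / f_t            # updating
    P[t] = P_pred[t] - P_pred[t] ** 2 / f_t
    a_prev, P_prev = a[t], P[t]

# Fixed-interval smoothing: backward pass
a_sm, P_sm = a.copy(), P.copy()
for t in range(n - 2, -1, -1):
    gain = P[t] / P_pred[t + 1]                           # P_t / P_{t+1|t}
    a_sm[t] = a[t] + gain * (a_sm[t + 1] - a[t])
    P_sm[t] = P[t] + gain ** 2 * (P_sm[t + 1] - P_pred[t + 1])

e = z - a_sm                                              # direct residuals
for name, vals in [("a_t", a), ("P_t", P), ("nu_t", nu),
                   ("a_t|T", a_sm), ("P_t|T", P_sm), ("e_t", e)]:
    print(name, np.round(vals, 3))
```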
