Stochastic Claims Reserving Methods
in Non-Life Insurance

Mario V. Wüthrich (1)                    Michael Merz (2)
Department of Mathematics                Faculty of Economics
ETH Zürich                               University Tübingen

Version 1.1

(1) ETH Zürich, CH-8092 Zürich, Switzerland.
(2) University Tübingen, D-72074 Tübingen, Germany.

© 2006 (M. Wüthrich, ETH Zürich & M. Merz, Uni Tübingen)
Contents

1 Introduction and Notation . . . . . . . . . . . . . . . . . . . . . . . 7
  1.1 Claims process . . . . . . . . . . . . . . . . . . . . . . . . . . 7
    1.1.1 Accounting principle and accident year . . . . . . . . . . . . 9
    1.1.2 Inflation . . . . . . . . . . . . . . . . . . . . . . . . . . 10
  1.2 Structural framework to the claims reserving problem . . . . . . . 12
    1.2.1 Fundamental properties of the reserving process . . . . . . . 13
    1.2.2 Known and unknown claims . . . . . . . . . . . . . . . . . . . 15
  1.3 Outstanding loss liabilities, classical notation . . . . . . . . . 16
  1.4 General Remarks . . . . . . . . . . . . . . . . . . . . . . . . . 18

2 Basic Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
  2.1 Chain-ladder model (distribution free model) . . . . . . . . . . . 21
  2.2 The Bornhuetter-Ferguson method . . . . . . . . . . . . . . . . . 27
  2.3 Number of IBNyR claims, Poisson model . . . . . . . . . . . . . . 30
    2.3.1 Poisson derivation of the chain-ladder model . . . . . . . . . 34

3 Chain-ladder models . . . . . . . . . . . . . . . . . . . . . . . . . 39
  3.1 Mean square error of prediction . . . . . . . . . . . . . . . . . 39
  3.2 Chain-ladder method . . . . . . . . . . . . . . . . . . . . . . . 41
    3.2.1 The Mack model . . . . . . . . . . . . . . . . . . . . . . . . 42
    3.2.2 Conditional process variance . . . . . . . . . . . . . . . . . 46
    3.2.3 Estimation error for single accident years . . . . . . . . . . 48
    3.2.4 Conditional MSEP in the chain-ladder model for aggregated
          accident years . . . . . . . . . . . . . . . . . . . . . . . . 59
  3.3 Analysis of error terms . . . . . . . . . . . . . . . . . . . . . 62
    3.3.1 Classical chain-ladder model . . . . . . . . . . . . . . . . . 63
    3.3.2 Enhanced chain-ladder model . . . . . . . . . . . . . . . . . 64
    3.3.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . 65
    3.3.4 Chain-ladder estimator in the enhanced model . . . . . . . . . 66
    3.3.5 Conditional process and prediction errors . . . . . . . . . . 67
    3.3.6 (section title lost in extraction) . . . . . . . . . . . . . . 68
    3.3.7 (section title lost in extraction) . . . . . . . . . . . . . . 75

4 Bayesian models . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
  4.1 Introduction to credibility claims reserving methods . . . . . . . 85
    4.1.1 Benktander-Hovinen method . . . . . . . . . . . . . . . . . . 86
    4.1.2 Minimizing quadratic loss functions . . . . . . . . . . . . . 89
    4.1.3 Cape-Cod Model . . . . . . . . . . . . . . . . . . . . . . . . 92
    4.1.4 A distributional example to credible claims reserving . . . . 95
  4.2 Exact Bayesian models . . . . . . . . . . . . . . . . . . . . . . 99
    4.2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 99
    4.2.2 Log-normal/Log-normal model . . . . . . . . . . . . . . . . . 101
    4.2.3 Overdispersed Poisson model with gamma a priori distribution . 108
    4.2.4 Exponential dispersion family with its associate conjugates . 116
    4.2.5 Poisson-gamma case, revisited . . . . . . . . . . . . . . . . 125
  4.3 Bühlmann-Straub Credibility Model . . . . . . . . . . . . . . . . 126
    4.3.1 Parameter estimation . . . . . . . . . . . . . . . . . . . . . 132
  4.4 Multidimensional credibility models . . . . . . . . . . . . . . . 136
    4.4.1 Hachemeister regression model . . . . . . . . . . . . . . . . 137
    4.4.2 Other credibility models . . . . . . . . . . . . . . . . . . . 140
  4.5 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . 142

5 Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

A Unallocated loss adjustment expenses . . . . . . . . . . . . . . . . . 151
  A.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
  A.2 Pure claims payments . . . . . . . . . . . . . . . . . . . . . . . 152
  A.3 ULAE charges . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
  A.4 New York-method . . . . . . . . . . . . . . . . . . . . . . . . . 153
  A.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

B Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
  B.1 Discrete distributions . . . . . . . . . . . . . . . . . . . . . . 159
    B.1.1 Binomial distribution . . . . . . . . . . . . . . . . . . . . 159
    B.1.2 Poisson distribution . . . . . . . . . . . . . . . . . . . . . 159
    B.1.3 Negative binomial distribution . . . . . . . . . . . . . . . . 160
  B.2 Continuous distributions . . . . . . . . . . . . . . . . . . . . . 160
    B.2.1 Normal distribution . . . . . . . . . . . . . . . . . . . . . 160
    B.2.2 Log-normal distribution . . . . . . . . . . . . . . . . . . . 160
    B.2.3 Gamma distribution . . . . . . . . . . . . . . . . . . . . . . 161
Chapter 1

Introduction and Notation

1.1 Claims process
A non-life insurance policy is a contract between two parties, the insurer and the insured. The insured pays the insurer a fixed amount of money (the premium); in return, the insurer provides the insured financial coverage against the random occurrence of well-specified events (or at least a promise that the insured receives a well-defined amount in case such an event happens). The right of the insured to these amounts (in case the event happens) constitutes a claim by the insured on the insurer.
The amount which the insurer is obliged to pay in respect of a claim is known as the claim amount or loss amount. The payments which make up this claim are known as claims payments, loss payments, paid claims, or paid losses.
The history of a typical claim may look as follows:

[Figure: timeline of a typical claim over time — accident date, reporting date, claims payments, claims closing, reopening, further payments, claims closing.]
2. After the reporting it can take several years until a claim is finally settled. In property insurance we usually have a rather fast settlement, whereas for liability or bodily injury claims it often takes a lot of time until the total extent of a claim is clear and known (and the claim can be settled).

3. It can also happen that a closed claim needs to be reopened due to (unexpected) new developments, or in case a relapse happens.
1.1.1 Accounting principle and accident year
There are different premium accounting principles: i) premium booked, ii) premium written, iii) premium earned. Which principle should be chosen depends on the kind of business written. W.l.o.g. we concentrate in the present manuscript on the premium earned principle:
Usually an insurance company closes its books at least once a year. Let us assume that we always close our books on December 31. How should we account for a one-year contract written on October 1, 2006, with two premium installments paid on October 1, 2006 and April 1, 2007?
We assume that
premium written 2006 = 100,
premium booked 2006 = 50 (= premium received in 2006),
pipeline premium 31.12.2006 = 50 (= premium which will be received in
2007), which gives premium booked 2007 = 50.
If we assume that the risk exposure is distributed uniformly over time (pro rata
temporis), this implies that
premium earned 2006 = 25 (= premium used for exposure in 2006),
unearned premium reserve UPR 31.12.2006 = 75 (= premium which will be
used for exposure in 2007), which gives premium earned 2007 = 75.
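The pro rata temporis split above can be sketched in a few lines of code. This is a minimal illustration only; the function name and the day-count granularity are our own assumptions (the text works with months):

```python
from datetime import date

def earned_fraction(start: date, end: date, valuation: date) -> float:
    """Fraction of the contract period elapsed at the valuation date (pro rata temporis)."""
    total = (end - start).days
    elapsed = min(max((valuation - start).days, 0), total)
    return elapsed / total

# One-year contract written on October 1, 2006, premium written = 100:
premium_written = 100.0
frac = earned_fraction(date(2006, 10, 1), date(2007, 10, 1), date(2006, 12, 31))
premium_earned_2006 = premium_written * frac       # roughly 25 (3 of 12 months)
upr_2006 = premium_written - premium_earned_2006   # roughly 75
print(round(premium_earned_2006), round(upr_2006))
```

With a non-uniform exposure pattern one would replace `earned_fraction` accordingly, which changes the split between accounting years exactly as described above.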
If the exposure is not pro rata temporis, then of course we have a different split
of the premium earned into the different accounting years. In order to have a
consistent financial statement it is now important that the accident date and the
premium accounting principle are compatible (via the exposure pattern). Hence all
claims which have accident year 2006 have to be matched to the premium earned
2006, i.e. the claims 2006 have to be paid by the premium earned 2006, whereas
the claims with accident year later than 2006 have to be paid by the unearned
premium reserve UPR 31.12.2006.
Hence, on the one hand we have to build premium reserves for future exposures, but on the other hand we also need to build claims reserves for unsettled claims of past exposures. There are two different types of claims reserves for past exposures:
1. IBNyR reserves (incurred but not yet reported): We need to build claims reserves for claims which have occurred before 31.12.2006, but which have not been reported by the end of the year (i.e. the reporting delay lapses into the next accounting years).
2. IBNeR reserves (incurred but not enough reported): We need to build claims
reserves for claims which have been reported before 31.12.2006, but which
have not been settled yet, i.e. we still expect payments in the future, which
need to be financed by the already earned premium.
Example 1.1 (Reporting delay)

[Table 1.1 shows a claims development triangle for the number of IBNyR cases for accident years 0-17; only some columns survived extraction. The counts in development year 0 range from 368 to 674, while in the late development years almost all entries are 0 or 1.]

Table 1.1: claims development triangle for number of IBNyR cases (source [75])
1.1.2 Inflation
In the LoB motor hull, inflation is driven by the technical complexity of car repairing techniques. The essential point is that claims inflation may continue beyond the occurrence date of the accident, up to the time of the final payments/settlement.
If $X_{t_i}$ denote the positive single claims payments at time $t_i$, expressed in money value at time $t_1$, then the total claim amount, in money value at time $t_1$, is given by
$$C_1 = \sum_{i \ge 1} X_{t_i}. \qquad (1.1)$$
If $\Lambda(\cdot)$ denotes the index which measures the claims inflation, the actual (nominal) claim amount is
$$C = \sum_{i \ge 1} \frac{\Lambda(t_i)}{\Lambda(t_1)}\, X_{t_i}. \qquad (1.2)$$
Whenever $\Lambda$ is an increasing function we observe that $C$ is bigger than $C_1$. Of course, in practice we only observe the unindexed payments $X_{t_i}\, \Lambda(t_i)/\Lambda(t_1)$, and in general it is difficult to estimate an index function such that we obtain indexed values $X_{t_i}$. Finding an index function $\Lambda(\cdot)$ is equivalent to defining appropriate deflators, which is a well-known concept in market consistent actuarial valuation, see e.g. Wüthrich-Bühlmann-Furrer [91].
The basic idea behind indexed values $C_1$ is that, if two sets of payments relate to identical circumstances except that there is a time translation in the payments, their indexed values will be the same, whereas the unindexed values are not: For $c > 0$ we assume that
$$\widetilde X_{t_i + c} = X_{t_i}. \qquad (1.3)$$
For increasing $\Lambda$ we have that
$$\widetilde C_1 = \sum_{i \ge 1} \widetilde X_{t_i + c} = \sum_{i \ge 1} X_{t_i} = C_1, \qquad (1.4)$$
$$\widetilde C = \sum_{i \ge 1} \frac{\Lambda(t_i + c)}{\Lambda(t_1)}\, \widetilde X_{t_i + c} = \sum_{i \ge 1} \frac{\Lambda(t_i + c)}{\Lambda(t_1)}\, X_{t_i} > C, \qquad (1.5)$$
whenever $\Lambda$ is an increasing function (we have assumed (1.3)). This means that the unindexed values differ by the factor $\Lambda(t_i + c)/\Lambda(t_i)$. However, in practice this ratio often turns out to be of a different form, namely
$$\frac{\Lambda(t_i + c)}{\Lambda(t_i)} = 1 + \alpha(t_i, t_i + c), \qquad (1.6)$$
meaning that over the time interval $[t_i, t_i + c]$ claim costs are inflated by an additional term $\alpha(t_i, t_i + c)$.
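The effect of an increasing index on (1.1)-(1.5) can be checked numerically. A minimal sketch, assuming a constant annual inflation rate (the concrete index function below is our own toy choice):

```python
def Lambda(t: float, rate: float = 0.05) -> float:
    """Toy claims inflation index with a constant annual rate."""
    return (1.0 + rate) ** t

t1 = 1.0
payments = [(1.0, 100.0), (2.0, 100.0), (3.0, 100.0)]  # (t_i, X_{t_i}) in money value at t1

C1 = sum(x for _, x in payments)                          # indexed total, cf. (1.1)
C = sum(Lambda(t) / Lambda(t1) * x for t, x in payments)  # nominal total, cf. (1.2)
assert C > C1  # an increasing index inflates the nominal amount

# A time translation by c > 0 leaves the indexed value unchanged, not the nominal one:
c = 2.0
C_shift = sum(Lambda(t + c) / Lambda(t1) * x for t, x in payments)  # cf. (1.5)
assert C_shift > C
print(round(C1, 2), round(C, 2), round(C_shift, 2))
```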
1.2 Structural framework to the claims reserving problem
Moreover $C_i(t) = 0$ for $t < T_i$. The total ultimate claim amount is given by
$$C_i(\infty) = C_i(T_{i,N_i}) = \sum_{j \ge 0} X_{i,j}. \qquad (1.10)$$
The total claims reserves for claim $i$ at time $t$ for the future liabilities (outstanding claim at time $t$) are given by
$$R_i(t) = C_i(\infty) - C_i(t) = \sum_{j:\, T_{i,j} > t} X_{i,j}. \qquad (1.11)$$
$$C(t) = \sum_{i=1}^{N} C_i(t), \qquad R(t) = \sum_{i=1}^{N} R_i(t). \qquad (1.12)$$
$C(t)$ denotes all payments up to time $t$ for all $N$ claims, and $R(t)$ denotes the outstanding claims payments (reserves) at time $t$ for these $N$ claims.
We consider now claims reserving as a prediction problem. Let
$$\mathcal{F}_t^N = \sigma\big\{ (T_{i,j}, I_{i,j}, X_{i,j})_{i \ge 1,\, j \ge 0} :\ T_{i,j} \le t \big\} \qquad (1.13)$$
be the information available at time $t$. This $\sigma$-field is obtained from the information available at time $t$ from the claims settlement processes. Often there is additional exogenous information $\mathcal{E}_t$ at time $t$ (change of legal practice, high inflation, job market information, etc.). Therefore we define the information which the insurance company has at time $t$ by
$$\mathcal{F}_t = \mathcal{F}_t^N \vee \mathcal{E}_t. \qquad (1.14)$$
Problem. Estimate the conditional distributions
$$P\left[\, C(\infty) \in \cdot \mid \mathcal{F}_t \,\right], \qquad (1.15)$$
$$M_t = E\left[\, C(\infty) \mid \mathcal{F}_t \,\right], \qquad (1.16)$$
$$V_t = \operatorname{Var}\left( C(\infty) \mid \mathcal{F}_t \right). \qquad (1.17)$$
1.2.1 Fundamental properties of the reserving process
Because of
$$C(\infty) = C(t) + R(t), \qquad (1.18)$$
we have that [equations (1.19)-(1.23) were garbled in extraction; the surviving fragments show that they hold almost surely and end with $\ldots = \operatorname{Var}(C(\infty) \mid \mathcal{F}_s) = V_s$]. □
Consider $u > t$ and define the increment from $t$ to $u$ by
$$M(t, u) = M_u - M_t. \qquad (1.24)$$
For $v > u$ we obtain
$$E\left[\, M(t,u)\, M(u,v) \mid \mathcal{F}_t \,\right] = E\left[\, M(t,u)\, \big( E\left[ C(\infty) \mid \mathcal{F}_u \right] - M_u \big) \mid \mathcal{F}_t \,\right] = 0. \qquad (1.25)$$
This implies that $M(t,u)$ and $M(u,v)$ are uncorrelated, which is the well-known property that martingales have uncorrelated increments.
First approach to the claims reserving problem. Use the martingale integral representation. This leads to the innovation gains process, which determines $M_t$ when updating $\mathcal{F}_t$.
+ This theory is well-understood.
- One has little idea about the updating process.
- One has (statistically) not enough data.
Second approach to the claims reserving problem. For $t < u$ we have $\mathcal{F}_t \subset \mathcal{F}_u$. Since $M_t$ is an $\mathcal{F}_t$-martingale we have that
$$E\left[\, M(t,u) \mid \mathcal{F}_t \,\right] = 0 \quad \text{a.s.} \qquad (1.26)$$
[Equations (1.27)-(1.30) were lost in extraction.]

1.2.2 Known and unknown claims
As in Subsection 1.1.1 we define IBNyR (incurred but not yet reported) claims and reported claims. The following process counts the number of reported claims,
$$N_t = \sum_{i \ge 1} 1_{\{T_i \le t\}}. \qquad (1.31)$$
Hence we can split the ultimate claim and the reserves at time $t$ according to whether a claim is reported or IBNyR:
$$R(t) = \sum_{i} R_i(t)\, 1_{\{T_i \le t\}} + \sum_{i} R_i(t)\, 1_{\{T_i > t\}}, \qquad (1.32)$$
where the first sum $\sum_i R_i(t)\, 1_{\{T_i \le t\}}$ collects the reserves for reported claims (1.33) and the second sum $\sum_i R_i(t)\, 1_{\{T_i > t\}}$ those for IBNyR claims (1.34).
Hence we define
$$R_t^{rep} = E\Big[\, \sum_i R_i(t)\, 1_{\{T_i \le t\}} \,\Big|\, \mathcal{F}_t \Big] = E\Big[\, \sum_{i=1}^{N_t} R_i(t) \,\Big|\, \mathcal{F}_t \Big], \qquad (1.35)$$
$$R_t^{IBNyR} = E\Big[\, \sum_i R_i(t)\, 1_{\{T_i > t\}} \,\Big|\, \mathcal{F}_t \Big], \qquad (1.36)$$
that is
$$R_t^{rep} = E\Big[\, \sum_{i \le N_t} R_i(t) \,\Big|\, \mathcal{F}_t \Big], \qquad (1.37)$$
$$R_t^{IBNyR} = E\Big[\, \sum_{i=N_t+1}^{N} R_i(t) \,\Big|\, \mathcal{F}_t \Big]. \qquad (1.38)$$
$R_t^{rep}$ denotes the expected future payments at time $t$ for reported claims. This is often called the best estimate reserves at time $t$ for reported claims. $R_t^{IBNyR}$ denotes the expected future payments at time $t$ for IBNyR claims (or the best estimate reserves for IBNyR claims).
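The bookkeeping behind the split (1.31)-(1.32) can be illustrated with hypothetical numbers; in practice the IBNyR part is of course not observable at time $t$, which is why (1.37)-(1.38) are expectations that have to be estimated:

```python
# Hypothetical claims: (reporting time T_i, outstanding amount R_i(t)) at time t = 5
claims = [(1.0, 200.0), (3.0, 150.0), (4.5, 80.0), (6.0, 120.0), (7.5, 60.0)]
t = 5.0

N_t = sum(1 for T, _ in claims if T <= t)     # number of reported claims, cf. (1.31)
R_rep = sum(r for T, r in claims if T <= t)   # reserves for reported claims
R_ibnyr = sum(r for T, r in claims if T > t)  # reserves for IBNyR claims
assert R_rep + R_ibnyr == sum(r for _, r in claims)  # the split (1.32) is exhaustive
print(N_t, R_rep, R_ibnyr)
```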
Conclusions. (1.37)-(1.38) show that the reserves for reported claims and the reserves for IBNyR claims are of rather different nature:
i) The reserves for reported claims should be determined individually, i.e. on a single claims basis. Often one has quite a lot of information on reported claims (e.g. case estimates), which calls for an estimate on a single claims basis.
ii) The reserves for IBNyR claims cannot be decoupled, due to the fact that $N$ is not known at time $t$ (see (1.36)). Moreover, we have no information on a single claims basis. This shows that IBNyR reserves should be determined on a collective basis.
Unfortunately, most of the classical claims reserving methods do not distinguish reported claims from IBNyR claims, i.e. they estimate the claims reserves on both classes at the same time. In that context, we have to slightly disappoint the reader, because most of the methods presented in this manuscript do not make this distinction either.
1.3 Outstanding loss liabilities, classical notation

In this subsection we introduce the classical claims reserving notation and terminology. In most cases outstanding loss liabilities are estimated in so-called claims development triangles, which separate claims on two time axes.
[Equations (1.39)-(1.40) were lost in extraction.]
For illustrative purposes we assume that $X_{i,j}$ denotes all payments in development period $j$ for claims with accident year $i$, i.e. this corresponds to the incremental claims payments for claims with accident year $i$ made in accounting year $i + j$. Below we will see which other meanings $X_{i,j}$ can have.
In a claims development triangle, accident years are usually on the vertical axis whereas development periods are on the horizontal axis (see also Table 1.1). Usually the loss development tables split into two parts: the upper triangle/trapezoid, where we have observations, and the lower triangle, where we want to estimate the outstanding payments. On the diagonals we always see the accounting years.
Hence the claims data have the following structure:

[Schematic of the claims development triangle: accident years $i = 0, \ldots, I$ on the vertical axis and development years $j = 0, \ldots, J$ on the horizontal axis; the upper triangle contains the observations, the lower triangle the outstanding payments to be estimated.]

Cumulative payments are given by
$$C_{i,j} = \sum_{k=0}^{j} X_{i,k}. \qquad (1.41)$$
The incremental data $X_{i,j}$ may denote the incremental payments in cell $(i,j)$, the number of reported claims with reporting delay $j$ and accident year $i$, or the change of the reported claim amount in cell $(i,j)$. For cumulative data $C_{i,j}$ we often use the terminology cumulative payments, total number of reported claims, or claims incurred (for cumulative reported claims). $C_{i,J}$ is often called the ultimate claim amount/load of accident year $i$, or the total number of claims in year $i$.
The payments in accounting year $k$ are given by
$$X_k = \sum_{i+j=k} X_{i,j}, \qquad (1.42)$$
and the outstanding claims payments for accident year $i$ after development year $j$ by
$$R_{i,j} = \sum_{k=j+1}^{J} X_{i,k}. \qquad (1.43)$$
$R_{i,j}$ are also called claims reserves; this is essentially the amount we have to estimate (lower triangle), so that together with the past payments $C_{i,j}$ we obtain the whole claims load (ultimate claim) for accident year $i$.
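The relations (1.41)-(1.43) between incremental data, cumulative data and reserves translate directly into code. A small sketch with hypothetical numbers (here the full square is assumed known, which in practice it is not):

```python
# Incremental payments X[i][j] for accident years i and development years j
X = [[50, 30, 10],
     [60, 25, 15],
     [55, 35,  5]]
n = len(X)

# Cumulative payments C_{i,j} = sum_{k <= j} X_{i,k}, cf. (1.41)
C = [[sum(row[: j + 1]) for j in range(len(row))] for row in X]

# Payments in accounting year k: X_k = sum_{i+j=k} X_{i,j}, cf. (1.42)
X_acc = [sum(X[i][j] for i in range(n) for j in range(n) if i + j == k)
         for k in range(2 * n - 1)]

# Outstanding payments after development year j: R_{i,j} = sum_{k > j} X_{i,k}, cf. (1.43)
R = [[sum(row[j + 1:]) for j in range(len(row))] for row in X]

print(C[0])    # cumulative payments of accident year 0
print(X_acc)   # diagonal (accounting year) sums
print(R[0])    # outstanding payments of accident year 0
```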
1.4 General Remarks
If we consider loss reserving models, i.e. models which estimate the total claim amount, there are always several possibilities to do so:
- Cumulative or incremental data
- Payments or claims incurred data
- Split small and large claims
- Indexed or unindexed data
- Number of claims and claims averages
- Etc.
Usually, different methods and differently aggregated data sets lead to very different results. Only an experienced reserving actuary is able to tell which is an accurate/good estimate for the future liabilities of a specific data set. Often there are many phenomena in the data which first need to be understood before applying a method (we cannot simply project the past to the future by applying one model).
With this in mind we describe different methods, but only practical experience will tell you which method should be applied in which situation. That is, the focus of this manuscript lies on the mathematical description of stochastic models. We derive various properties of these models. The question of an appropriate model choice for a specific data set is not treated here. Indeed, this is probably one of the most difficult questions. Moreover, there is only very little literature on this topic; e.g., for the chain-ladder method certain aspects are considered in Barnett-Zehnwirth [7] and Venter [77].
Remark on claims figures.
When we speak about claims development triangles (paid or incurred data), these usually contain loss adjustment expenses which can be allocated/attributed to single claims (and therefore are contained in the claims figures). Such expenses are called allocated loss adjustment expenses (ALAE). These are typically expenses for external lawyers, external expertise, etc. Internal loss adjustment expenses (income of the claims handling department, maintenance of the claims handling system, management fees, etc.) are typically not contained in the claims figures and therefore have to be estimated separately. These costs are called unallocated loss adjustment expenses (ULAE). Below, in the appendix, we describe the New York-method (paid-to-paid method), which serves to estimate ULAE. The New York-method is a rather rough method which only works well in stationary situations. Therefore one could think of more sophisticated methods. However, since ULAE are usually rather small compared to the other claims payments, the New York-method is often sufficient in practical applications.
Chapter 2
Basic Methods
We start the general discussion on claims reserving with three standard methods:
1. Chain-ladder method
2. Bornhuetter-Ferguson method
3. Poisson model for claim counts
On the one hand, this short chapter has illustrative purposes, giving some ideas of how one can tackle the problem; it presents the two easiest methods (chain-ladder and Bornhuetter-Ferguson). On the other hand, one should realize that in practice these are the methods which are used most often (due to their simplicity). The chain-ladder method will be discussed in detail in Chapter 3, the Bornhuetter-Ferguson method in Chapter 4.
We assume that the last development period is given by $J$, i.e. $X_{i,j} = 0$ for $j > J$, and that the last observed accident year is given by $I$ (of course we assume $J \le I$).
2.1 Chain-ladder model (distribution free model)
The chain-ladder model is probably the most popular loss reserving technique. We give different derivations of the chain-ladder model. In this section we give a distribution-free derivation (see Mack [49]). The conditional prediction error of the chain-ladder model will be treated in Chapter 3.
The classical actuarial literature often explains the chain-ladder method as a pure computational algorithm for estimating claims reserves. It was only much later that actuaries started to think about stochastic models which generate the chain-ladder algorithm. The first to come up with a full stochastic model for the chain-ladder method was Mack [49]. In 1993, Mack [49] published one of the most famous articles in claims reserving on the calculation of the standard error in the chain-ladder model.
Model Assumptions 2.1 (Chain-ladder model)
There exist development factors $f_0, \ldots, f_{J-1} > 0$ such that for all $0 \le i \le I$ and all $1 \le j \le J$ we have
$$E\left[\, C_{i,j} \mid C_{i,0}, \ldots, C_{i,j-1} \,\right] = E\left[\, C_{i,j} \mid C_{i,j-1} \,\right] = f_{j-1}\, C_{i,j-1}. \qquad (2.1)$$
[Equations (2.2)-(2.3) and the statement of Lemma 2.3 were garbled in extraction.] Using the model assumptions one obtains
$$E\left[\, C_{i,J} \mid \mathcal{D}_I \,\right] = f_{J-1}\, E\left[\, C_{i,J-1} \mid \mathcal{D}_I \,\right]. \qquad (2.4)$$
If we iterate this procedure until we reach the diagonal $i + j = I$, we obtain the claim.
□
Lemma 2.3 gives an algorithm for estimating the expected ultimate claim given the observations $\mathcal{D}_I$. This algorithm is often called the recursive algorithm. For known chain-ladder factors $f_j$ we estimate the expected outstanding claims liabilities of accident year $i$, based on $\mathcal{D}_I$, by
$$E\left[\, C_{i,J} \mid \mathcal{D}_I \,\right] - C_{i,I-i} = C_{i,I-i}\left( f_{I-i} \cdots f_{J-1} - 1 \right). \qquad (2.5)$$
This corresponds to the best estimate reserves for accident year $i$ at time $I$ (based on the information $\mathcal{D}_I$). Unfortunately, in most practical applications the chain-ladder factors are not known and need to be estimated. We define
$$j^*(i) = \min\{J,\, I-i\} \qquad \text{and} \qquad i^*(j) = I - j; \qquad (2.6)$$
these denote the last observed indices on the diagonal. The age-to-age factors $f_{j-1}$ are estimated as follows:
$$\hat f_{j-1} = \frac{\sum_{k=0}^{i^*(j)} C_{k,j}}{\sum_{k=0}^{i^*(j)} C_{k,j-1}}. \qquad (2.7)$$
The chain-ladder estimator for the ultimate claim is then given by
$$\hat C^{CL}_{i,J} = C_{i,I-i} \prod_{j=I-i}^{J-1} \hat f_j. \qquad (2.8)$$
[Equation (2.9) was lost in extraction.]
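The estimator (2.7) and the resulting chain-ladder reserves can be sketched in a few lines of code; the triangle below is hypothetical toy data, not the example of Table 2.2:

```python
# Cumulative payments C[i][j]; row i has the observed upper-triangle entries only.
C = [[100.0, 150.0, 160.0],
     [110.0, 168.0],
     [120.0]]
J = len(C[0]) - 1

# Age-to-age factors (2.7): sums over the rows where both columns are observed.
f_hat = []
for j in range(1, J + 1):
    rows = [i for i in range(len(C)) if len(C[i]) > j]
    f_hat.append(sum(C[i][j] for i in rows) / sum(C[i][j - 1] for i in rows))

# Chain-ladder ultimates and reserves, cf. (2.5) and (2.8).
reserves = []
for i, row in enumerate(C):
    ult = row[-1]
    for j in range(len(row) - 1, J):
        ult *= f_hat[j]
    reserves.append(ult - row[-1])

print([round(f, 4) for f in f_hat], [round(r, 2) for r in reserves])
```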
[Example 2.7, equations (2.10)-(2.13), and the data of Tables 2.2-2.3 were garbled in extraction. The surviving fragments show the estimated age-to-age factors $\hat f_0, \ldots, \hat f_8$ = 1.4925, 1.0778, 1.0229, 1.0148, 1.0070, 1.0051, 1.0011, 1.0010, 1.0014, and the resulting chain-ladder reserves for accident years 1-9 of 15126, 26257, 34538, 85302, 156494, 286121, 449167, 1043242, 3950815, with total reserves 6047061.]

Table 2.2: Observed historical cumulative payments $C_{i,j}$ and estimated chain-ladder factors $\hat f_j$

Table 2.3: Estimated cumulative chain-ladder payments $\hat C^{CL}_{i,j}$ and estimated chain-ladder reserves $\hat C^{CL}_{i,J} - C_{i,I-i}$
2.2 The Bornhuetter-Ferguson method
[The Bornhuetter-Ferguson model assumptions (Model Assumptions 2.8 and 2.9), the Bornhuetter-Ferguson Estimator 2.10 and equations (2.14)-(2.15) were lost in extraction.]
Often the Bornhuetter-Ferguson method is explained with the help of Model Assumptions 2.9 (see e.g. Radtke-Schmidt [63], pages 37ff.). However, with Model Assumptions 2.9 we face some difficulties: Observe that
$$E\left[\, C_{i,J} \mid \mathcal{D}_I \,\right] = E\left[\, C_{i,J} \mid C_{i,0}, \ldots, C_{i,I-i} \,\right]. \qquad (2.16)$$
[Equations (2.17)-(2.18) were garbled in extraction.] Under the chain-ladder assumptions we have
$$E[C_{i,j}] = E[C_{i,0}] \prod_{k=0}^{j-1} f_k, \qquad (2.19)$$
$$E[C_{i,J}] = E[C_{i,0}] \prod_{k=0}^{J-1} f_k, \qquad (2.20)$$
which implies
$$E[C_{i,j}] = \prod_{k=j}^{J-1} f_k^{-1}\; E[C_{i,J}]. \qquad (2.21)$$
Define the development pattern
$$\beta_j = \prod_{k=j}^{J-1} f_k^{-1}, \qquad (2.22)$$
since $\prod_{k=j}^{J-1} f_k^{-1}$ describes the proportion of $\mu_i = E[C_{i,J}]$ already paid after $j$ development periods in the chain-ladder model. Therefore the two variables in (2.22) are often identified: this can be done with Model Assumptions 2.9, but not with Model Assumptions 2.8 (since Model Assumptions 2.8 are neither implied by the chain-ladder assumptions nor vice versa). I.e., if one knows the chain-ladder factors $f_j$, one constructs a development pattern $\beta_j$ using the identity in (2.22), and vice versa. Then the Bornhuetter-Ferguson estimator can be rewritten as follows:
$$\hat C^{BF}_{i,J} = C_{i,I-i} + \Big( 1 - \frac{1}{\prod_{j=I-i}^{J-1} \hat f_j} \Big)\, \hat\mu_i. \qquad (2.23)$$
On the other hand, we have for the chain-ladder estimator that
$$\begin{aligned}
\hat C^{CL}_{i,J} &= C_{i,I-i} \prod_{j=I-i}^{J-1} \hat f_j = C_{i,I-i} + C_{i,I-i} \Big( \prod_{j=I-i}^{J-1} \hat f_j - 1 \Big) \\
&= C_{i,I-i} + \frac{\hat C^{CL}_{i,J}}{\prod_{j=I-i}^{J-1} \hat f_j} \Big( \prod_{j=I-i}^{J-1} \hat f_j - 1 \Big) = C_{i,I-i} + \Big( 1 - \frac{1}{\prod_{j=I-i}^{J-1} \hat f_j} \Big)\, \hat C^{CL}_{i,J}.
\end{aligned} \qquad (2.24)$$
Hence the difference between the Bornhuetter-Ferguson method and the chain-ladder method is that for the Bornhuetter-Ferguson method we completely believe in our a priori estimate $\hat\mu_i$, whereas in the chain-ladder method the a priori estimate is replaced by the estimate $\hat C^{CL}_{i,J}$, which comes entirely from the observations.
Parameter estimation.
For $\mu_i$ we need an a priori estimate $\hat\mu_i$. This is often a plan value from a strategic business plan. This value is estimated before one has any observations, i.e. it is a pure a priori estimate.
For the still-to-come factor $(1 - \beta_{I-i})$ one should also use an a priori estimate if one applies the Bornhuetter-Ferguson method strictly. This should be done independently of the observations. In most practical applications one here quits the path of the pure Bornhuetter-Ferguson method and estimates the still-to-come factor from the data with the chain-ladder estimators: if $\hat f_k$ denote the estimated chain-ladder factors, one sets $\hat\beta^{(CL)}_{I-i} = \prod_{k=I-i}^{J-1} \hat f_k^{-1}$.
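With the identification (2.22), the Bornhuetter-Ferguson estimator (2.23) takes only a few lines; the chain-ladder factors and the a priori estimate below are hypothetical numbers:

```python
import math

f_hat = [1.5, 1.1, 1.02]  # hypothetical estimated chain-ladder factors f_0, f_1, f_2
mu_prior = 220.0          # a priori estimate for the ultimate of one accident year
C_obs = 130.0             # observed diagonal value C_{i,I-i}, here with I - i = 1

# Proportion paid after development year 1: beta = prod_{k>=1} 1/f_k, cf. (2.22)
beta = 1.0 / math.prod(f_hat[1:])
C_BF = C_obs + (1.0 - beta) * mu_prior          # Bornhuetter-Ferguson estimator (2.23)
C_CL = C_obs + (1.0 - beta) * (C_obs / beta)    # chain-ladder form, cf. (2.24)
print(round(C_BF, 2), round(C_CL, 2))
```

Note how (2.24) makes the CL form collapse to `C_obs / beta`, i.e. the observation grossed up by the paid proportion, while the BF form keeps the a priori ultimate for the unpaid part.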
 i   a priori         beta^(CL)    BF estimator   CL estimator    BF          CL
     estimate mu_i    (I-i)        C^BF_{i,J}     C^CL_{i,J}      reserves    reserves
 0     11653101       100.0%        11148124       11148124
 1     11367306        99.9%        10664316       10663318          16124       15126
 2     10962965        99.8%        10662749       10662008          26998       26257
 3     10616762        99.6%         9761643        9758606          37575       34538
 4     11044881        99.1%         9882350        9872218          95434       85302
 5     11480700        98.4%        10113777       10092247         178024      156494
 6     11413572        97.0%         9623328        9568143         341305      286121
 7     11126527        94.8%         8830301        8705378         574089      449167
 8     10986548        88.0%         8967375        8691971        1318646     1043242
 9     11618437        59.0%        10443953        9626383        4768384     3950815
 Total                                                             7356580     6047061

Table 2.4: Claims reserves from the Bornhuetter-Ferguson method and the chain-ladder method
We already see in this example that using different methods can lead to substantial differences in the claims reserves.
2.3 Number of IBNyR claims, Poisson model
We close this chapter with the Poisson model, which is mainly used for claim counts. The remarkable thing about the Poisson model is that it leads to the same reserves as the chain-ladder model (see Lemma 2.16). It was Mack [48], Appendix A, who first proved that the chain-ladder reserves are the maximum likelihood reserves for the Poisson model.
Model Assumptions 2.12 (Poisson model)
There exist parameters $\mu_0, \ldots, \mu_I > 0$ and $\gamma_0, \ldots, \gamma_J > 0$ such that the incremental quantities $X_{i,j}$ are independent Poisson distributed with
$$E[X_{i,j}] = \mu_i\, \gamma_j, \qquad (2.26)$$
and $\sum_{j=0}^{J} \gamma_j = 1$. □
For the definition of the Poisson distribution we refer to the appendix, Section B.1.2.
The cumulative quantity $C_{i,J}$ in accident year $i$ is again Poisson distributed, with
$$E[C_{i,J}] = \mu_i. \qquad (2.27)$$
Hence, $\mu_i$ is a parameter that stands for the expected number of claims in accident year $i$ (exposure), whereas $\gamma_j$ defines an expected reporting/cashflow pattern over the different development periods $j$. Moreover, we have
$$\frac{E[X_{i,j}]}{E[X_{i,0}]} = \frac{\gamma_j}{\gamma_0}, \qquad (2.28)$$
which is independent of $i$.
Lemma 2.13 The Poisson model satisfies Model Assumptions 2.8.
Proof. The independence of different accident years follows from the independence of the $X_{i,j}$. Moreover, we have $E[C_{i,0}] = E[X_{i,0}] = \mu_i \gamma_0$ with $\beta_0 = \gamma_0$, and
$$E\left[\, C_{i,j+k} \mid C_{i,0}, \ldots, C_{i,j} \,\right] = C_{i,j} + \sum_{l=1}^{k} E[X_{i,j+l}] = C_{i,j} + \mu_i \sum_{l=1}^{k} \gamma_{j+l}, \qquad (2.29)$$
with $\beta_j = \sum_{l=0}^{j} \gamma_l$. □
To estimate the parameters $(\mu_i)_i$ and $(\gamma_j)_j$ there are different methods; one possibility is to use the maximum likelihood estimators. The likelihood function for $\mathcal{D}_I = \{C_{i,j};\ i + j \le I,\ j \le J\}$ (the $\sigma$-algebra generated by $\mathcal{D}_I$ is the same as the one generated by $\{X_{i,j};\ i + j \le I,\ j \le J\}$) is given by
$$L_{\mathcal{D}_I}(\mu_0, \ldots, \mu_I, \gamma_0, \ldots, \gamma_J) = \prod_{i+j \le I} e^{-\mu_i \gamma_j}\, \frac{(\mu_i \gamma_j)^{X_{i,j}}}{X_{i,j}!}. \qquad (2.30)$$
We maximize this log-likelihood function by setting its $I + J + 2$ partial derivatives w.r.t. the unknown parameters $\mu_i$ and $\gamma_j$ equal to zero. Thus, we obtain on $\mathcal{D}_I$ that
$$\hat\mu_i \sum_{j=0}^{(I-i) \wedge J} \hat\gamma_j = \sum_{j=0}^{(I-i) \wedge J} X_{i,j} = C_{i,(I-i) \wedge J}, \qquad (2.31)$$
$$\hat\gamma_j \sum_{i=0}^{I-j} \hat\mu_i = \sum_{i=0}^{I-j} X_{i,j}, \qquad (2.32)$$
for all $i \in \{0, \ldots, I\}$ and all $j \in \{0, \ldots, J\}$, under the constraint $\sum_j \hat\gamma_j = 1$. This system has a unique solution and gives us the ML estimates for $\mu_i$ and $\gamma_j$.
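The ML system (2.31)-(2.32) can be solved numerically. The alternating scheme below is our own illustration (the text only states that the system has a unique solution), applied to a hypothetical 3x3 triangle:

```python
# Incremental triangle X[i][j], observed for i + j <= I (here I = J = 2).
X = [[100.0, 40.0, 10.0],
     [110.0, 50.0],
     [120.0]]
I = len(X) - 1

gamma = [1.0 / (I + 1)] * (I + 1)  # initial guess for the reporting pattern
for _ in range(200):
    # (2.31): mu_i = sum_{j <= I-i} X_{i,j} / sum_{j <= I-i} gamma_j
    mu = [sum(X[i]) / sum(gamma[: I - i + 1]) for i in range(I + 1)]
    # (2.32): gamma_j = sum_{i <= I-j} X_{i,j} / sum_{i <= I-j} mu_i
    gamma = [sum(X[i][j] for i in range(I - j + 1)) / sum(mu[: I - j + 1])
             for j in range(I + 1)]
    s = sum(gamma)
    gamma = [g / s for g in gamma]  # enforce the constraint sum_j gamma_j = 1
    mu = [m * s for m in mu]

print([round(m, 2) for m in mu], [round(g, 4) for g in gamma])
```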
Estimator 2.14 (Poisson ML estimator) The ML estimator in the Poisson Model 2.12 is, for $i + j > I$, given by
$$\hat X^{Poi}_{i,j} = \hat E[X_{i,j}] = \hat\mu_i\, \hat\gamma_j, \qquad (2.33)$$
$$\hat C^{Poi}_{i,J} = \hat E\left[\, C_{i,J} \mid \mathcal{D}_I \,\right] = C_{i,I-i} + \sum_{j=I-i+1}^{J} \hat X^{Poi}_{i,j}. \qquad (2.34)$$
Observe that
$$\hat C^{Poi}_{i,J} = C_{i,I-i} + \Big( 1 - \sum_{j=0}^{I-i} \hat\gamma_j \Big)\, \hat\mu_i, \qquad (2.35)$$
hence the Poisson ML estimator has the same form as the BF Estimator 2.10. However, here we use estimates for $\mu_i$ and $\gamma_j$ that depend on the data.
Example 2.15 (Poisson ML estimator)
We revisit the example given in Table 2.2 (see Example 2.7).
[The data of Table 2.6 were garbled in extraction. The surviving fragments show the estimated reporting pattern $\hat\gamma_0, \ldots, \hat\gamma_9$ = 58.96%, 29.04%, 6.84%, 2.17%, 1.44%, 0.69%, 0.51%, 0.11%, 0.10%, 0.14%, the estimated exposures $\hat\mu_0, \ldots, \hat\mu_9$ = 11148124, 10663318, 10662008, 9758606, 9872218, 10092247, 9568143, 8705378, 8691972, 9626383 (which coincide, up to rounding, with the chain-ladder ultimates of Table 2.4), and the estimated Poisson reserves for accident years 1-9 of 15126, 26257, 34538, 85302, 156494, 286121, 449167, 1043242, 3950815, with total 6047061.]

Table 2.6: Estimated $\hat\mu_i$, $\hat\gamma_j$, incremental payments $\hat X^{Poi}_{i,j}$ and Poisson reserves
Remark. The expected reserve is the same as in the chain-ladder model on cumulative data (see Lemma 2.16 below).
2.3.1 Poisson derivation of the chain-ladder model

In this subsection we show that the Poisson model (Section 2.3) leads to the chain-ladder estimate for the reserves.
Lemma 2.16 The chain-ladder Estimator 2.4 and the ML Estimator 2.14 in the Poisson Model 2.12 lead to the same reserves.
In fact, the Poisson ML model/estimate defined in Section 2.3 leads to a chain-ladder model (see formula (2.39)); moreover, the ML estimators lead to estimators for the age-to-age factors which are the same as in the distribution-free chain-ladder model.
Proof. In the Poisson model 2.12 the estimate for E [Ci,j |Ci,j1 ] is given by
bi
bj + Ci,j1 .
(2.36)
P oi
d
b [Ci,J |DI ] =
C
= E
bi
i,J
bj + Ci,Ii
j=j (i)+1
=
bi
J
X
j (i)
bj +
j=j (i)+1
Xi,j =
bi
j=0
J
X
bj ,
(2.37)
j=0
where in the last step we have used (2.31). Using (2.31) once more we find that
    \hat C^{Poi}_{i,J} = \hat E[C_{i,J} | D_I] = C_{i,I-i} \, \frac{\sum_{j=0}^{J} \hat\gamma_j}{\sum_{j=0}^{j^*(i)} \hat\gamma_j}.        (2.38)

This last formula can be rewritten by introducing additional factors:

    \hat C^{Poi}_{i,J} = C_{i,I-i} \, \frac{\sum_{j=0}^{J} \hat\gamma_j}{\sum_{j=0}^{j^*(i)} \hat\gamma_j}
                       = C_{i,I-i} \, \frac{\sum_{j=0}^{j^*(i)+1} \hat\gamma_j}{\sum_{j=0}^{j^*(i)} \hat\gamma_j} \cdots \frac{\sum_{j=0}^{J} \hat\gamma_j}{\sum_{j=0}^{J-1} \hat\gamma_j}.        (2.39)
From Lemma 2.17 below we know that on D_I

    \sum_{i=0}^{I-j} C_{i, j \wedge J} = \sum_{i=0}^{I-j} \hat\mu_i \sum_{k=0}^{j \wedge J} \hat\gamma_k        (2.40)

and

    \sum_{i=0}^{I-j} C_{i, (j-1) \wedge J} = \sum_{i=0}^{I-j} \hat\mu_i \sum_{k=0}^{(j-1) \wedge J} \hat\gamma_k.        (2.41)

Hence, for j \le J, the successive ratios of the \hat\gamma_j satisfy

    \frac{\sum_{k=0}^{j} \hat\gamma_k}{\sum_{k=0}^{j-1} \hat\gamma_k} = \frac{\sum_{i=0}^{I-j} C_{i,j}}{\sum_{i=0}^{I-j} C_{i,j-1}} = \hat f_{j-1}.        (2.42)

Plugging (2.42) into (2.39) yields

    \hat C^{Poi}_{i,J} = C_{i,I-i} \, \frac{\sum_{k=0}^{I-(j^*(i)+1)} C_{k, j^*(i)+1}}{\sum_{k=0}^{I-(j^*(i)+1)} C_{k, j^*(i)}} \cdots \frac{\sum_{k=0}^{I-J} C_{k,J}}{\sum_{k=0}^{I-J} C_{k,J-1}}
                       = C_{i,I-i} \, \hat f_{I-i} \cdots \hat f_{J-1} = \hat C^{CL}_{i,J},        (2.43)

which is the chain-ladder estimate (2.8). This finishes the proof of Lemma 2.16. □
Lemma 2.17 Under Model Assumptions 2.12 we have on D_I that

    \sum_{i=0}^{I-j} C_{i, j \wedge J} = \sum_{i=0}^{I-j} \hat\mu_i \sum_{k=0}^{j \wedge J} \hat\gamma_k.        (2.44)

Proof. We prove the claim by induction, starting with j = I. For j = I the ML equation (2.31) gives

    \sum_{j=0}^{I \wedge J} X_{0,j} = \hat\mu_0 \sum_{j=0}^{I \wedge J} \hat\gamma_j,        (2.45)

which is exactly the claim for j = I. For the induction step j \to j-1 assume that (2.44) holds for j. Then

    \sum_{i=0}^{I-(j-1)} C_{i,(j-1) \wedge J} = \sum_{i=0}^{I-j} C_{i,(j-1) \wedge J} + C_{I-j+1,(j-1) \wedge J}.        (2.46)

For the first term we have, using the induction hypothesis and the ML equation (2.32),

    \sum_{i=0}^{I-j} C_{i,(j-1) \wedge J} = \sum_{i=0}^{I-j} C_{i,j \wedge J} - \sum_{i=0}^{I-j} X_{i,j} \, 1_{\{j \le J\}}
                                          = \sum_{i=0}^{I-j} \hat\mu_i \sum_{k=0}^{j \wedge J} \hat\gamma_k - \sum_{i=0}^{I-j} \hat\mu_i \, \hat\gamma_j \, 1_{\{j \le J\}}
                                          = \sum_{i=0}^{I-j} \hat\mu_i \sum_{k=0}^{(j-1) \wedge J} \hat\gamma_k,

whereas the ML equation (2.31) applied to accident year I-j+1 gives

    C_{I-j+1,(j-1) \wedge J} = \sum_{k=0}^{(j-1) \wedge J} X_{I-j+1,k} = \hat\mu_{I-j+1} \sum_{k=0}^{(j-1) \wedge J} \hat\gamma_k.

Hence

    \sum_{i=0}^{I-(j-1)} C_{i,(j-1) \wedge J} = \sum_{i=0}^{I-(j-1)} \hat\mu_i \sum_{k=0}^{(j-1) \wedge J} \hat\gamma_k,        (2.47)

which proves the claim. □
Moreover, from (2.38) and (2.43) we obtain for all i \ge I - J

    C_{i,I-i} \, \frac{\sum_{j=0}^{J} \hat\gamma_j}{\sum_{j=0}^{I-i} \hat\gamma_j} = \hat C^{Poi}_{i,J} = \hat C^{CL}_{i,J}.        (2.49)

Since

    \sum_{j=0}^{I-i} \hat\gamma_j \prod_{j=I-i}^{J-1} \hat f_j = \sum_{j=0}^{J} \hat\gamma_j,        (2.50)

this implies

    \frac{\hat C^{CL}_{i,J}}{\sum_{j=0}^{J} \hat\gamma_j} = \frac{C_{i,I-i}}{\sum_{j=0}^{I-i} \hat\gamma_j} = \hat\mu_i,        (2.51)

where \hat\mu_i is the ML estimate given in (2.31)-(2.32).
Observe that the ML estimators can be expressed in terms of the estimated chain-ladder factors: with \hat\beta_l = \prod_{k=l}^{J-1} \hat f_k^{-1} (the estimated cumulative payout pattern, \hat\beta_J = 1) we have for 1 \le l \le J

    \hat\gamma_l = \hat\beta_l - \hat\beta_{l-1} = \prod_{k=l}^{J-1} \frac{1}{\hat f_k} \left( 1 - \frac{1}{\hat f_{l-1}} \right),        (2.52)

and

    \hat\mu_i = \sum_{j=0}^{(I-i) \wedge J} X_{i,j} \Big/ \sum_{j=0}^{(I-i) \wedge J} \hat\gamma_j.        (2.53)

Below we will meet other ML methods and GLM models where the solution of the ML equations is more involved, and where one applies algorithmic methods to find numerical solutions.
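For the Poisson model itself, such an algorithmic solution is already instructive. The following sketch (a hypothetical triangle and a simple alternating fixed-point iteration of the marginal equations (2.31)-(2.32), both our own illustration) solves the Poisson ML equations numerically and confirms Lemma 2.16 by comparing the resulting reserves with the chain-ladder reserves:

```python
# Numerical illustration of Lemma 2.16 on a hypothetical 4x4 triangle of
# incremental payments X[i][j] (accident year i, development year j);
# entries with i + j > I are unobserved.
I = J = 3
X = [
    [100.0, 60.0, 30.0, 10.0],
    [110.0, 70.0, 35.0, None],
    [120.0, 75.0, None, None],
    [130.0, None, None, None],
]

# Poisson ML: alternate the marginal equations
#   mu_i = sum_j X_{i,j} / sum_j gamma_j,  gamma_j = sum_i X_{i,j} / sum_i mu_i,
# each sum running over the observed part of the triangle only.
gamma = [1.0 / (J + 1)] * (J + 1)
for _ in range(500):
    mu = [sum(X[i][j] for j in range(I - i + 1)) / sum(gamma[: I - i + 1])
          for i in range(I + 1)]
    gamma = [sum(X[i][j] for i in range(I - j + 1)) / sum(mu[: I - j + 1])
             for j in range(J + 1)]
poisson_res = [mu[i] * sum(gamma[I - i + 1:]) for i in range(I + 1)]

# Chain-ladder on cumulative payments.
C = [[sum(X[i][: j + 1]) for j in range(I - i + 1)] for i in range(I + 1)]
f = [sum(C[i][j + 1] for i in range(I - j)) / sum(C[i][j] for i in range(I - j))
     for j in range(J)]
cl_res = []
for i in range(I + 1):
    ult = C[i][I - i]
    for j in range(I - i, J):
        ult *= f[j]
    cl_res.append(ult - C[i][I - i])

# The two sets of reserves agree, as claimed by Lemma 2.16.
assert all(abs(p - c) < 1e-6 for p, c in zip(poisson_res, cl_res))
```

The fixed-point iteration plays the role of the algorithmic methods mentioned above; for this simple cross-classified model it converges quickly to the ML solution.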
Chapter 3
Chain-ladder models
3.1 Mean square error of prediction

In the previous section we have only given an estimate for the expected ultimate claim. Of course, we would also like to know how good this estimate is. To measure the quality of the estimate we consider second moments.
Assume that we have a random variable X and a set of observations D, and that \hat X is a D-measurable estimator for E[X | D].

Definition 3.1 (Conditional mean square error of prediction) The conditional mean square error of prediction of the estimator \hat X is defined by

    msep_{X|D}(\hat X) = E\big[ (\hat X - X)^2 \,\big|\, D \big].        (3.1)

For a D-measurable estimator \hat X we have

    msep_{X|D}(\hat X) = Var(X | D) + \big( \hat X - E[X | D] \big)^2.        (3.2)
The first term on the right-hand side of (3.2) is the so-called process variance (stochastic error), i.e. the variance within the stochastic model (pure randomness which cannot be eliminated). The second term on the right-hand side of (3.2) is the parameter/estimation error. It reflects the uncertainty in the estimation of the parameters and of the expectation, respectively. In general, this estimation error becomes smaller the more observations we have. But pay attention: in many practical situations it does not completely disappear, since we try to predict future values with the help of past information; already a slight change in the model over time causes lots of problems (this is also discussed below in Section 3.3).
For the estimation error we would like to explicitly calculate the last term in (3.2). However, this can only be done if E[X | D] is known, but of course this term is in general unknown.
If X is independent of the observations D (so that E[X | D] = E[X]) and we consider the unconditional mean square error of prediction for the estimator \hat X, we obtain

    msep_X(\hat X) = E\big[ msep_{X|D}(\hat X) \big] = Var(X) + E\big[ ( \hat X - E[X] )^2 \big],        (3.4)

and if \hat X is an unbiased estimator for E[X], i.e. E[\hat X] = E[X], we have

    msep_X(\hat X) = E\big[ msep_{X|D}(\hat X) \big] = Var(X) + Var(\hat X).        (3.5)

Hence the parameter error is estimated by the variance of \hat X.
Example. Assume X and X_1, ..., X_n are i.i.d. with mean \mu and variance \sigma^2 < \infty. Then we have for the estimator \hat X = \sum_{i=1}^n X_i / n that

    msep_{X|D}(\hat X) = \sigma^2 + \Big( \mu - \frac{1}{n} \sum_{i=1}^n X_i \Big)^2.        (3.6)

By the strong law of large numbers the last term disappears a.s. for n \to \infty. In order to determine this term for finite n, one would like to explicitly calculate the distance between \sum_{i=1}^n X_i / n and \mu. However, since in general \mu is not known, we can only give an estimate for that distance. If we calculate the unconditional mean square error of prediction we obtain

    msep_X(\hat X) = \sigma^2 + \sigma^2 / n.        (3.7)

Henceforth, we can say that the deviation of \sum_{i=1}^n X_i / n around \mu is on average of order \sigma / n^{1/2}.
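A small Monte Carlo sketch (hypothetical normal data and parameter values, our own illustration) confirms (3.6)-(3.7): averaging the conditional prediction error over many samples reproduces \sigma^2 (1 + 1/n).

```python
import random

# Monte Carlo illustration of (3.6)-(3.7): for i.i.d. data with mean mu and
# variance sigma^2 we expect E[msep_{X|D}(Xhat)] = sigma^2 + sigma^2 / n.
random.seed(1)
mu, sigma, n, runs = 5.0, 2.0, 10, 50_000

acc = 0.0
for _ in range(runs):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xhat = sum(sample) / n
    acc += sigma ** 2 + (mu - xhat) ** 2   # conditional msep (3.6)
mc = acc / runs

exact = sigma ** 2 * (1.0 + 1.0 / n)       # unconditional msep (3.7)
assert abs(mc - exact) / exact < 0.01
```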
In all these cases the situation is even more complicated. Observe that if we calculate the unconditional mean square error of prediction we obtain

    msep_X(\hat X) = E\big[ msep_{X|D}(\hat X) \big]        (3.8)
                   = E[ Var(X | D) ] + E\big[ ( \hat X - E[X | D] )^2 \big]
                   = Var(X) - Var( E[X | D] ) + E\big[ ( \hat X - E[X | D] )^2 \big]
                   = Var(X) + E\big[ ( \hat X - E[X] )^2 \big] - 2 \, E\big[ ( \hat X - E[X] ) ( E[X | D] - E[X] ) \big].

If the estimator \hat X is unbiased for E[X] we obtain

    msep_X(\hat X) = Var(X) + Var(\hat X) - 2 \, Cov\big( \hat X, E[X | D] \big).        (3.9)

This again tells us something about the average estimation error, but nothing about the quality of the estimate \hat X for a specific realization.
3.2
Chain-ladder method
We have already described the chain-ladder method in Subsection 2.1. The chain-ladder method can be applied to cumulative payments, to claims incurred, etc. It is the most commonly applied method because it is very simple, and, using appropriate estimates for the chain-ladder factors, one often obtains reliable claims reserves.
The main deficiencies of the chain-ladder method are:

- The homogeneity property needs to be satisfied, e.g. we should not have any trends in the development factors (otherwise we have to transform our observations).
- For estimating old development factors (f_j for large j) there is only very little data available, which is maybe (in practice) no longer representative for younger accident years. E.g. assume that we have a claims development with J = 20 (years) and that I = 2006. Hence we estimate with today's information (accident years < 2006) what will happen with accident year 2006 in 20 years.
- For young accident years, very much weight is given to the latest observations, i.e. if we have an outlier on the diagonal, this outlier is projected right to the ultimate claim.
3.2.1 The Mack model

We define the chain-ladder model once more, but this time we extend the definition to the second moments, so that we are also able to give an estimate for the conditional mean square error of prediction of the chain-ladder estimator.
In the actuarial literature, the chain-ladder method is often understood as a purely computational algorithm, which leaves open the question of which probabilistic model would lead to that algorithm. It is Mack's merit [49] to have first given an answer to that question (a first decisive step towards the formulas was done by Schnieper [69]).
Model Assumptions 3.2 (Chain-ladder, Mack [49])

- Different accident years i are independent.
- (C_{i,j})_j is a Markov chain: there exist factors f_0, ..., f_{J-1} > 0 and variance parameters \sigma_0^2, ..., \sigma_{J-1}^2 > 0 such that for all 0 \le i \le I and 1 \le j \le J we have

    E[ C_{i,j} | C_{i,j-1} ] = f_{j-1} \, C_{i,j-1},        (3.10)
    Var( C_{i,j} | C_{i,j-1} ) = \sigma_{j-1}^2 \, C_{i,j-1}.        (3.11)

Remark. In Mack [49] there are slightly weaker assumptions: the Markov chain assumption is replaced by assumptions on the first two moments of (C_{i,j})_j only.
We recall the results from Section 2.1 (see Lemma 2.5). Choose the following estimators for the parameters f_j and \sigma_j^2:

    \hat f_j = \frac{\sum_{i=0}^{I-j-1} C_{i,j+1}}{\sum_{i=0}^{I-j-1} C_{i,j}},        (3.12)

    \hat\sigma_j^2 = \frac{1}{I-j-1} \sum_{i=0}^{I-j-1} C_{i,j} \left( \frac{C_{i,j+1}}{C_{i,j}} - \hat f_j \right)^2.        (3.13)

With the individual development factors F_{i,j+1} = C_{i,j+1} / C_{i,j}, the age-to-age factor estimates \hat f_j are weighted averages of the F_{i,j+1}, namely

    \hat f_j = \sum_{i=0}^{I-j-1} \frac{C_{i,j}}{\sum_{k=0}^{I-j-1} C_{k,j}} \, F_{i,j+1}.        (3.14)
Lemma 3.3 Under Assumptions 3.2 the estimator \hat f_j is, given B_j, an unbiased estimator for f_j which has minimal conditional variance among all unbiased linear combinations of the estimators (F_{i,j+1})_{0 \le i \le I-j-1} for f_j, conditioned on B_j, i.e.

    Var\big( \hat f_j \,\big|\, B_j \big) = \min_{\alpha_i \in R, \; \sum_i \alpha_i = 1} Var\Big( \sum_{i=0}^{I-j-1} \alpha_i F_{i,j+1} \,\Big|\, B_j \Big).        (3.15)
Moreover, its conditional variance is given by

    Var\big( \hat f_j \,\big|\, B_j \big) = \sigma_j^2 \Big/ \sum_{i=0}^{I-j-1} C_{i,j}.        (3.16)

Lemma 3.4 Assume P_1, ..., P_H are independent, unbiased estimators for \mu with variances \sigma_h^2 = Var(P_h) < \infty. Then the unbiased linear combination with minimal variance is given by

    \hat P = \frac{\sum_{h=1}^{H} P_h / \sigma_h^2}{\sum_{h=1}^{H} 1 / \sigma_h^2}        (3.17)

with

    Var(\hat P) = \Big( \sum_{h=1}^{H} 1 / \sigma_h^2 \Big)^{-1}.        (3.18)

Proof. See Proposition 12.1 in Taylor [75] (the proof is based on the method of Lagrange). □

Proof of Lemma 3.3. Consider the individual development factors

    F_{i,j+1} = \frac{C_{i,j+1}}{C_{i,j}}.        (3.19)

Given B_j, these are independent and unbiased estimators for f_j with

    Var\big( F_{i,j+1} \,\big|\, B_j \big) = \sigma_j^2 / C_{i,j}.        (3.20)

Hence Lemma 3.4 implies that the minimal variance unbiased linear combination is the C_{i,j}-weighted average (3.14), i.e. \hat f_j, with

    Var\big( \hat f_j \,\big|\, B_j \big) = \sigma_j^2 \Big/ \sum_{i=0}^{I-j-1} C_{i,j}.        (3.21)

□

Lemma 3.5 Under Assumptions 3.2 we have:
a) \hat\sigma_j^2 is, given B_j, an unbiased estimator for \sigma_j^2, i.e. E[ \hat\sigma_j^2 | B_j ] = \sigma_j^2;
b) \hat\sigma_j^2 is (unconditionally) unbiased for \sigma_j^2, i.e. E[ \hat\sigma_j^2 ] = \sigma_j^2.

Proof. Set S_k = \sum_{i=0}^{I-k-1} C_{i,k}, so that \hat f_k = \sum_{i=0}^{I-k-1} (C_{i,k}/S_k) F_{i,k+1}. Given B_k, the F_{i,k+1} are independent, hence

    Cov\big( F_{i,k+1}, \hat f_k \,\big|\, B_k \big) = \frac{C_{i,k}}{S_k} \, Var\big( F_{i,k+1} \,\big|\, B_k \big) = \frac{\sigma_k^2}{S_k},

and therefore

    E\bigg[ \Big( \frac{C_{i,k+1}}{C_{i,k}} - \hat f_k \Big)^2 \,\bigg|\, B_k \bigg] = Var\big( F_{i,k+1} - \hat f_k \,\big|\, B_k \big) = \sigma_k^2 \Big( \frac{1}{C_{i,k}} - \frac{1}{S_k} \Big).        (3.26)

This implies

    \frac{1}{I-k-1} \sum_{i=0}^{I-k-1} C_{i,k} \, E\bigg[ \Big( \frac{C_{i,k+1}}{C_{i,k}} - \hat f_k \Big)^2 \,\bigg|\, B_k \bigg] = \sigma_k^2,        (3.27)

which proves claim a). Claim b) follows from a) by taking expectations. This finishes the proof of Lemma 3.5. □

The following equality plays an important role in the derivation of an estimator for the conditional estimation error:

    E\big[ \hat f_k^2 \,\big|\, B_k \big] = Var\big( \hat f_k \,\big|\, B_k \big) + f_k^2 = \frac{\sigma_k^2}{\sum_{i=0}^{I-k-1} C_{i,k}} + f_k^2.        (3.28)
In Estimator 2.4 we have already seen how we estimate the ultimate claim C_{i,J}, given the information D_I, in the chain-ladder model:

    \hat C^{CL}_{i,J} = \hat E[ C_{i,J} | D_I ] = C_{i,I-i} \, \hat f_{I-i} \cdots \hat f_{J-1}.        (3.29)

Our goal is to derive an estimate of the conditional mean square error of prediction (conditional MSEP) of \hat C^{CL}_{i,J} for single accident years i \in \{I-J+1, ..., I\},

    msep_{C_{i,J}|D_I}\big( \hat C^{CL}_{i,J} \big) = E\Big[ \big( \hat C^{CL}_{i,J} - C_{i,J} \big)^2 \,\Big|\, D_I \Big]        (3.30)
                                                   = Var( C_{i,J} | D_I ) + \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big)^2,

as well as for aggregated accident years,

    msep_{\sum_i C_{i,J}|D_I}\Big( \sum_{i=I-J+1}^{I} \hat C^{CL}_{i,J} \Big) = E\bigg[ \Big( \sum_{i=I-J+1}^{I} \hat C^{CL}_{i,J} - \sum_{i=I-J+1}^{I} C_{i,J} \Big)^2 \,\bigg|\, D_I \bigg].        (3.31)
From (3.30) we see that we need to give an estimate for the process variance and
for the estimation error (coming from the fact that fj is estimated by fbj ).
3.2.2 Conditional process variance

We consider the first term on the right-hand side of (3.30), which is the conditional process variance. Assume J > I - i; then

    Var( C_{i,J} | D_I ) = Var( C_{i,J} | C_{i,I-i} )        (3.32)
                         = \sigma_{J-1}^2 \, C_{i,I-i} \prod_{j=I-i}^{J-2} f_j + f_{J-1}^2 \, Var( C_{i,J-1} | C_{i,I-i} ).

Hence we obtain a recursive formula for the process variance. If we iterate this procedure, we find that

    Var( C_{i,J} | C_{i,I-i} ) = \sum_{m=I-i}^{J-1} \Big( \prod_{n=m+1}^{J-1} f_n^2 \Big) \, \sigma_m^2 \, C_{i,I-i} \prod_{l=I-i}^{m-1} f_l
                               = \sum_{m=I-i}^{J-1} \Big( \prod_{n=m+1}^{J-1} f_n^2 \Big) \, \sigma_m^2 \, E[ C_{i,m} | C_{i,I-i} ]
                               = E[ C_{i,J} | C_{i,I-i} ]^2 \sum_{m=I-i}^{J-1} \frac{\sigma_m^2 / f_m^2}{E[ C_{i,m} | C_{i,I-i} ]}.        (3.33)

Lemma 3.6 (Process variance for single accident years) Under Model Assumptions 3.2 the conditional process variance for the ultimate claim of a single accident year i \in \{I-J+1, ..., I\} is given by

    Var( C_{i,J} | D_I ) = E[ C_{i,J} | C_{i,I-i} ]^2 \sum_{m=I-i}^{J-1} \frac{\sigma_m^2 / f_m^2}{E[ C_{i,m} | C_{i,I-i} ]}.        (3.34)
Hence we estimate the conditional process variance for a single accident year i by

    \hat{Var}( C_{i,J} | D_I ) = \hat E\Big[ ( C_{i,J} - E[ C_{i,J} | D_I ] )^2 \,\Big|\, D_I \Big] = \big( \hat C^{CL}_{i,J} \big)^2 \sum_{m=I-i}^{J-1} \frac{\hat\sigma_m^2 / \hat f_m^2}{\hat C^{CL}_{i,m}}.        (3.35)

The estimator for the conditional process variance can be rewritten in a recursive form. We obtain for i \in \{I-J+1, ..., I\}

    \hat{Var}( C_{i,J} | D_I ) = \hat{Var}( C_{i,J-1} | D_I ) \, \hat f_{J-1}^2 + \hat C^{CL}_{i,J-1} \, \hat\sigma_{J-1}^2,        (3.36)

where \hat{Var}( C_{i,I-i} | D_I ) = 0.
Because different accident years are independent, we estimate the conditional process variance for aggregated accident years by

    \hat{Var}\Big( \sum_{i=I-J+1}^{I} C_{i,J} \,\Big|\, D_I \Big) = \sum_{i=I-J+1}^{I} \hat{Var}( C_{i,J} | D_I ).        (3.37)

Example 3.7 (chain-ladder method). We apply these estimators to the data given in Table 2.2. Since the variance parameter of the last development period cannot be estimated from the data, we use the extrapolation

    \hat\sigma_{J-1} = \min\big( \hat\sigma_{J-2}^2 / \hat\sigma_{J-3} \, ; \; \hat\sigma_{J-3} \, ; \; \hat\sigma_{J-2} \big)        (3.38)

as estimate for \sigma_{J-1}. This estimate is motivated by the observation that the series \hat\sigma_0, ..., \hat\sigma_{J-2} is usually decreasing (cf. Table 3.1). This gives the estimated conditional process standard deviations in Table 3.2.
We define the estimated conditional variational coefficient for accident year i relative to the estimated chain-ladder reserves as follows:

    \hat{Vco}_i = \hat{Vco}( C_{i,J} | D_I ) = \frac{ \hat{Var}( C_{i,J} | D_I )^{1/2} }{ \hat C^{CL}_{i,J} - C_{i,I-i} }.        (3.39)

If we take this variational coefficient as a measure for the uncertainty, we see that the uncertainty in the total chain-ladder reserves is about 7%.
[Table 3.1: Estimated chain-ladder factors \hat f_j and standard deviation parameters \hat\sigma_j per development year.]

[Table 3.2: Estimated chain-ladder reserves and estimated conditional process standard deviations; the total reserves of 6'047'061 carry an estimated conditional process standard deviation of 424'379, i.e. a \hat{Vco} of about 7.0%.]
3.2.3 Estimation error for single accident years

The conditional estimation error of a single accident year i is given by

    \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big)^2 = C_{i,I-i}^2 \, \big( \hat f_{I-i} \cdots \hat f_{J-1} - f_{I-i} \cdots f_{J-1} \big)^2        (3.40)
                                                        = C_{i,I-i}^2 \Big( \prod_{j=I-i}^{J-1} \hat f_j^2 + \prod_{j=I-i}^{J-1} f_j^2 - 2 \prod_{j=I-i}^{J-1} \hat f_j f_j \Big).

Hence we would like to calculate (3.40). Observe that the realizations of the estimators \hat f_{I-i}, ..., \hat f_{J-1} are known at time I, but the true chain-ladder factors f_{I-i}, ..., f_{J-1} are unknown. Hence (3.40) cannot be calculated explicitly. In order to determine the conditional estimation error we analyze how much the possible estimators \hat f_j fluctuate around the true values f_j. In the sequel, D^O_{I,i} denotes the upper right corner of the observations D_I with respect to development year j = I - i + 1 (see Table 3.3).
[Table 3.3: The upper right corner D^O_{I,i} of the claims development triangle.]

This is the complete averaging over the multidimensional distribution after time I - i. Since D^O_{I,i} \cap B_{I-i} = \emptyset, the value (3.43) does not depend on the observations in D^O_{I,i}, i.e. the observed realizations in the upper corner D^O_{I,i} have no influence on the estimation of the parameter error. Therefore we call this the unconditional version.
Approach 3 (conditional resampling in D^O_{I,i}). Calculate the value

    E\big[ \hat f_{I-i}^2 \,\big|\, B_{I-i} \big] \cdot E\big[ \hat f_{I-i+1}^2 \,\big|\, B_{I-i+1} \big] \cdots E\big[ \hat f_{J-1}^2 \,\big|\, B_{J-1} \big].        (3.45)

Unlike in approach (3.44), the averaging is now done in every position j \in \{I-i, ..., J-1\} on the conditional structure. Since D^O_{I,i} \cap B_j \ne \emptyset for j > I-i, the observed realizations in D^O_{I,i} have a direct influence on the estimate, and (3.45) depends on the observations in D^O_{I,i}. In contrast to (3.43), the averaging is only done over the conditional distributions and not over the multidimensional distribution after I - i. Therefore we call this the conditional version. From a numerical point of view it is important to note that Approach 3 allows for a multiplicative structure of the measure of volatility (see Figure 3.1).
Concluding, this means that we consider different probability measures for the resampling, conditional and unconditional ones. Observe that the estimated chain-ladder factors \hat f_j are functions of (C_{i,j+1})_{i=0,...,I-j-1} and (C_{i,j})_{i=0,...,I-j-1}, i.e.

    \hat f_j = \hat f_j\big( (C_{i,j+1})_{i=0,...,I-j-1}, (C_{i,j})_{i=0,...,I-j-1} \big) = \frac{\sum_{i=0}^{I-j-1} C_{i,j+1}}{\sum_{i=0}^{I-j-1} C_{i,j}}.        (3.46)

In Approach 1 one considers a complete resampling in D^O_{I,i}: given B_{I-i}, all observations entering the estimators \hat f_j for j \ge I - i are resampled. In Approach 3 we always keep the set of actual observations C_{i,j} fixed and we only resample the next step in the time series, i.e. given D_I we resample only the single development steps C_{i,j} \to C_{i,j+1} (see also Figure 3.1). Hence in this context C_{i,j} serves as a volume measure for the resampling of C_{i,j+1}. In Approach 1 this volume measure is also resampled, whereas in Approach 3 it is kept fixed.
Observe that the question of which approach should be chosen is not a mathematical one, and it has led to extensive discussions in the actuarial community (see Buchwalder et al. [11], Mack et al. [52], Gisler [29] and Venter [78]). It depends on the circumstances which approach should be used for a specific practical problem. Henceforth, only the practitioner can choose the appropriate approach for his problems and questions.
Hence, to give an estimate for the estimation error with the unconditional version, we need to calculate the expectation in the last term of (3.52) (as described in Approach 1). This would be easy if the estimated chain-ladder factors \hat f_j were independent. But they are only uncorrelated; see Lemma 2.5 and the following lemma (for a similar statement see also Mack et al. [52]):

Lemma 3.8 Under Model Assumptions 3.2 the squares of two successive chain-ladder estimators \hat f_{j-1} and \hat f_j are, given B_{j-1}, negatively correlated, i.e.

    Cov\big( \hat f_{j-1}^2, \hat f_j^2 \,\big|\, B_{j-1} \big) < 0        (3.53)

for 1 \le j \le J-1.

Proof. Observe that \hat f_{j-1} is B_j-measurable. We define

    S_j = \sum_{i=0}^{I-j-1} C_{i,j}.        (3.54)

Using (3.28) we obtain

    Cov\big( \hat f_{j-1}^2, \hat f_j^2 \,\big|\, B_{j-1} \big) = E\Big[ \hat f_{j-1}^2 \, E\big[ \hat f_j^2 \,\big|\, B_j \big] \,\Big|\, B_{j-1} \Big] - E\big[ \hat f_{j-1}^2 \,\big|\, B_{j-1} \big] \, E\big[ \hat f_j^2 \,\big|\, B_{j-1} \big]
                                                              = \sigma_j^2 \, Cov\big( \hat f_{j-1}^2, S_j^{-1} \,\big|\, B_{j-1} \big).        (3.55)

Moreover, writing \hat f_{j-1} = ( S_j + C_{I-j,j} ) / S_{j-1}, the independence of different accident years and E[ C_{I-j,j} | B_{j-1} ] = f_{j-1} C_{I-j,j-1} lead to

    Cov\big( \hat f_{j-1}^2, \hat f_j^2 \,\big|\, B_{j-1} \big) = \frac{\sigma_j^2}{S_{j-1}^2} \Big( Cov\big( S_j^2, S_j^{-1} \,\big|\, B_{j-1} \big) + 2 f_{j-1} C_{I-j,j-1} \, Cov\big( S_j, S_j^{-1} \,\big|\, B_{j-1} \big) \Big).        (3.57)

Finally, we need to calculate both covariance terms on the right-hand side of (3.57). Using Jensen's inequality we obtain for \alpha = 1, 2

    Cov\big( S_j^\alpha, S_j^{-1} \,\big|\, B_{j-1} \big) = E\big[ S_j^{\alpha-1} \,\big|\, B_{j-1} \big] - E\big[ S_j^\alpha \,\big|\, B_{j-1} \big] \, E\big[ S_j^{-1} \,\big|\, B_{j-1} \big]
                                                         < E\big[ S_j^{\alpha-1} \,\big|\, B_{j-1} \big] - E\big[ S_j^\alpha \,\big|\, B_{j-1} \big] \, E\big[ S_j \,\big|\, B_{j-1} \big]^{-1} \le 0,        (3.58)

where the last inequality is an equality for \alpha = 1 and follows again from Jensen's inequality for \alpha = 2. This proves the claim. □

Henceforth, the conditional expectation

    E\Big[ \prod_{j=I-i}^{J-1} \hat f_j^2 \,\Big|\, B_{I-i} \Big]        (3.59)

cannot easily be calculated. Hence, from this point of view, Approach 1 is not a promising route for finding a closed-form expression for the estimation error.

Approach 3 (conditional resampling)

In Approach 3 we explicitly resample the observed chain-ladder factors \hat f_j. To do the resampling we introduce stronger model assumptions; this is done with a time series model. Such time series models for the chain-ladder method can be found in several papers in the literature, see e.g. Murphy [55], Barnett-Zehnwirth [7] or Buchwalder et al. [13].
Model Assumptions 3.9 (Time series model)

- Different accident years i are independent.
- There exist constants f_j > 0, \sigma_j > 0 and random variables \epsilon_{i,j+1} such that for all i \in \{0, ..., I\} and j \in \{0, ..., J-1\} we have

    C_{i,j+1} = f_j \, C_{i,j} + \sigma_j \sqrt{C_{i,j}} \, \epsilon_{i,j+1},        (3.60)

where, conditionally given B_0, the \epsilon_{i,j+1} are independent with E[ \epsilon_{i,j+1} | B_0 ] = 0, E[ \epsilon_{i,j+1}^2 | B_0 ] = 1 and P[ C_{i,j+1} > 0 | B_0 ] = 1 for all i \in \{0, ..., I\} and j \in \{0, ..., J-1\}.
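A single development step of the time series (3.60) can be simulated as follows (hypothetical parameters, our own illustration; we use Gaussian \epsilon_{i,j+1} for simplicity, which satisfies the moment conditions but only approximately the positivity requirement):

```python
import random

# Sketch of one resampled development step of the time series model (3.60):
# C_{i,j+1} = f_j C_{i,j} + sigma_j sqrt(C_{i,j}) eps_{i,j+1}.
random.seed(7)
f_j, sigma_j = 1.5, 3.0
C_ij = 10_000.0

n = 100_000
samples = [f_j * C_ij + sigma_j * C_ij ** 0.5 * random.gauss(0.0, 1.0)
           for _ in range(n)]
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / (n - 1)

# Empirical moments match E[C_{i,j+1}|C_{i,j}] = f_j C_{i,j} and
# Var(C_{i,j+1}|C_{i,j}) = sigma_j^2 C_{i,j} within Monte Carlo error.
assert abs(mean - f_j * C_ij) < 10.0
assert abs(var - sigma_j ** 2 * C_ij) / (sigma_j ** 2 * C_ij) < 0.05
```

This is exactly the mechanism of generating sets of other possible observations that is used for the conditional resampling below.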
Remarks 3.10

- The time series model defines an auto-regressive process. It is particularly useful for the derivation of the estimation error and reflects the mechanism of generating sets of other possible observations.
- The random variables \epsilon_{i,j+1} are defined conditionally, given B_0, in order to ensure that the cumulative payments C_{i,j+1} stay positive, P[\cdot | B_0]-a.s.
- It is easy to show that Model Assumptions 3.9 imply the Assumptions 3.2 of the classical stochastic chain-ladder model of Mack [49].
- The definition of the time series model in Buchwalder et al. [11] is slightly different. The difference lies in the fact that here we assume a.s. positivity of C_{i,j}. This could also be done with the help of conditional assumptions, i.e. the theory would also run through if we assumed that

    P[ C_{i,j+1} > 0 | C_{i,j} ] = 1,        (3.61)
In the spirit of Approach 3 (cf. (3.45)) we resample the observations for \hat f_j by only resampling the observations of development year j+1. Together with the resampling assumption (3.62) this leads to the following resampled representation for the estimates of the development factors:

    \hat f_j = \frac{\sum_{i=0}^{I-j-1} \tilde C_{i,j+1}}{\sum_{i=0}^{I-j-1} C_{i,j}} = f_j + \frac{\sigma_j}{S_j} \sum_{i=0}^{I-j-1} \sqrt{C_{i,j}} \, \tilde\epsilon_{i,j+1} \qquad (0 \le j \le J-1),        (3.63)

where

    S_j = \sum_{i=0}^{I-j-1} C_{i,j}.        (3.64)

Given D_I, this implies

    E_{D_I}\big[ \hat f_j^2 \big] = f_j^2 + \frac{\sigma_j^2}{S_j} \qquad \text{for } 0 \le j \le J-1.

Figure 3.1 illustrates the conditional resampling for two different possible observations D_I^{(1)} and D_I^{(2)} of the original data D_I, which would give the two different chain-ladder estimates \hat C^{(1)}_{i,J} and \hat C^{(2)}_{i,J} for E[ C_{i,J} | D_I ].
Therefore, in Approach 3 we estimate the estimation error by (using 1)-3))

    E_{D_I}\Big[ C_{i,I-i}^2 \big( \hat f_{I-i} \cdots \hat f_{J-1} - f_{I-i} \cdots f_{J-1} \big)^2 \Big] = C_{i,I-i}^2 \, Var_{P_{D_I}}\big( \hat f_{I-i} \cdots \hat f_{J-1} \big)
        = C_{i,I-i}^2 \Big( \prod_{l=I-i}^{J-1} E_{D_I}\big[ \hat f_l^2 \big] - \prod_{l=I-i}^{J-1} f_l^2 \Big)        (3.65)
        = C_{i,I-i}^2 \Big( \prod_{l=I-i}^{J-1} \Big( f_l^2 + \frac{\sigma_l^2}{S_l} \Big) - \prod_{l=I-i}^{J-1} f_l^2 \Big).
[Figure 3.1: Conditional resampling in D^O_{I,i} (Approach 3); given B_l, the resampled factors satisfy E[ \hat f_l^2 | B_l ] = f_l^2 + \sigma_l^2 / S_l.]
Observe that this calculation is exact; the estimation has been done at the point where we decided to use Approach 3 for the estimation error, i.e. the estimate was obtained by choosing the conditional probability measure P_{D_I}.
Next, we replace the parameters \sigma_{I-i}^2, ..., \sigma_{J-1}^2 and f_{I-i}, ..., f_{J-1} with their estimators, and we obtain the following estimator for the conditional estimation error of accident year i \in \{I-J+1, ..., I\}:

    \hat{Var}\Big( \hat C^{CL}_{i,J} \,\Big|\, D_I \Big) = \hat E_{D_I}\Big[ \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big)^2 \Big] = C_{i,I-i}^2 \Big( \prod_{l=I-i}^{J-1} \Big( \hat f_l^2 + \frac{\hat\sigma_l^2}{S_l} \Big) - \prod_{l=I-i}^{J-1} \hat f_l^2 \Big).        (3.66)
The estimator for the conditional estimation error can be written in a recursive form. We obtain for i \in \{I-J+1, ..., I\}

    \hat{Var}\Big( \hat C^{CL}_{i,J} \,\Big|\, D_I \Big) = \hat{Var}\Big( \hat C^{CL}_{i,J-1} \,\Big|\, D_I \Big) \hat f_{J-1}^2 + C_{i,I-i}^2 \, \frac{\hat\sigma_{J-1}^2}{S_{J-1}} \prod_{l=I-i}^{J-2} \Big( \hat f_l^2 + \frac{\hat\sigma_l^2}{S_l} \Big)        (3.67)
        = \hat{Var}\Big( \hat C^{CL}_{i,J-1} \,\Big|\, D_I \Big) \Big( \hat f_{J-1}^2 + \frac{\hat\sigma_{J-1}^2}{S_{J-1}} \Big) + C_{i,I-i}^2 \, \frac{\hat\sigma_{J-1}^2}{S_{J-1}} \prod_{l=I-i}^{J-2} \hat f_l^2,

where \hat{Var}\big( \hat C^{CL}_{i,I-i} \,\big|\, D_I \big) = 0.
Estimator 3.11 (MSEP for single accident years, conditional version)
Under Model Assumptions 3.9 we have the following estimator for the conditional mean square error of prediction of the ultimate claim of a single accident year i \in \{I-J+1, ..., I\}:

    \hat{msep}_{C_{i,J}|D_I}\big( \hat C^{CL}_{i,J} \big) = \hat E\Big[ \big( \hat C^{CL}_{i,J} - C_{i,J} \big)^2 \,\Big|\, D_I \Big]
        = \underbrace{ \big( \hat C^{CL}_{i,J} \big)^2 \sum_{l=I-i}^{J-1} \frac{\hat\sigma_l^2 / \hat f_l^2}{\hat C^{CL}_{i,l}} }_{\text{process variance}}
          + \underbrace{ C_{i,I-i}^2 \Big( \prod_{l=I-i}^{J-1} \Big( \hat f_l^2 + \frac{\hat\sigma_l^2}{S_l} \Big) - \prod_{l=I-i}^{J-1} \hat f_l^2 \Big) }_{\text{estimation error}}.        (3.68)

This can also be written as

    \hat{msep}_{C_{i,J}|D_I}\big( \hat C^{CL}_{i,J} \big) = \big( \hat C^{CL}_{i,J} \big)^2 \Bigg( \sum_{l=I-i}^{J-1} \frac{\hat\sigma_l^2 / \hat f_l^2}{\hat C^{CL}_{i,l}} + \prod_{l=I-i}^{J-1} \Big( \frac{\hat\sigma_l^2 / \hat f_l^2}{S_l} + 1 \Big) - 1 \Bigg).        (3.69)
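Estimator 3.11 translates directly into a small function; all inputs below (factors, variance parameters, column sums S_l and the diagonal observation) are hypothetical illustrative values, not data from the text:

```python
# Sketch of Estimator 3.11: conditional MSEP of the chain-ladder predictor for
# a single accident year as process variance plus estimation error, cf. (3.68).
def msep_single_year(C_diag, f_hat, s2_hat, S):
    # chain-ladder estimates Chat_{i,l} along the development, starting at C_{i,I-i}
    C_hat = [C_diag]
    for f in f_hat:
        C_hat.append(C_hat[-1] * f)
    C_ult = C_hat[-1]

    process_var = C_ult ** 2 * sum(
        (s2 / f ** 2) / c for f, s2, c in zip(f_hat, s2_hat, C_hat))

    prod_full, prod_f2 = 1.0, 1.0
    for f, s2, s in zip(f_hat, s2_hat, S):
        prod_full *= f ** 2 + s2 / s
        prod_f2 *= f ** 2
    estimation_err = C_diag ** 2 * (prod_full - prod_f2)
    return process_var + estimation_err

# hypothetical inputs for one accident year with three open development periods
msep = msep_single_year(C_diag=1000.0,
                        f_hat=[1.20, 1.10, 1.05],
                        s2_hat=[4.0, 2.0, 1.0],
                        S=[5000.0, 3500.0, 2200.0])
assert msep > 0.0
```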
If in the estimation error we only keep the terms of lowest order in 1/S_l, we obtain the linear approximation

    \prod_{l=I-i}^{J-1} \Big( \hat f_l^2 + \frac{\hat\sigma_l^2}{S_l} \Big) - \prod_{l=I-i}^{J-1} \hat f_l^2 \approx \sum_{l=I-i}^{J-1} \frac{\hat\sigma_l^2 / \hat f_l^2}{S_l} \, \prod_{l=I-i}^{J-1} \hat f_l^2.        (3.70)
Observe that in fact the right-hand side of (3.70) is a lower bound for the left-hand
side. This immediately gives the following estimate:
Estimator 3.12 (MSEP for single accident years)
Under Model Assumptions 3.9 we have the following estimator for the conditional mean square error of prediction of the ultimate claim of a single accident year i \in \{I-J+1, ..., I\}:

    \hat{msep}_{C_{i,J}|D_I}\big( \hat C^{CL}_{i,J} \big) = \big( \hat C^{CL}_{i,J} \big)^2 \sum_{l=I-i}^{J-1} \frac{\hat\sigma_l^2}{\hat f_l^2} \Big( \frac{1}{\hat C^{CL}_{i,l}} + \frac{1}{S_l} \Big).        (3.71)
Mack [49] derives his estimator for the conditional estimation error by decoupling it. Define, for j \in \{I-i, ..., J-1\},

    T_j = \hat f_{I-i} \cdots \hat f_{j-1} \big( \hat f_j - f_j \big) f_{j+1} \cdots f_{J-1}.

Then

    \big( \hat f_{I-i} \cdots \hat f_{J-1} - f_{I-i} \cdots f_{J-1} \big)^2 = \Big( \sum_{j=I-i}^{J-1} T_j \Big)^2        (3.73)

and

    \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big)^2 = C_{i,I-i}^2 \Big( \sum_{j=I-i}^{J-1} T_j^2 + 2 \sum_{I-i \le j < k \le J-1} T_j T_k \Big).        (3.74)

Each term in the sums on the right-hand side of the equality above is now estimated by a slightly modified version of Approach 2: we estimate T_j T_k for j < k by

    E[ T_j T_k | B_k ] = \hat f_{I-i} \cdots \hat f_{j-1} \big( \hat f_j - f_j \big) f_{j+1} \cdots f_{J-1} \; \hat f_{I-i} \cdots \hat f_{k-1} \, f_{k+1} \cdots f_{J-1} \; E\big[ \hat f_k - f_k \,\big|\, B_k \big] = 0,        (3.75)

since E[ \hat f_k | B_k ] = f_k, and T_j^2 is estimated by

    E[ T_j^2 | B_j ] = \hat f_{I-i}^2 \cdots \hat f_{j-1}^2 \; E\big[ ( \hat f_j - f_j )^2 \,\big|\, B_j \big] \, f_{j+1}^2 \cdots f_{J-1}^2 = \hat f_{I-i}^2 \cdots \hat f_{j-1}^2 \, \frac{\sigma_j^2}{S_j} \, f_{j+1}^2 \cdots f_{J-1}^2.        (3.76)

If we plug in the estimators for the unknown parameters, this yields the estimate

    C_{i,I-i}^2 \sum_{j=I-i}^{J-1} \hat f_{I-i}^2 \cdots \hat f_{j-1}^2 \, \frac{\hat\sigma_j^2}{S_j} \, \hat f_{j+1}^2 \cdots \hat f_{J-1}^2        (3.77)

for the conditional estimation error, which is exactly the estimation error term of Estimator 3.12.
- We see that the Mack estimate for the conditional estimation error (also presented in Estimator 3.12) is a linear approximation and a lower bound to the estimate coming from Approach 3.
- The difference comes from the fact that Mack [49] decouples the estimation error in an appropriate way (with the help of the terms T_j) and then applies a partial conditional resampling to each of the terms in the decoupling.
- The Time Series Model 3.9 has slightly stronger assumptions than the weighted average development (WAD) factor model studied in Murphy [55], Model IV. To obtain the crucial recursive formula for the conditional estimation error (Theorem 3 in Appendix C of [55]), Murphy assumes independence of the estimators of the chain-ladder factors. However, this assumption is inconsistent with the model assumptions: the chain-ladder factors are indeed uncorrelated (see Lemma 2.5c)), but the squares of two successive chain-ladder estimators are negatively correlated, as we can see from Lemma 3.8. The point is that by his assumptions Murphy [55] gets a multiplicative structure for the measure of volatility. In Approach 3 we get the multiplicative structure by the choice of the conditional resampling probability measure P_{D_I} for the (conditional) volatility of the chain-ladder estimator (see the discussion in Subsection 3.2.3). This means that in Approach 3 we do not assume that the estimated chain-ladder factors are independent. Henceforth, since in both estimators a multiplicative structure is used, it turns out that the recursive estimator (3.67) for the conditional estimation error is exactly the estimator presented in Theorem 3 of Murphy [55] (see also Appendix B in Barnett-Zehnwirth [7]).

Example 3.7, revisited. We come back to our example in Table 2.2. This gives the error estimates in Tables 3.4 and 3.5. We see that the differences in the estimates for the conditional estimation error coming from the linear approximation (Mack formula) are negligible; in all examples we have considered we came to this conclusion.
3.2.4 Conditional MSEP in the chain-ladder model for aggregated accident years
[Table 3.4: Estimated chain-ladder reserves and error terms according to Estimator 3.11.]

[Table 3.5: Estimated chain-ladder reserves and error terms according to Estimator 3.12.]

Consider two different accident years i < l. From the model assumptions we know that the ultimate losses C_{i,J} and C_{l,J} are independent. Nevertheless, we have to be careful when we aggregate \hat C^{CL}_{i,J} and \hat C^{CL}_{l,J}: the estimators are no longer independent, since they use the same observations for estimating the age-to-age factors f_j. We have
    msep_{C_{i,J}+C_{l,J}|D_I}\big( \hat C^{CL}_{i,J} + \hat C^{CL}_{l,J} \big) = E\Big[ \big( \hat C^{CL}_{i,J} + \hat C^{CL}_{l,J} - ( C_{i,J} + C_{l,J} ) \big)^2 \,\Big|\, D_I \Big]
        = Var( C_{i,J} + C_{l,J} | D_I ) + \big( \hat C^{CL}_{i,J} + \hat C^{CL}_{l,J} - E[ C_{i,J} + C_{l,J} | D_I ] \big)^2.        (3.78)

Using the independence of the different accident years, we obtain for the first term

    Var( C_{i,J} + C_{l,J} | D_I ) = Var( C_{i,J} | D_I ) + Var( C_{l,J} | D_I ),        (3.79)

whereas for the second term we have

    \big( \hat C^{CL}_{i,J} + \hat C^{CL}_{l,J} - E[ C_{i,J} + C_{l,J} | D_I ] \big)^2
        = \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big)^2 + \big( \hat C^{CL}_{l,J} - E[ C_{l,J} | D_I ] \big)^2
          + 2 \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big) \big( \hat C^{CL}_{l,J} - E[ C_{l,J} | D_I ] \big).        (3.80)

Hence we have the following decomposition for the conditional prediction error of the sum of two accident years:

    E\Big[ \big( \hat C^{CL}_{i,J} + \hat C^{CL}_{l,J} - ( C_{i,J} + C_{l,J} ) \big)^2 \,\Big|\, D_I \Big]
        = E\Big[ \big( \hat C^{CL}_{i,J} - C_{i,J} \big)^2 \,\Big|\, D_I \Big] + E\Big[ \big( \hat C^{CL}_{l,J} - C_{l,J} \big)^2 \,\Big|\, D_I \Big]        (3.81)
          + 2 \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big) \big( \hat C^{CL}_{l,J} - E[ C_{l,J} | D_I ] \big).

Hence we obtain

    msep_{C_{i,J}+C_{l,J}|D_I}\big( \hat C^{CL}_{i,J} + \hat C^{CL}_{l,J} \big)
        = msep_{C_{i,J}|D_I}\big( \hat C^{CL}_{i,J} \big) + msep_{C_{l,J}|D_I}\big( \hat C^{CL}_{l,J} \big)        (3.82)
          + 2 \big( \hat C^{CL}_{i,J} - E[ C_{i,J} | D_I ] \big) \big( \hat C^{CL}_{l,J} - E[ C_{l,J} | D_I ] \big).

In analogy to Approach 3 (conditional resampling, cf. (3.65)), the cross term is estimated, for i < l, by

    E_{D_I}\Big[ C_{i,I-i} \, C_{l,I-l} \big( \hat f_{I-i} \cdots \hat f_{J-1} - f_{I-i} \cdots f_{J-1} \big) \big( \hat f_{I-l} \cdots \hat f_{J-1} - f_{I-l} \cdots f_{J-1} \big) \Big]
        = C_{i,I-i} \, C_{l,I-l} \, f_{I-l} \cdots f_{I-i-1} \, Var_{P_{D_I}}\big( \hat f_{I-i} \cdots \hat f_{J-1} \big)
        = C_{i,I-i} \, C_{l,I-l} \, f_{I-l} \cdots f_{I-i-1} \Big( \prod_{j=I-i}^{J-1} E_{D_I}\big[ \hat f_j^2 \big] - \prod_{j=I-i}^{J-1} f_j^2 \Big)
        = C_{i,I-i} \, E[ C_{l,I-i} | D_I ] \Big( \prod_{j=I-i}^{J-1} \Big( f_j^2 + \frac{\sigma_j^2}{S_j} \Big) - \prod_{j=I-i}^{J-1} f_j^2 \Big).
But then the estimation of the covariance term is straightforward from the estimate
of a single accident year.
Estimator 3.14 (MSEP for aggregated accident years, conditional version)
Under Model Assumptions 3.9 we have the following estimator for the conditional mean square error of prediction of the ultimate claims of aggregated accident years:

    \hat{msep}_{\sum_i C_{i,J}|D_I}\Big( \sum_{i=I-J+1}^{I} \hat C^{CL}_{i,J} \Big) = \hat E\bigg[ \Big( \sum_{i=I-J+1}^{I} \hat C^{CL}_{i,J} - \sum_{i=I-J+1}^{I} C_{i,J} \Big)^2 \,\bigg|\, D_I \bigg]
        = \sum_{i=I-J+1}^{I} \hat{msep}_{C_{i,J}|D_I}\big( \hat C^{CL}_{i,J} \big)
          + 2 \sum_{I-J+1 \le i < l \le I} C_{i,I-i} \, \hat C^{CL}_{l,I-i} \Big( \prod_{j=I-i}^{J-1} \Big( \hat f_j^2 + \frac{\hat\sigma_j^2}{S_j} \Big) - \prod_{j=I-i}^{J-1} \hat f_j^2 \Big).        (3.85)

Remarks 3.15

- The last terms (covariance terms) in the result above can be rewritten as

    2 \sum_{I-J+1 \le i < l \le I} \frac{\hat C^{CL}_{l,I-i}}{C_{i,I-i}} \, \hat{Var}\Big( \hat C^{CL}_{i,J} \,\Big|\, D_I \Big),        (3.86)

  where \hat{Var}( \hat C^{CL}_{i,J} | D_I ) is the conditional estimation error of the single accident year i (see (3.66)). This may be helpful in the implementation since it leads to matrix multiplications.
- We can again do a linear approximation, and then we find the estimator presented in Mack [49].
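The covariance contribution in the form (3.86) can be sketched as follows; all inputs (diagonal values, factors and the per-year conditional estimation errors `est_err[i]`, which stand for \hat{Var}(\hat C^{CL}_{i,J} | D_I) from (3.66)) are hypothetical illustrative numbers:

```python
# Sketch of the covariance term (3.86) in the aggregated MSEP:
# 2 * sum_{i<l} (Chat_{l,I-i} / C_{i,I-i}) * Var(Chat^{CL}_{i,J} | D_I).
I = 3
C_diag = {1: 1200.0, 2: 1000.0, 3: 800.0}    # diagonal observations C_{i,I-i}
f_hat = {0: 1.8, 1: 1.2, 2: 1.05}            # estimated chain-ladder factors
est_err = {1: 900.0, 2: 2500.0, 3: 4100.0}   # hypothetical values of (3.66)

def C_hat(l, j):
    """Chain-ladder projection of accident year l to development year j >= I - l."""
    val = C_diag[l]
    for k in range(I - l, j):
        val *= f_hat[k]
    return val

cov_term = 2.0 * sum(
    (C_hat(l, I - i) / C_diag[i]) * est_err[i]
    for i in C_diag for l in C_diag if i < l)
assert cov_term > 0.0
```

Note that only the already projected values \hat C^{CL}_{l,I-i} and the single-year estimation errors enter, which is what makes a matrix implementation convenient.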
Example 3.7 revisited
We come back to our example in Table 2.2. This gives the error estimates in Table
3.6.
3.3 Analysis of error terms
In this section we further analyze the conditional mean square error of prediction of the chain-ladder method. In fact, we consider three different kinds of error terms: a) conditional process error, b) conditional prediction error, c) conditional estimation error. To analyze these three terms we define a model which is different from the classical chain-ladder model. It is slightly more complicated than the classical model, but therefore leads to a clear distinction between these error terms. The motivation for a clear distinction between the three error terms is that the sources of these error classes are rather different, and we believe that in the light of the solvency discussions (see e.g. SST [73], Sandström [67], Buchwalder et al. [11, 14] or Wüthrich [88]) we should clearly distinguish between the different risk factors. In this section we closely follow Wüthrich [90]. For a similar Bayesian approach we also refer to Gisler [29].

[Table 3.6: Estimated chain-ladder reserves and error terms (Estimator 3.14); the total reserves of 6'047'061 carry a prediction standard error of 462'960, i.e. about 7.7% of the reserves.]
3.3.1 Classical chain-ladder model

The observed individual development factors were defined by (see also (3.19))

    F_{i,j} = \frac{C_{i,j}}{C_{i,j-1}},        (3.87)

and under Model Assumptions 3.2 they satisfy

    Var( F_{i,j} | C_{i,j-1} ) = \frac{\sigma_{j-1}^2}{C_{i,j-1}}.        (3.88)

The conditional variational coefficients of the development factors F_{i,j} are given by

    Vco( F_{i,j} | C_{i,j-1} ) = Vco( C_{i,j} | C_{i,j-1} ) = \frac{\sigma_{j-1}}{f_{j-1}} \, C_{i,j-1}^{-1/2} \longrightarrow 0 \qquad \text{as } C_{i,j-1} \to \infty.        (3.89)
Hence for increasing volume the conditional variational coefficients of Fi,j converge
to zero! It is exactly this property (3.89) which is crucial in risk management. If
we assume that risk is defined through these variational coefficients, it means that
the risk completely disappears for very large portfolios (law of large numbers). But
we all know that this is not the case in practice. There are always external factors which influence a portfolio and which are not diversifiable: e.g. if the jurisdiction changes, it is not helpful to have a large portfolio. The experience of recent years has also shown that we have to be very careful about external factors and parameter errors, since they cannot be diversified. Therefore, almost all developments of new solvency guidelines and requirements pay a lot of attention to these risks. The goal here is to define a chain-ladder model which also reflects this kind of risk class.

© 2006 (M. Wüthrich, ETH Zürich & M. Merz, Uni Tübingen)
3.3.2 Enhanced chain-ladder model

The approach in this section modifies (3.89) as follows. We assume that there exist constants $a_0^2, a_1^2, \ldots \ge 0$ such that for all $1 \le j \le J$ we have
$$\mathrm{Vco}^2(F_{i,j} \mid C_{i,j-1}) = \frac{\sigma_{j-1}^2}{f_{j-1}^2}\, \frac{1}{C_{i,j-1}} + a_{j-1}^2. \qquad (3.90)$$
Hence
$$\lim_{C_{i,j-1}\to\infty} \mathrm{Vco}^2(F_{i,j} \mid C_{i,j-1}) = a_{j-1}^2, \qquad (3.91)$$
i.e. the squared variational coefficient is now bounded from below by $a_{j-1}^2$. This implies that we replace the chain-ladder condition on the variance by
$$\mathrm{Var}(C_{i,j} \mid C_{i,j-1}) = \sigma_{j-1}^2\, C_{i,j-1} + a_{j-1}^2\, f_{j-1}^2\, C_{i,j-1}^2. \qquad (3.92)$$
This means that we add a quadratic term to ensure that the variational coefficient does not disappear when the volume goes to infinity.
As above, we define a chain-ladder consistent time series model. This time series model gives an algorithm for simulating additional observations, which will be used for the calculation of the estimation error.
Model Assumptions 3.16 (Enhanced time series model)
• Different accident years $i$ are independent.
• There exist constants $f_j > 0$, $\sigma_j^2 > 0$, $a_j^2 \ge 0$ and random variables $\varepsilon_{i,j+1}$ (with the usual normalization $\mathrm E[\varepsilon_{i,j+1}] = 0$, $\mathrm{Var}(\varepsilon_{i,j+1}) = 1$, cf. Time Series Model 3.9) such that for all $i \in \{0, \ldots, I\}$ and $j \in \{0, \ldots, J-1\}$ we have
$$C_{i,j+1} = f_j\, C_{i,j} + \big(\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}\big)^{1/2} \sqrt{C_{i,j}}\; \varepsilon_{i,j+1}. \qquad (3.93)$$
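The variance structure of (3.93) is easy to check by simulation. A minimal sketch, with purely illustrative parameter values ($f_j = 1.5$, $\sigma_j^2 = 4$, $a_j = 5\%$) and Gaussian innovations:

```python
import random
import statistics

def simulate_next(c_ij, f_j, sigma2_j, a2_j, rng, n=20000):
    """Draw n samples of C_{i,j+1} = f_j*C_{i,j} + (sigma_j^2 + a_j^2 f_j^2 C_{i,j})^{1/2} sqrt(C_{i,j}) eps."""
    sd = ((sigma2_j + a2_j * f_j ** 2 * c_ij) * c_ij) ** 0.5
    return [f_j * c_ij + sd * rng.gauss(0.0, 1.0) for _ in range(n)]

rng = random.Random(1)
f_j, sigma2_j, a2_j = 1.5, 4.0, 0.05 ** 2   # a_j = 5%: floor for the variational coefficient

for volume in (1e2, 1e4, 1e6):
    sample = simulate_next(volume, f_j, sigma2_j, a2_j, rng)
    vco = statistics.pstdev(sample) / statistics.fmean(sample)
    print(f"C_ij = {volume:>9.0f}: empirical Vco = {vco:.4f}")
```

For growing volume the empirical variational coefficient no longer tends to zero but levels off near $a_j = 5\%$, which is exactly the point of (3.90)-(3.92).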
3.3.3 Interpretation

In this subsection we give an interpretation to the variance term (3.92). Alternatively, we could use a model with latent variables $\Lambda_{i,j}$. This is similar to Bayesian approaches such as in Gisler [29], saying that the true chain-ladder factors $f_j$ are themselves random variables (depending on external/latent factors).
(A1) Conditionally, given $\Lambda_{i,j}$, we have
$$\mathrm E[C_{i,j+1} \mid \Lambda_{i,j}, C_{i,j}] = f_j(\Lambda_{i,j})\, C_{i,j}, \qquad (3.94)$$
$$\mathrm{Var}(C_{i,j+1} \mid \Lambda_{i,j}, C_{i,j}) = \sigma_j^2(\Lambda_{i,j})\, C_{i,j}. \qquad (3.95)$$
Identifying the parameters of Model 3.16 as
$$f_j = \mathrm E[f_j(\Lambda_{i,j})], \qquad \sigma_j^2 = \mathrm E[\sigma_j^2(\Lambda_{i,j})], \qquad a_j^2\, f_j^2 = \mathrm{Var}(f_j(\Lambda_{i,j})), \qquad (3.96)\text{-}(3.98)$$
the tower property yields
$$\mathrm E[C_{i,j+1} \mid C_{i,j}] = f_j\, C_{i,j} \qquad (3.99)$$
and
$$\mathrm{Var}(C_{i,j+1} \mid C_{i,j}) = \mathrm E[\sigma_j^2(\Lambda_{i,j})]\, C_{i,j} + \mathrm{Var}(f_j(\Lambda_{i,j}))\, C_{i,j}^2 = \sigma_j^2\, C_{i,j} + a_j^2\, f_j^2\, C_{i,j}^2, \qquad (3.100)\text{-}(3.101)$$
which is exactly the enhanced variance condition (3.92).
a) Conditional process error / conditional process variance. The conditional process error corresponds to the term
$$\sigma_j^2\, C_{i,j} \qquad (3.102)$$
and reflects the fact that the $C_{i,j+1}$ are random variables which have to be predicted. For increasing volume $C_{i,j}$ the variational coefficient of this term disappears.
b) Conditional prediction error. The conditional prediction error corresponds to the term
$$a_j^2\, f_j^2\, C_{i,j}^2 \qquad (3.103)$$
and reflects the fact that we have to predict the future development factors $f_j(\Lambda_{i,j})$. These future development factors are themselves subject to uncertainty, and hence may be modelled stochastically (Bayesian point of view). The Mack formula and Estimator 3.14 for the conditional mean square error of prediction do not consider this kind of risk.
c) Conditional estimation error. There is a third kind of risk, namely the risk which comes from the fact that we have to estimate the true parameters $f_j$ in (3.96) from the data. This error term will be called conditional estimation error. It is also considered in the Mack model and in Estimator 3.14. For the derivation of an estimate for this term we will use Approach 3, page 50. This derivation will use the time series definition of the chain-ladder method.
3.3.4 Chain-ladder estimator in the enhanced model

Under Model Assumptions 3.16 the individual development factors satisfy
$$F_{i,j+1} = \frac{C_{i,j+1}}{C_{i,j}} = f_j + \big(\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}\big)^{1/2}\, C_{i,j}^{-1/2}\, \varepsilon_{i,j+1}, \qquad (3.104)$$
with
$$\mathrm E[F_{i,j+1} \mid C_{i,j}] = f_j \qquad \text{and} \qquad \mathrm{Var}(F_{i,j+1} \mid C_{i,j}) = \sigma_j^2\, C_{i,j}^{-1} + a_j^2\, f_j^2. \qquad (3.105)$$
Moreover, the conditionally expected ultimate claim is given by
$$\mathrm E[C_{i,J} \mid \mathcal D_I] = C_{i,I-i} \prod_{j=I-i}^{J-1} f_j. \qquad (3.106)$$
3.3.5 Conditional process and prediction errors

We now derive the recursive formula for the conditional process and prediction error. Under Model Assumptions 3.16 we have for the ultimate claim $C_{i,J}$ of accident year $i > I-J$ that
$$\mathrm{Var}(C_{i,J} \mid \mathcal D_I) = \mathrm{Var}(C_{i,J} \mid C_{i,I-i}) = \mathrm E\big[\mathrm{Var}(C_{i,J} \mid C_{i,J-1}) \,\big|\, C_{i,I-i}\big] + \mathrm{Var}\big(\mathrm E[C_{i,J} \mid C_{i,J-1}] \,\big|\, C_{i,I-i}\big). \qquad (3.107)$$
For the first term on the right-hand side, (3.92) gives
$$\mathrm E\big[\mathrm{Var}(C_{i,J} \mid C_{i,J-1}) \,\big|\, C_{i,I-i}\big] = C_{i,I-i}^2 \left(\frac{\sigma_{J-1}^2}{C_{i,I-i}} \prod_{j=I-i}^{J-2} f_j + a_{J-1}^2 \prod_{j=I-i}^{J-1} f_j^2\right) + a_{J-1}^2\, f_{J-1}^2\, \mathrm{Var}(C_{i,J-1} \mid \mathcal D_I). \qquad (3.108)$$
For the second term on the right-hand side of (3.107) we obtain under Model Assumptions 3.16
$$\mathrm{Var}\big(\mathrm E[C_{i,J} \mid C_{i,J-1}] \,\big|\, C_{i,I-i}\big) = \mathrm{Var}(f_{J-1}\, C_{i,J-1} \mid C_{i,I-i}) = f_{J-1}^2\, \mathrm{Var}(C_{i,J-1} \mid \mathcal D_I). \qquad (3.109)$$
Hence we obtain the recursion
$$\mathrm{Var}(C_{i,J} \mid \mathcal D_I) = C_{i,I-i}^2 \left(\frac{\sigma_{J-1}^2}{C_{i,I-i}} \prod_{j=I-i}^{J-2} f_j + a_{J-1}^2 \prod_{j=I-i}^{J-1} f_j^2\right) + \big(1 + a_{J-1}^2\big)\, f_{J-1}^2\, \mathrm{Var}(C_{i,J-1} \mid \mathcal D_I). \qquad (3.110)$$
Iterating this recursion (Lemma 3.19) yields
$$\mathrm{Var}(C_{i,J} \mid \mathcal D_I) = \mathrm E[C_{i,J} \mid \mathcal D_I]^2 \sum_{m=I-i}^{J-1} \left(\frac{\sigma_m^2/f_m^2}{\mathrm E[C_{i,m} \mid \mathcal D_I]} + a_m^2\right) \prod_{n=m+1}^{J-1} \big(1 + a_n^2\big). \qquad (3.111)$$
Lemma 3.19 implies that the conditional variational coefficient of the ultimate $C_{i,J}$ is given by
$$\mathrm{Vco}(C_{i,J} \mid \mathcal D_I) = \left[\sum_{m=I-i}^{J-1} \left(\frac{\sigma_m^2/f_m^2}{\mathrm E[C_{i,m} \mid \mathcal D_I]} + a_m^2\right) \prod_{n=m+1}^{J-1} \big(1 + a_n^2\big)\right]^{1/2}. \qquad (3.112)$$
Henceforth we see that the conditional prediction error of $C_{i,J}$ corresponds to (the conditional process error disappears for infinitely large volume $C_{i,I-i}$)
$$\lim_{C_{i,I-i}\to\infty} \mathrm{Vco}(C_{i,J} \mid \mathcal D_I) = \left[\sum_{m=I-i}^{J-1} a_m^2 \prod_{n=m+1}^{J-1} \big(1 + a_n^2\big)\right]^{1/2}, \qquad (3.113)$$
and the conditional variational coefficient for the conditional process error of $C_{i,J}$ is given by
$$\left[\sum_{m=I-i}^{J-1} \frac{\sigma_m^2/f_m^2}{\mathrm E[C_{i,m} \mid \mathcal D_I]} \prod_{n=m+1}^{J-1} \big(1 + a_n^2\big)\right]^{1/2}. \qquad (3.114)$$
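The decomposition (3.112)-(3.114) is straightforward to evaluate numerically. The following sketch (all parameter values are purely illustrative) computes the total variational coefficient together with its process and prediction parts:

```python
from math import prod, sqrt

def vco_decomposition(c_obs, j0, f, sigma2, a2):
    """Conditional variational coefficient of the ultimate C_{i,J} (3.112),
    split into the process part (3.114) and the prediction part (3.113).
    c_obs = C_{i,I-i}, j0 = I-i; f, sigma2, a2 are lists indexed j = 0, ..., J-1."""
    J = len(f)
    # expected claims E[C_{i,m} | D_I] = C_{i,I-i} * f_{j0} * ... * f_{m-1}, cf. (3.106)
    e = {m: c_obs * prod(f[j0:m]) for m in range(j0, J)}
    total = process = prediction = 0.0
    for m in range(j0, J):
        tail = prod(1.0 + a2[n] for n in range(m + 1, J))
        process += sigma2[m] / f[m] ** 2 / e[m] * tail
        prediction += a2[m] * tail
        total += (sigma2[m] / f[m] ** 2 / e[m] + a2[m]) * tail
    return sqrt(total), sqrt(process), sqrt(prediction)

f = [1.5, 1.1, 1.02]                    # illustrative chain-ladder factors
sigma2 = [4.0, 1.0, 0.25]               # illustrative sigma_j^2
a2 = [0.05 ** 2, 0.03 ** 2, 0.01 ** 2]  # illustrative a_j^2
tot, proc, pred = vco_decomposition(10_000.0, 0, f, sigma2, a2)
print(f"Vco total {tot:.4f}, process {proc:.4f}, prediction {pred:.4f}")
```

The squared parts add up to the squared total, and for $C_{i,I-i} \to \infty$ only the prediction part (3.113) survives.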
3.3.6 Conditional estimation error

The conditional estimation error comes from the fact that we have to estimate the $f_j$ from the data.

Estimation Approach 1

From Lemma 3.4 we obtain the following lemma:

Lemma 3.20 Under Model Assumptions 3.16, the estimator
$$\widehat F_j = \frac{\displaystyle\sum_{i=0}^{I-(j+1)} \frac{C_{i,j}}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}\, F_{i,j+1}}{\displaystyle\sum_{i=0}^{I-(j+1)} \frac{C_{i,j}}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}} = \frac{\displaystyle\sum_{i=0}^{I-(j+1)} \frac{C_{i,j+1}}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}}{\displaystyle\sum_{i=0}^{I-(j+1)} \frac{C_{i,j}}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}} \qquad (3.115)$$
is the $\mathcal B_{j+1}$-measurable unbiased estimator for $f_j$ which has minimal conditional variance among all (unbiased) linear combinations of the unbiased estimators $(F_{i,j+1})_{0\le i\le I-(j+1)}$ for $f_j$, conditioned on $\mathcal B_j$, i.e.
$$\mathrm{Var}\big(\widehat F_j \,\big|\, \mathcal B_j\big) = \min_{\alpha_i \in \mathbb R}\, \mathrm{Var}\left(\sum_{i=0}^{I-(j+1)} \alpha_i\, F_{i,j+1} \,\middle|\, \mathcal B_j\right). \qquad (3.116)$$
The conditional variance is given by
$$\mathrm{Var}\big(\widehat F_j \,\big|\, \mathcal B_j\big) = \left(\sum_{i=0}^{I-(j+1)} \frac{1}{\sigma_j^2\, C_{i,j}^{-1} + f_j^2\, a_j^2}\right)^{-1}. \qquad (3.117)$$
Proof. From (3.105) we see that $F_{i,j+1}$ is an unbiased estimator for $f_j$, conditioned on $\mathcal B_j$, with
$$\mathrm E[F_{i,j+1} \mid \mathcal B_j] = \mathrm E[F_{i,j+1} \mid C_{i,j}] = f_j, \qquad (3.118)$$
$$\mathrm{Var}(F_{i,j+1} \mid \mathcal B_j) = \mathrm{Var}(F_{i,j+1} \mid C_{i,j}) = \sigma_j^2\, C_{i,j}^{-1} + a_j^2\, f_j^2. \qquad (3.119)$$
… □

This motivates the estimator
$$\widehat C^{(CL,2)}_{i,j} = \widehat{\mathrm E}[C_{i,j} \mid \mathcal D_I] = C_{i,I-i} \prod_{l=I-i}^{j-1} \widehat F_l \qquad (3.120)$$
for $i + j > I$.
We obtain the following lemma for the estimators in the enhanced time series model:
Lemma 3.22 Under Model Assumptions 3.16 we have:
a) $\widehat F_j$ is, given $\mathcal B_j$, an unbiased estimator for $f_j$, i.e. $\mathrm E\big[\widehat F_j \,\big|\, \mathcal B_j\big] = f_j$,
…
e) $\widehat C^{(CL,2)}_{i,J}$ is (unconditionally) unbiased for $\mathrm E[C_{i,J}]$, i.e. $\mathrm E\big[\widehat C^{(CL,2)}_{i,J}\big] = \mathrm E[C_{i,J}]$.
Proof. See proof of Lemma 2.5. □
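The credibility-weighted estimator (3.115) and its conditional variance (3.117) can be sketched as follows; the column data and parameters are illustrative, and for $a_j = 0$ the estimator collapses to the classical volume-weighted chain-ladder factor:

```python
def f_hat(col_j, col_j1, f_j, sigma2_j, a2_j):
    """Minimal-variance weighted estimator (3.115) for the development factor f_j.
    col_j holds C_{0,j}, ..., C_{I-(j+1),j}; col_j1 the same rows at development j+1.
    The weights w_i = C_{i,j} / (sigma_j^2 + a_j^2 f_j^2 C_{i,j}) are the inverse
    conditional variances of the F_{i,j+1}, cf. (3.119)."""
    w = [c / (sigma2_j + a2_j * f_j ** 2 * c) for c in col_j]
    est = sum(wi * cn / c for wi, c, cn in zip(w, col_j, col_j1)) / sum(w)
    return est, 1.0 / sum(w)    # estimator and its conditional variance (3.117)

col_j, col_j1 = [100.0, 200.0, 300.0], [150.0, 290.0, 460.0]
est0, var0 = f_hat(col_j, col_j1, 1.5, 4.0, 0.0)       # a_j = 0: classical CL factor
est1, var1 = f_hat(col_j, col_j1, 1.5, 4.0, 0.05 ** 2)
print(est0, est1, var1)
```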
Single accident years
In the sequel of this subsection we assume that the parameters in (3.115) are known in order to calculate $\widehat F_j$.
Our goal is to estimate the conditional mean square error of prediction (conditional MSEP) as in the classical chain-ladder model:
$$\mathrm{msep}_{C_{i,J}|\mathcal D_I}\big(\widehat C^{(CL,2)}_{i,J}\big) = \mathrm E\Big[\big(C_{i,J} - \widehat C^{(CL,2)}_{i,J}\big)^2 \,\Big|\, \mathcal D_I\Big] = \mathrm{Var}(C_{i,J} \mid \mathcal D_I) + \Big(\mathrm E[C_{i,J} \mid \mathcal D_I] - \widehat C^{(CL,2)}_{i,J}\Big)^2. \qquad (3.121)$$
The first term is exactly the conditional process variance and the conditional prediction error obtained in Lemma 3.19; the second term is the conditional estimation error. It is given by
$$\Big(\mathrm E[C_{i,J} \mid \mathcal D_I] - \widehat C^{(CL,2)}_{i,J}\Big)^2 = C_{i,I-i}^2 \left(\prod_{j=I-i}^{J-1} f_j - \prod_{j=I-i}^{J-1} \widehat F_j\right)^2. \qquad (3.122)$$
Observe that
$$\widehat F_j = f_j + \left(\sum_{i=0}^{I-(j+1)} \frac{C_{i,j}}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}\right)^{-1} \sum_{i=0}^{I-(j+1)} \left(\frac{C_{i,j}}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}\right)^{1/2} \varepsilon_{i,j+1}. \qquad (3.123)$$
Hence $\widehat F_j$ consists of a constant $f_j$ and a stochastic error term (see also Lemma 3.20). In order to determine the conditional estimation error we now proceed as
in Section 3.2.3 for the Time Series Model 3.9. This means that we use Approach 3 (conditional resampling in $\mathcal D^O_{I,i}$, page 50) to estimate the fluctuations of the estimators $\widehat F_0, \ldots, \widehat F_{J-1}$ around the chain-ladder factors $f_0, \ldots, f_{J-1}$, i.e. to get an estimate for (3.122).
We therefore (conditionally) resample the observations $\widehat F_0, \ldots, \widehat F_{J-1}$, given $\mathcal D_I$, and use the resampled estimates to calculate an estimate for the conditional estimation error. For these resampled observations we again use the notation $P^*_{\mathcal D_I}$ for the conditional measure (for a more detailed discussion we refer to Section 3.2.3). Moreover, under $P^*_{\mathcal D_I}$, the random variables $\widehat F_j$ are independent with
$$\mathrm E^*_{\mathcal D_I}\big[\widehat F_j\big] = f_j \qquad \text{and} \qquad \mathrm E^*_{\mathcal D_I}\big[\widehat F_j^2\big] = f_j^2 + \left(\sum_{i=0}^{I-(j+1)} \frac{1}{\sigma_j^2\, C_{i,j}^{-1} + f_j^2\, a_j^2}\right)^{-1} \qquad (3.124)$$
(cf. Section 3.2.3, Approach 3). This means that the conditional estimation error (3.122) is estimated by
$$\mathrm E^*_{\mathcal D_I}\left[C_{i,I-i}^2 \left(\prod_{j=I-i}^{J-1} f_j - \prod_{j=I-i}^{J-1} \widehat F_j\right)^2\right] = C_{i,I-i}^2\, \mathrm{Var}_{P^*_{\mathcal D_I}}\left(\prod_{j=I-i}^{J-1} \widehat F_j\right) = C_{i,I-i}^2 \left(\prod_{j=I-i}^{J-1} \mathrm E^*_{\mathcal D_I}\big[\widehat F_j^2\big] - \prod_{j=I-i}^{J-1} f_j^2\right)$$
$$= C_{i,I-i}^2 \prod_{j=I-i}^{J-1} f_j^2 \left(\prod_{j=I-i}^{J-1} \left[\left(\sum_{k=0}^{I-(j+1)} \frac{C_{k,j}}{\sigma_j^2/f_j^2 + a_j^2\, C_{k,j}}\right)^{-1} + 1\right] - 1\right). \qquad (3.125)$$
A linear approximation to (3.125) is given by
$$C_{i,I-i}^2 \prod_{j=I-i}^{J-1} f_j^2\; \sum_{j=I-i}^{J-1} \left(\sum_{k=0}^{I-(j+1)} \frac{C_{k,j}}{\sigma_j^2/f_j^2 + a_j^2\, C_{k,j}}\right)^{-1}. \qquad (3.126)$$
For $a_j = 0$ this is exactly the conditional estimation error in the Mack Model 3.2. For an increasing number of observations (accident years $i$) this error term goes to zero.
If we use the linear approximation (3.126) and if we replace the parameters in (3.111) and (3.126) by their estimators (cf. Section 3.3.7), we obtain the following estimator for the conditional mean square error of prediction (for the time being we assume that $\sigma_j^2$ and $a_j^2$ are known).
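A minimal sketch of the linear approximation (3.126) for the conditional estimation error (parameter values and column data are purely illustrative):

```python
from math import prod

def estimation_error_sq(c_obs, j0, f, sigma2, a2, columns):
    """Linear approximation (3.126) of the squared conditional estimation error.
    c_obs = C_{i,I-i}, j0 = I-i; columns[j] holds C_{0,j}, ..., C_{I-(j+1),j}."""
    lead = (c_obs * prod(f[j0:])) ** 2        # C_{i,I-i}^2 * prod f_j^2
    total = 0.0
    for j in range(j0, len(f)):
        denom = sum(c / (sigma2[j] / f[j] ** 2 + a2[j] * c) for c in columns[j])
        total += 1.0 / denom
    return lead * total

f, sigma2, a2 = [1.5, 1.1], [4.0, 1.0], [0.05 ** 2, 0.03 ** 2]
columns = [[100.0, 200.0, 300.0], [150.0, 290.0]]
print(estimation_error_sq(100.0, 0, f, sigma2, a2, columns))
```

For $a_j = 0$ each summand reduces to $\sigma_j^2/(f_j^2 \sum_k C_{k,j})$, the Mack estimation-error term.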
Estimator 3.23 (Conditional MSEP in the enhanced chain-ladder model, single accident year)
$$\widehat{\mathrm{msep}}_{C_{i,J}|\mathcal D_I}\big(\widehat C^{(CL,2)}_{i,J}\big) = \big(\widehat C^{(CL,2)}_{i,J}\big)^2 \sum_{j=I-i}^{J-1} \left[\left(\frac{\widehat\sigma_j^2/\widehat F_j^2}{\widehat C^{(CL,2)}_{i,j}} + \widehat a_j^2\right) \prod_{n=j+1}^{J-1} \big(1 + \widehat a_n^2\big) + \left(\sum_{k=0}^{I-(j+1)} \frac{C_{k,j}}{\widehat\sigma_j^2/\widehat F_j^2 + \widehat a_j^2\, C_{k,j}}\right)^{-1}\right]. \qquad (3.127)$$
Aggregation over accident years. For two accident years $k < i$ we consider
$$\mathrm E\Big[\big(\widehat C^{(CL,2)}_{k,J} + \widehat C^{(CL,2)}_{i,J} - (C_{k,J} + C_{i,J})\big)^2 \,\Big|\, \mathcal D_I\Big] = \mathrm{Var}(C_{k,J} + C_{i,J} \mid \mathcal D_I) + \Big(\widehat C^{(CL,2)}_{k,J} + \widehat C^{(CL,2)}_{i,J} - \mathrm E[C_{k,J} + C_{i,J} \mid \mathcal D_I]\Big)^2. \qquad (3.128)$$
Using the independence of the different accident years, we obtain for the first term
$$\mathrm{Var}(C_{k,J} + C_{i,J} \mid \mathcal D_I) = \mathrm{Var}(C_{k,J} \mid \mathcal D_I) + \mathrm{Var}(C_{i,J} \mid \mathcal D_I). \qquad (3.129)$$
This term is exactly the conditional process and prediction error from Lemma 3.19. For the second term in (3.128) we obtain
$$\Big(\widehat C^{(CL,2)}_{k,J} + \widehat C^{(CL,2)}_{i,J} - \mathrm E[C_{k,J} + C_{i,J} \mid \mathcal D_I]\Big)^2 = \Big(\widehat C^{(CL,2)}_{k,J} - \mathrm E[C_{k,J} \mid \mathcal D_I]\Big)^2 + \Big(\widehat C^{(CL,2)}_{i,J} - \mathrm E[C_{i,J} \mid \mathcal D_I]\Big)^2$$
$$+\, 2\, \Big(\widehat C^{(CL,2)}_{k,J} - \mathrm E[C_{k,J} \mid \mathcal D_I]\Big)\Big(\widehat C^{(CL,2)}_{i,J} - \mathrm E[C_{i,J} \mid \mathcal D_I]\Big). \qquad (3.130)$$
Hence we have the following decomposition for the conditional mean square error of prediction of the sum of two accident years:
$$\mathrm E\Big[\big(\widehat C^{(CL,2)}_{k,J} + \widehat C^{(CL,2)}_{i,J} - (C_{k,J} + C_{i,J})\big)^2 \,\Big|\, \mathcal D_I\Big] = \mathrm E\Big[\big(\widehat C^{(CL,2)}_{k,J} - C_{k,J}\big)^2 \,\Big|\, \mathcal D_I\Big] + \mathrm E\Big[\big(\widehat C^{(CL,2)}_{i,J} - C_{i,J}\big)^2 \,\Big|\, \mathcal D_I\Big]$$
$$+\, 2\, \Big(\widehat C^{(CL,2)}_{k,J} - \mathrm E[C_{k,J} \mid \mathcal D_I]\Big)\Big(\widehat C^{(CL,2)}_{i,J} - \mathrm E[C_{i,J} \mid \mathcal D_I]\Big). \qquad (3.131)$$
In addition to the conditional MSEP of single accident years (see Estimator 3.23), we need to average the covariance terms over the possible values of $\widehat F_j$, similar to (3.122):
$$\Big(\widehat C^{(CL,2)}_{k,J} - \mathrm E[C_{k,J} \mid \mathcal D_I]\Big)\Big(\widehat C^{(CL,2)}_{i,J} - \mathrm E[C_{i,J} \mid \mathcal D_I]\Big) = C_{k,I-k} \left(\prod_{l=I-k}^{J-1} \widehat F_l - \prod_{l=I-k}^{J-1} f_l\right) C_{i,I-i} \left(\prod_{l=I-i}^{J-1} \widehat F_l - \prod_{l=I-i}^{J-1} f_l\right). \qquad (3.132)$$
For $k < i$ the resampled estimators $\widehat F_j$ are independent under $P^*_{\mathcal D_I}$, hence
$$\mathrm E^*_{\mathcal D_I}\left[C_{k,I-k}\, C_{i,I-i} \left(\prod_{l=I-k}^{J-1} \widehat F_l - \prod_{l=I-k}^{J-1} f_l\right)\left(\prod_{l=I-i}^{J-1} \widehat F_l - \prod_{l=I-i}^{J-1} f_l\right)\right] = C_{k,I-k}\, C_{i,I-i} \prod_{j=I-i}^{I-k-1} f_j \left(\prod_{j=I-k}^{J-1} \mathrm E^*_{\mathcal D_I}\big[\widehat F_j^2\big] - \prod_{j=I-k}^{J-1} f_j^2\right)$$
$$= C_{k,I-k}\, C_{i,I-i} \prod_{j=I-i}^{I-k-1} f_j \prod_{j=I-k}^{J-1} f_j^2 \left(\prod_{j=I-k}^{J-1} \left[\left(\sum_{m=0}^{I-(j+1)} \frac{C_{m,j}}{\sigma_j^2/f_j^2 + a_j^2\, C_{m,j}}\right)^{-1} + 1\right] - 1\right). \qquad (3.133)$$
Together with the linear approximation used for (3.126) this leads to the following estimator for aggregated accident years:
Estimator 3.24 (Conditional MSEP in the enhanced chain-ladder model, aggregated accident years)
$$\widehat{\mathrm{msep}}_{\sum_i C_{i,J}|\mathcal D_I}\left(\sum_{i=I-J+1}^{I} \widehat C^{(CL,2)}_{i,J}\right) = \sum_{i=I-J+1}^{I} \widehat{\mathrm{msep}}_{C_{i,J}|\mathcal D_I}\big(\widehat C^{(CL,2)}_{i,J}\big)$$
$$+\, 2 \sum_{I-J+1 \le k < i \le I} \widehat C^{(CL,2)}_{k,J}\, \widehat C^{(CL,2)}_{i,J} \sum_{j=I-k}^{J-1} \left(\sum_{m=0}^{I-(j+1)} \frac{C_{m,j}}{\widehat\sigma_j^2/\widehat F_j^2 + \widehat a_j^2\, C_{m,j}}\right)^{-1}. \qquad (3.134)$$
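The aggregation (3.134) over accident years can be sketched as follows; all inputs (single-year MSEP estimates, ultimates, diagonal indices and triangle columns) are hypothetical placeholders:

```python
def msep_total(msep_single, chat, jks, f, sigma2, a2, columns):
    """Aggregate MSEP (3.134): sum of the single-year MSEPs plus twice the
    estimation-error covariance terms over all pairs k < i of accident years.
    jks[i] = I - i, the column index of the observed diagonal of year i."""
    years = sorted(chat)                 # accident years, k < i in this order
    total = sum(msep_single[i] for i in years)
    for x, k in enumerate(years):
        for i in years[x + 1:]:
            s = 0.0
            for j in range(jks[k], len(f)):   # development periods shared by both estimators
                denom = sum(c / (sigma2[j] / f[j] ** 2 + a2[j] * c) for c in columns[j])
                s += 1.0 / denom
            total += 2.0 * chat[k] * chat[i] * s
    return total

# toy example with two open accident years (all numbers illustrative)
val = msep_total({1: 10.0, 2: 20.0}, {1: 1000.0, 2: 2000.0}, {1: 1, 2: 0},
                 [1.5, 1.1], [4.0, 1.0], [0.0, 0.0],
                 [[100.0, 200.0, 300.0], [150.0, 290.0]])
print(val)
```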
Estimation Approach 2
In the derivation of the estimator $\widehat F_j$, see (3.115), we have seen that we face the problem that the parameters already need to be known in order to estimate them.
A natural starting point is the classical chain-ladder estimator
$$\widehat F_j^{(0)} = \frac{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j+1}}{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}} = \frac{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}\, F_{i,j+1}}{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}}. \qquad (3.135)$$
$\widehat F_j^{(0)} = \widehat f_j$ is the classical chain-ladder estimator in the Mack Model 3.2. It is optimal under the Mack variance condition, but it is not optimal under our variance condition (3.90). Observe that
$$\mathrm{Var}\big(\widehat F_j^{(0)} \,\big|\, \mathcal B_j\big) = \left(\sum_{i=0}^{I-(j+1)} C_{i,j}\right)^{-2} \sum_{i=0}^{I-(j+1)} \big(\sigma_j^2\, C_{i,j} + a_j^2\, f_j^2\, C_{i,j}^2\big) = \frac{\sigma_j^2}{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}} + a_j^2\, f_j^2\, \frac{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}^2}{\left(\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}\right)^2}. \qquad (3.136)$$
Moreover,
$$C_{i,I-i} \prod_{j=I-i}^{J-1} \widehat F_j^{(0)} \qquad (3.137)$$
defines a conditionally, given $C_{i,I-i}$, unbiased estimator for $\mathrm E[C_{i,J} \mid \mathcal D_I]$. The process variance and the prediction error are given by Lemma 3.19.
For the estimation error of a single accident year in Approach 3 we obtain the estimate
$$C_{i,I-i}^2 \left(\prod_{j=I-i}^{J-1} \left[\frac{\sigma_j^2}{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}} + a_j^2\, f_j^2\, \frac{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}^2}{\left(\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}\right)^2} + f_j^2\right] - \prod_{j=I-i}^{J-1} f_j^2\right). \qquad (3.138)$$
3.3.7 Parameter estimation

Estimation of $a_j$. The sequence $a_j$ usually cannot be estimated from the data, unless we have a very large portfolio ($C_{i,j} \to \infty$) such that the conditional process error disappears. Hence $a_j$ can only be obtained if we have data from the whole insurance market. Considerations of this kind have been made for the determination of the parameters for prediction errors in the Swiss Solvency Test (see e.g. Tables 6.4.4 and 6.4.7 in [73]). Unfortunately, the tables only give an overall estimate for the conditional prediction error, not a sequence $a_j$ (e.g. the variational coefficient of the overall error (similar to (3.101)) for motor third party liability claims reserves is 3.5%).
We reconstruct $a_j$ with the help of (3.113). Define for $j = 0, \ldots, J-1$
$$V_j^2 = \sum_{m=j-1}^{J-1} a_m^2 \prod_{n=m+1}^{J-1} \big(1 + a_n^2\big). \qquad (3.139)$$
Hence $a_{j-1}^2$ can be determined recursively from $V_j^2 - V_{j+1}^2$:
$$a_{j-1}^2 = \big(V_j^2 - V_{j+1}^2\big) \prod_{n=j}^{J-1} \big(1 + a_n^2\big)^{-1}. \qquad (3.140)$$
Observe that
$$V_j^2 = \lim_{C_{i,j-1}\to\infty} \mathrm{Vco}^2\big(C_{i,J} \,\big|\, C_{i,j-1}\big) \qquad (3.141)$$
(cf. (3.113)). Hence we need to estimate the conditional prediction error of $C_{i,J}$, given the observation $C_{i,j-1}$. Since we do not really have a good idea/guess about the conditional variational coefficient in (3.141), we express it in terms of the variational coefficient of the claims reserves $C_{i,J} - C_{i,j-1}$:
$$\mathrm{Vco}\big(C_{i,J} \,\big|\, C_{i,j-1}\big) = \mathrm{Vco}\big(C_{i,J} - C_{i,j-1} \,\big|\, C_{i,j-1}\big)\; \frac{\mathrm E[C_{i,J} - C_{i,j-1} \mid C_{i,j-1}]}{\mathrm E[C_{i,J} \mid C_{i,j-1}]}. \qquad (3.142)$$
In our examples we assume that the conditional variational coefficient for the conditional prediction error of the reserves $C_{i,J} - C_{i,j-1}$ is constant, equal to $r$, and we set
$$\widehat V_j = r\; \frac{\displaystyle\prod_{l=j-1}^{J-1} \widehat F_l^{(0)} - 1}{\displaystyle\prod_{l=j-1}^{J-1} \widehat F_l^{(0)}}. \qquad (3.143)$$
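The recursion (3.140) simply inverts the definition (3.139). A minimal round-trip sketch (the numerical values of the $a_j^2$ are illustrative):

```python
def V2_from_a2(a2):
    """Forward direction (3.139): V_j^2 = sum_{m=j-1}^{J-1} a_m^2 prod_{n=m+1}^{J-1} (1 + a_n^2).
    Returns [V_1^2, ..., V_J^2]."""
    J = len(a2)
    out = []
    for j in range(1, J + 1):
        s = 0.0
        for m in range(j - 1, J):
            p = 1.0
            for n in range(m + 1, J):
                p *= 1.0 + a2[n]
            s += a2[m] * p
        out.append(s)
    return out

def a2_from_V2(V2):
    """Recursion (3.140): a_{j-1}^2 = (V_j^2 - V_{j+1}^2) / prod_{n=j}^{J-1} (1 + a_n^2),
    started from a_{J-1}^2 = V_J^2."""
    J = len(V2)
    a2 = [0.0] * J
    a2[J - 1] = V2[J - 1]
    for j in range(J - 1, 0, -1):
        tail = 1.0
        for n in range(j, J):
            tail *= 1.0 + a2[n]
        a2[j - 1] = (V2[j - 1] - V2[j]) / tail
    return a2

a2 = [0.05 ** 2, 0.03 ** 2, 0.01 ** 2]
print(a2_from_V2(V2_from_a2(a2)))   # round trip recovers a2
```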
Estimation of $\sigma_j^2$. Under the enhanced variance condition the classical estimator for $\sigma_j^2$ is biased; one finds
$$\mathrm E\!\left[\frac{1}{I-(j+1)} \sum_{i=0}^{I-(j+1)} C_{i,j} \Big(F_{i,j+1} - \widehat F_j^{(0)}\Big)^2 \,\middle|\, \mathcal B_j\right] = \sigma_j^2 + \frac{a_j^2\, f_j^2}{I-(j+1)} \left(\sum_{i=0}^{I-(j+1)} C_{i,j} - \frac{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}^2}{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}}\right). \qquad (3.144)$$
This suggests the bias-corrected iteration ($k \ge 1$)
$$\big(\widehat\sigma_j^{(k)}\big)^2 = \frac{1}{I-(j+1)} \sum_{i=0}^{I-(j+1)} C_{i,j} \Big(F_{i,j+1} - \widehat F_j^{(0)}\Big)^2 - \frac{\widehat a_j^2\, \big(\widehat F_j^{(k-1)}\big)^2}{I-(j+1)} \left(\sum_{i=0}^{I-(j+1)} C_{i,j} - \frac{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}^2}{\displaystyle\sum_{i=0}^{I-(j+1)} C_{i,j}}\right). \qquad (3.145)$$
Estimation of $\widehat F_j$. The estimators $\widehat F_j^{(k)}$ are then iteratively determined via (3.115). For $k \ge 1$:
$$\widehat F_j^{(k)} = \frac{\displaystyle\sum_{i=0}^{I-(j+1)} \frac{C_{i,j+1}}{\big(\widehat\sigma_j^{(k)}\big)^2 + \widehat a_j^2\, \big(\widehat F_j^{(k-1)}\big)^2\, C_{i,j}}}{\displaystyle\sum_{i=0}^{I-(j+1)} \frac{C_{i,j}}{\big(\widehat\sigma_j^{(k)}\big)^2 + \widehat a_j^2\, \big(\widehat F_j^{(k-1)}\big)^2\, C_{i,j}}}. \qquad (3.146)$$
Remarks 3.26
• In all examples we have looked at we have observed very fast convergence of $\big(\widehat\sigma_j^{(k)}\big)^2$ and $\widehat F_j^{(k)}$, in the sense that we have not observed any changes in the resulting reserves after a few iterations.
• An alternative would be to determine $\widehat\sigma_j^2$ from the unbiasedness requirement
$$1 = \frac{1}{I-(j+1)} \sum_{i=0}^{I-(j+1)} \mathrm E\!\left[\frac{C_{i,j}\, \big(F_{i,j+1} - \widehat F_j\big)^2}{\sigma_j^2 + a_j^2\, f_j^2\, C_{i,j}}\right]. \qquad (3.147)$$
The difficulty with (3.147) is that it again leads to an implicit expression for $\widehat\sigma_j^2$.
• The formula for the MSEP, Estimator 3.24, was derived under the assumption that the underlying model parameters $f_j$, $\sigma_j$ and $a_j$ are known. If we replace these parameters by their estimates (as described via the iteration in this section), we obtain additional sources of estimation error! However, since the calculations get too tedious (or even impossible), we omit further derivations of the MSEP and take Estimator 3.24 as a first approximation.
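The iteration (3.145)-(3.146) for a single development period can be sketched as follows; the data are illustrative, and $a_j$ is taken as externally given, as discussed above:

```python
def iterate_parameters(col_j, col_j1, a2_j, n_iter=5):
    """Fixed-point iteration (3.145)-(3.146) for one development period j.
    Starts from the classical chain-ladder factor F_hat^(0) of (3.135) and
    returns the iterated (F_hat^(k), squared sigma_hat^(k))."""
    N = len(col_j)                                   # number of observed factors
    F0 = sum(col_j1) / sum(col_j)                    # F_hat^(0), cf. (3.135)
    raw = sum(c * (cn / c - F0) ** 2 for c, cn in zip(col_j, col_j1)) / (N - 1)
    bias_w = (sum(col_j) - sum(c * c for c in col_j) / sum(col_j)) / (N - 1)
    F, sigma2 = F0, raw
    for _ in range(n_iter):
        sigma2 = max(raw - a2_j * F ** 2 * bias_w, 0.0)          # (3.145)
        w = [c / (sigma2 + a2_j * F ** 2 * c) for c in col_j]
        F = sum(wi * cn / c for wi, c, cn in zip(w, col_j, col_j1)) / sum(w)  # (3.146)
    return F, sigma2

col_j, col_j1 = [100.0, 200.0, 300.0], [150.0, 290.0, 460.0]
print(iterate_parameters(col_j, col_j1, 0.01 ** 2))
```

For $a_j = 0$ the iteration stays at the classical chain-ladder factor and the Mack variance estimate.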
We close this section with an example:
Example 3.27 (MSEP in the enhanced chain-ladder model)
We choose two portfolios, Portfolio A and Portfolio B. Both are of a similar type (i.e. they consider the same line of business); moreover, Portfolio B is contained in Portfolio A.
Portfolio A
[Table 3.7: claims development triangle (cumulative claims $C_{i,j}$) for Portfolio A, accident years $i = 0, \ldots, 16$ and development years $j = 0, \ldots, 10$, together with the estimated chain-ladder factors $\widehat f_j$ and standard deviation parameters $\widehat\sigma_j$; flattened column values omitted.]
Chapter 3. Chain-ladder models
i      CL reserves   msep^{1/2}        Var^{1/2}         estimation error^{1/2}
7      20            64    (322.0%)    59    (300.4%)    23    (115.8%)
8      231           543   (235.2%)    510   (220.8%)    187   (80.9%)
9      898           1582  (176.1%)    1468  (163.4%)    589   (65.5%)
10     1044          1573  (150.7%)    1470  (140.9%)    560   (53.7%)
11     1731          1957  (113.1%)    1838  (106.2%)    674   (38.9%)
12     2747          2169  (79.0%)     2055  (74.8%)     693   (25.2%)
13     4487          2563  (57.1%)     2426  (54.1%)     826   (18.4%)
14     6803          3169  (46.6%)     3030  (44.5%)     928   (13.6%)
15     14025         5663  (40.4%)     5443  (38.8%)     1564  (11.2%)
16     90809         10121 (11.1%)     9762  (10.8%)     2669  (2.9%)
Total  122795        13941 (11.4%)     12336 (10.0%)     6495  (5.3%)

(Percentages are relative to the CL reserves; msep$^{1/2}$ denotes $\widehat{\mathrm{msep}}_{C_{i,J}|\mathcal D_I}(\widehat C^{\,CL}_{i,J})^{1/2}$ and Var$^{1/2}$ denotes $\widehat{\mathrm{Var}}(C_{i,J} \mid \mathcal D_I)^{1/2}$.)
Table 3.9: Reserves and conditional MSEP in Mack's Model 3.2 for Portfolio A
We now compare these results to the estimates in Model 3.16: we set $r = 5\%$ and obtain the parameter estimates given below.
Remark. In practice $a_j$ can only be determined with the help of external know-how and market data. Therefore, e.g. for solvency purposes, $a_j$ should be determined a priori by the regulator. It answers the question: "How good can an actuarial estimate at best be?"
[Table: iterated parameter estimates $\widehat F_j^{(0)} = \widehat f_j$, $\widehat F_j^{(k)}$ and $\widehat\sigma_j^{(k)}$ for $k = 1, 2, 3$, together with $\widehat V_j$ and $\widehat a_j$, for Portfolio A; flattened column values omitted.] Already after 3 iterations the parameters have sufficiently converged such that the reserves are stable.
[Table 3.12 (caption below): flattened column values — CL reserves, $\widehat{\mathrm{msep}}^{(CL,2)\,1/2}$, $\widehat{\mathrm{Var}}^{1/2}$, and the process, prediction and estimation error terms per accident year $i = 7, \ldots, 16$ and in total.]
Portfolio B
We now choose a second portfolio, Portfolio B, which includes business similar to our example given in Table 3.7 (Portfolio A). In fact, Portfolio B is a sub-portfolio of Portfolio A, given in Table 3.13, containing exactly the same line of business. Therefore we assume that the conditional prediction errors are the same as in Table 3.12.
Comment. The resulting reserves are almost the same in the Mack Model 3.2 and in Model 3.16. We now obtain both a conditional process error and a conditional prediction error term. The sum of these two terms is about the same size as the conditional process error in Mack's method. This comes from the fact that we use the same data to estimate the parameters. But the error term in the enhanced chain-ladder model is now bounded from below by the conditional prediction error, whereas the conditional process error in the Mack model converges to zero for increasing volume.
Table 3.12: Reserves and conditional MSEP in Model 3.16 for Portfolio A
[Table 3.13: claims development triangle (cumulative claims $C_{i,j}$) for Portfolio B, accident years $i = 0, \ldots, 16$, together with the iterated parameter estimates $\widehat F_j^{(k)}$; flattened column values omitted.]
[Table 3.15 (caption below): flattened column values — CL reserves, $\widehat{\mathrm{msep}}^{(CL,2)\,1/2}$, $\widehat{\mathrm{Var}}^{1/2}$, and the process, prediction and estimation error terms per accident year $i = 7, \ldots, 16$ and in total for Portfolio B.]
Comments.
• A more conservative model would be to assume total dependence between the conditional prediction errors of the accident years; then we would not observe any diversification of the conditional prediction error between the accident years.
• The conditional prediction errors in Portfolio A and Portfolio B differ slightly, since we choose different development factors $\widehat F_j$ and since the relative weights $C_{i,I-i}$ between the accident years $i$ differ in Portfolio A and Portfolio B.
• The error terms of Portfolio A and Portfolio B are now directly comparable: the conditional prediction errors are the same for both portfolios. The conditional process error decreases from Portfolio B to Portfolio A by about a factor of 2, since Portfolio A is about twice the size of Portfolio B. The conditional estimation error decreases from Portfolio B to Portfolio A, since in Portfolio A we have more data to estimate the parameters.
Table 3.15: Reserves and conditional MSEP in Model 3.16 for Portfolio B
Chapter 4
Bayesian models
4.1 Benktander-Hovinen method and Cape-Cod model
In the broadest sense, Bayesian methods for claims reserving can be considered as methods in which one combines expert knowledge or existing a priori information with observations, resulting in an estimate for the ultimate claim. In the simplest case this a priori knowledge/information is given e.g. by a single value, like an a priori estimate for the ultimate claim or for the average loss ratio (see this section and the following section). However, in a strict sense the a priori knowledge/information in Bayesian methods for claims reserving is given by an a priori distribution of a random quantity such as the ultimate claim or a risk parameter. Bayesian inference is then understood to be the process of combining the a priori distribution of the random quantity with the data given in the upper trapezoid via Bayes' theorem. In this manner it is sometimes possible to obtain an analytic expression for the a posteriori distribution of the ultimate claim that reflects the change in the uncertainty due to the observations. The a posteriori expectation of the ultimate claim is then called the "Bayesian estimator" for the ultimate claim; it minimizes the quadratic loss in the class of all estimators which are square integrable functions of the observations (see Section 4.2). In cases where we are not able to explicitly calculate the a posteriori expectation of the ultimate, we restrict the search for the best estimator to the smaller class of estimators which are linear functions of the observations (see Sections 4.3, 4.4 and 4.5).
4.1.1 Benktander-Hovinen method

This method goes back to Benktander [8] and Hovinen [37], who independently developed a method which leads to the same total estimated loss amount.
Choose $i > I-J$. Assume we have an a priori estimate $\mu_i$ for $\mathrm E[C_{i,J}]$ and that the claims development pattern $(\beta_j)_{0\le j\le J}$ with $\mathrm E[C_{i,j}] = \beta_j\, \mu_i$ is known. Since the Bornhuetter-Ferguson method completely ignores the observations $C_{i,I-i}$ on the last observed diagonal, and the chain-ladder method completely ignores the a priori estimate $\mu_i$ at hand, one could consider a credibility mixture of these two methods (see (2.23)-(2.24)): for $c \in [0,1]$ we define the credibility mixture
$$\widehat S_i(c) = c\; \widehat C^{\,CL}_{i,J} + (1-c)\, \mu_i \qquad (4.1)$$
for $I-J+1 \le i \le I$, where $\widehat C^{\,CL}_{i,J}$ is the chain-ladder estimate for the ultimate claim. The parameter $c$ should increase with growing $C_{i,I-i}$, since we have more information in $C_{i,j}$ with increasing time. Benktander [8] proposed to choose $c = \beta_{I-i}$. This leads to the following estimator:
Estimator 4.1 (Benktander-Hovinen estimator) The BH estimator is given by
$$\widehat C^{\,BH}_{i,J} = C_{i,I-i} + (1-\beta_{I-i}) \Big[\beta_{I-i}\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})\, \mu_i\Big] \qquad (4.2)$$
for $I-J+1 \le i \le I$.
Observe that we could again identify the claims development pattern $(\beta_j)_{0\le j\le J}$ with the chain-ladder factors $(f_j)_{0\le j<J}$. This can be done if we use Model Assumptions 2.9 for the Bornhuetter-Ferguson motivation, see also (2.22). Henceforth we identify in the sequel of this section
$$\beta_j = \frac{1}{\prod_{k=j}^{J-1} f_k}. \qquad (4.3)$$
Since the development pattern $\beta_j$ is known, we also have (using (4.3)) known chain-ladder factors, which implies that we set
$$f_j = \widehat f_j. \qquad (4.4)$$
With the Bornhuetter-Ferguson estimator $\widehat C^{\,BF}_{i,J} = C_{i,I-i} + (1-\beta_{I-i})\, \mu_i$ (4.5), the BH estimator can be rewritten as
$$\widehat C^{\,BH}_{i,J} = C_{i,I-i} + (1-\beta_{I-i})\, \widehat C^{\,BF}_{i,J}. \qquad (4.6)$$
Remarks 4.2
• Equation (4.6) shows that the Benktander-Hovinen estimator can be seen as an iterated Bornhuetter-Ferguson estimator using the BF estimate as new a priori estimate.
• The following lemma shows that the weight $\beta_{I-i}$ is not a fixed point of this iteration, since we have to evaluate the credibility mixture at $1-(1-\beta_{I-i})^2$.
Lemma 4.3 We have
$$\widehat C^{\,BH}_{i,J} = \widehat S_i\big(1-(1-\beta_{I-i})^2\big) \qquad (4.7)$$
for $I-J+1 \le i \le I$.
Proof. It holds that
$$\widehat C^{\,BH}_{i,J} = C_{i,I-i} + (1-\beta_{I-i}) \Big[\beta_{I-i}\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})\, \mu_i\Big] = \beta_{I-i}\, \widehat C^{\,CL}_{i,J} + \big(\beta_{I-i} - \beta_{I-i}^2\big)\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})^2\, \mu_i \qquad (4.8)$$
$$= \Big(1-(1-\beta_{I-i})^2\Big)\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})^2\, \mu_i = \widehat S_i\big(1-(1-\beta_{I-i})^2\big),$$
where we used $C_{i,I-i} = \beta_{I-i}\, \widehat C^{\,CL}_{i,J}$. This finishes the proof of the lemma. □
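Lemma 4.3 is easy to verify numerically. The sketch below uses the diagonal value, a priori estimate and $\beta_{I-i}$ of accident year $i = 9$ from Example 4.4; since $\beta$ is taken rounded to 59%, the result differs slightly from the table values below:

```python
def s_mix(c_cl, mu, c):
    """Credibility mixture (4.1): S_i(c) = c * C_CL + (1 - c) * mu."""
    return c * c_cl + (1.0 - c) * mu

def bh_estimate(c_obs, mu, beta):
    """Benktander-Hovinen estimator (4.2), with C_CL = C_obs / beta."""
    c_cl = c_obs / beta
    return c_obs + (1.0 - beta) * s_mix(c_cl, mu, beta)

c_obs, mu, beta = 5_675_568.0, 11_618_437.0, 0.59   # accident year i = 9, Example 4.4
bh = bh_estimate(c_obs, mu, beta)
# Lemma 4.3: BH equals the mixture evaluated at 1 - (1 - beta)^2
assert abs(bh - s_mix(c_obs / beta, mu, 1.0 - (1.0 - beta) ** 2)) < 1e-3
print(f"BH ultimate for i = 9: {bh:,.0f}")
```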
Example 4.4 (Benktander-Hovinen method)
We revisit the data set given in Examples 2.7 and 2.11. We see that the Benktander-Hovinen reserves lie between the chain-ladder reserves and the Bornhuetter-Ferguson reserves. They are closer to the chain-ladder reserves because $\beta_{I-i}$ is larger than 50% for all accident years $i \in \{0, \ldots, I\}$.

i      C_{i,I-i}   mu_i       beta_{I-i}  CL ultimate  BH ultimate  CL res.   BH res.   BF res.
0      11148124    11653101   100.0%      11148124     11148124     0         0         0
1      10648192    11367306   99.9%       10663318     10663319     15126     15127     16124
2      10635751    10962965   99.8%       10662008     10662010     26257     26259     26998
3      9724068     10616762   99.6%       9758606      9758617      34538     34549     37575
4      9786916     11044881   99.1%       9872218      9872305      85302     85389     95434
5      9935753     11480700   98.4%       10092247     10092581     156494    156828    178024
6      9282022     11413572   97.0%       9568143      9569793      286121    287771    341305
7      8256211     11126527   94.8%       8705378      8711824      449167    455612    574089
8      7648729     10986548   88.0%       8691971      8725026      1043242   1076297   1318646
9      5675568     11618437   59.0%       9626383      9961926      3950815   4286358   4768384
Total                                                               6047061   6424190   7356580
The next theorem is due to Mack [51]. It says that if we further iterate the BF method, we arrive at the chain-ladder reserve:
Theorem 4.5 (Mack [51]) Choose $\widehat C^{(0)} = \mu_i$ and define for $m \ge 0$
$$\widehat C^{(m+1)} = C_{i,I-i} + (1-\beta_{I-i})\, \widehat C^{(m)}. \qquad (4.9)$$
If $\beta_{I-i} > 0$ then
$$\lim_{m\to\infty} \widehat C^{(m)} = \widehat C^{\,CL}_{i,J}. \qquad (4.10)$$
Proof. We claim that
$$\widehat C^{(m)} = \Big(1-(1-\beta_{I-i})^m\Big)\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})^m\, \mu_i. \qquad (4.11)$$
The claim is true for $m = 1$ (BF estimator) and for $m = 2$ (BH estimator, see Lemma 4.3). Hence we prove the claim by induction. Induction step $m \to m+1$:
$$\widehat C^{(m+1)} = C_{i,I-i} + (1-\beta_{I-i})\, \widehat C^{(m)} = C_{i,I-i} + (1-\beta_{I-i}) \Big[\big(1-(1-\beta_{I-i})^m\big)\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})^m\, \mu_i\Big]$$
$$= \beta_{I-i}\, \widehat C^{\,CL}_{i,J} + \Big((1-\beta_{I-i}) - (1-\beta_{I-i})^{m+1}\Big)\, \widehat C^{\,CL}_{i,J} + (1-\beta_{I-i})^{m+1}\, \mu_i, \qquad (4.12)$$
which proves (4.11). But from (4.11) the claim of the theorem immediately follows. □
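The geometric convergence in Theorem 4.5 can be illustrated directly, again with the rounded inputs of accident year $i = 9$ from Example 4.4:

```python
def iterate_bf(c_obs, mu, beta, m):
    """m-fold iterated Bornhuetter-Ferguson estimate (4.9), started at mu."""
    est = mu
    for _ in range(m):
        est = c_obs + (1.0 - beta) * est
    return est

c_obs, mu, beta = 5_675_568.0, 11_618_437.0, 0.59
for m in (1, 2, 5, 20):
    print(f"m = {m:2d}: {iterate_bf(c_obs, mu, beta, m):,.0f}")
print(f"CL    : {c_obs / beta:,.0f}")
```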
Example 4.4, revisited
In view of Theorem 4.5 we have:

i   C^(1) = C^BF   C^(2) = C^BH   C^(3)      C^(4)      C^(5)      ...   C^(inf) = C^CL
0   11148124       11148124       11148124   11148124   11148124   ...   11148124
1   10664316       10663319       10663318   10663318   10663318   ...   10663318
2   10662749       10662010       10662008   10662008   10662008   ...   10662008
3   9761643        9758617        9758606    9758606    9758606    ...   9758606
4   9882350        9872305        9872218    9872218    9872218    ...   9872218
5   10113777       10092581       10092252   10092247   10092247   ...   10092247
6   9623328        9569793        9568192    9568144    9568143    ...   9568143
7   8830301        8711824        8705711    8705395    8705379    ...   8705378
8   8967375        8725026        8695938    8692447    8692028    ...   8691971
9   10443953       9961926        9764095    9682902    9649579    ...   9626383
4.1.2
We choose $i > I-J$ and define the reserves for accident year $i$ (see also (1.43))
$$R_i = C_{i,J} - C_{i,I-i}. \qquad (4.13)$$
Hence, under the assumption that the development pattern and the chain-ladder factors are known (and identified by (4.3) under Model Assumptions 2.9), the chain-ladder reserve and the Bornhuetter-Ferguson reserve are given by
$$\widehat R_i^{\,CL} = \widehat C^{\,CL}_{i,J} - C_{i,I-i} = C_{i,I-i} \left(\prod_{j=I-i}^{J-1} f_j - 1\right), \qquad (4.14)$$
$$\widehat R_i^{\,BF} = \widehat C^{\,BF}_{i,J} - C_{i,I-i} = (1-\beta_{I-i})\, \mu_i. \qquad (4.15)$$
For $c \in [0,1]$ we consider the credibility mixture of the two reserves
$$\widehat R_i(c) = c\, \widehat R_i^{\,CL} + (1-c)\, \widehat R_i^{\,BF} \qquad (4.16)$$
$$= (1-\beta_{I-i}) \Big(c\, \widehat C^{\,CL}_{i,J} + (1-c)\, \mu_i\Big) \qquad (4.17)$$
$$= \widehat C^{\,BF}_{i,J} - C_{i,I-i} + c\, \Big(\widehat C^{\,CL}_{i,J} - \widehat C^{\,BF}_{i,J}\Big). \qquad (4.18)$$
Model Assumptions 4.6
Different accident years $i$ are independent. There exists a sequence $(\beta_j)_{0\le j\le J}$ with $\beta_J = 1$ such that for all $j \in \{0, \ldots, J\}$ we have
$$\mathrm E[C_{i,j}] = \beta_j\, \mathrm E[C_{i,J}]. \qquad (4.19)$$
Remarks 4.7
• Model Assumptions 4.6 coincide with Model Assumptions 2.9 if we assume that $U_i = \mu_i$ is deterministic.
• Observe that we do not assume that the chain-ladder model is satisfied! The chain-ladder model satisfies Model Assumptions 4.6, but not necessarily vice versa.
• Assume that $f_j$ is identified with $\beta_j$ (via (4.3)) and that
$$\widehat C^{\,CL}_{i,J} = \frac{C_{i,I-i}}{\beta_{I-i}} \qquad \text{and} \qquad \widehat C^{\,BF}_{i,J} = C_{i,I-i} + (1-\beta_{I-i})\, U_i. \qquad (4.20)\text{-}(4.21)$$
• Observe that even if we assumed that the chain-ladder model is satisfied, we could not directly compare this situation to the mean square error of prediction calculation in Chapter 3. For the derivation of an MSEP formula for the chain-ladder method we have always assumed that the chain-ladder factors $f_j$ are not known. If they were known, the mean square error of prediction of the chain-ladder reserves would simply be given by (see (3.30))
$$\mathrm{msep}_{C_{i,J}}\big(\widehat C^{\,CL}_{i,J}\big) = \mathrm E\Big[\big(C_{i,J} - \widehat C^{\,CL}_{i,J}\big)^2\Big] = \mathrm E\Big[\mathrm{msep}_{C_{i,J}|\mathcal D_I}\big(\widehat C^{\,CL}_{i,J}\big)\Big] = \mathrm E\big[\mathrm{Var}(C_{i,J} \mid \mathcal D_I)\big]. \qquad (4.23)$$
If we calculate (4.17) under Model Assumptions 4.6 and with (4.20) we obtain $\mathrm E\big[\widehat R_i(c)\big] = \mathrm E[R_i]$ and
$$\mathrm{msep}_{R_i}\big(\widehat R_i(c)\big) = \mathrm{Var}(R_i) + \mathrm E\Big[\big(\mathrm E[R_i] - \widehat R_i(c)\big)^2\Big] + 2\, \mathrm E\Big[\big(R_i - \mathrm E[R_i]\big)\big(\mathrm E[R_i] - \widehat R_i(c)\big)\Big] = \mathrm{Var}(R_i) + \mathrm{Var}\big(\widehat R_i(c)\big) - 2\, \mathrm{Cov}\big(R_i, \widehat R_i(c)\big). \qquad (4.24)$$
The credibility weight minimizing (4.24) is given by (Theorem 4.8)
$$c_i^* = \frac{\beta_{I-i}}{1-\beta_{I-i}}\; \frac{\mathrm{Cov}(C_{i,I-i}, R_i) + \beta_{I-i}(1-\beta_{I-i})\, \mathrm{Var}(U_i)}{\mathrm{Var}(C_{i,I-i}) + \beta_{I-i}^2\, \mathrm{Var}(U_i)}. \qquad (4.25)$$
Indeed, a straightforward calculation gives
$$c_i^* = \frac{\beta_{I-i}}{1-\beta_{I-i}}\; \frac{\mathrm{Cov}\big(C_{i,I-i} - \beta_{I-i}\, U_i,\; R_i - (1-\beta_{I-i})\, U_i\big)}{\mathrm E\big[\big(C_{i,I-i} - \beta_{I-i}\, U_i\big)^2\big]},$$
and since $\mathrm E[\beta_{I-i}\, U_i] = \mathrm E[C_{i,I-i}]$ and $\mathrm E[U_i] = \mathrm E[C_{i,J}]$ we obtain
$$c_i^* = \frac{\beta_{I-i}}{1-\beta_{I-i}}\; \frac{\mathrm{Cov}(C_{i,I-i}, R_i) + \beta_{I-i}(1-\beta_{I-i})\, \mathrm{Var}(U_i)}{\mathrm{Var}(C_{i,I-i}) + \beta_{I-i}^2\, \mathrm{Var}(U_i)}. \qquad (4.28)$$
We would like to mention once more that we have not considered the estimation errors in the claims development pattern \beta_j and f_j, respectively. In this sense Theorem 4.8 gives optimal credibility weights considering only the process variance and the uncertainty in the a priori estimate U_i.

Remark. To explicitly calculate c_i in Theorem 4.8 we need to specify an explicit stochastic model. We will do this below in Section 4.1.4, and close this subsection for the moment.
4.1.3 Cape-Cod model

One main deficiency of the chain-ladder model is that the chain-ladder reserve completely depends on the last observation on the diagonal (see Chain-ladder Estimator 2.4). If this last observation is an outlier, the outlier is projected to the ultimate claim (using the age-to-age factors). Moreover, in long-tailed lines of business the first observations are often not representative. One possibility to smooth outliers on the last observed diagonal is to combine the BF and CL methods, as e.g. in the Benktander-Hovinen method; another possibility is to robustify such observations. This is done in the Cape-Cod method, which goes back to Bühlmann [15].
Model Assumptions 4.9 (Cape-Cod method)
There exist parameters \Pi_0, \ldots, \Pi_I > 0, \kappa > 0 and a claims development pattern (\beta_j)_{0 \le j \le J} with \beta_J = 1 such that

    E[C_{i,j}] = \kappa \, \Pi_i \, \beta_j.    (4.29)

Here \Pi_i denotes the premium of accident year i and \kappa the (constant) loss ratio. An individual loss ratio estimate per accident year is given by

    \hat\kappa_i = \frac{C_{i,I-i}}{\beta_{I-i} \, \Pi_i},    (4.30)

where \beta_{I-i} is identified via (4.3), i.e. \beta_{I-i} = \prod_{j=I-i}^{J-1} f_j^{-1}.
The overall Cape-Cod loss ratio is then estimated by the premium-weighted average

    \hat\kappa^{CC} = \frac{\sum_{i=0}^{I} C_{i,I-i}}{\sum_{k=0}^{I} \beta_{I-k} \, \Pi_k}    (4.31)
                   = \sum_{i=0}^{I} \frac{\beta_{I-i}\,\Pi_i}{\sum_{k=0}^{I} \beta_{I-k}\,\Pi_k} \; \hat\kappa_i.    (4.32)

Observe that \hat\kappa^{CC} is an unbiased estimate for \kappa.
A robustified value for C_{i,I-i} is then found by (i > I-J)

    \hat C_{i,I-i}^{CC} = \hat\kappa^{CC} \, \Pi_i \, \beta_{I-i}.    (4.33)
This leads to the Cape-Cod estimator for the ultimate claim,

    \hat C_{i,J}^{CC} = C_{i,I-i} + \Big( \prod_{j=I-i}^{J-1} f_j - 1 \Big) \, \hat C_{i,I-i}^{CC}    (4.34)

for I-J+1 \le i \le I.
Lemma 4.11 Under Model Assumptions 4.9 and (4.3) the estimator \hat C_{i,J}^{CC} - C_{i,I-i} is unbiased for

    E[C_{i,J} - C_{i,I-i}] = \kappa \, \Pi_i \, (1 - \beta_{I-i}).    (4.35)

Proof. Using (4.33) we have

    \hat C_{i,J}^{CC} - C_{i,I-i} = \hat C_{i,I-i}^{CC} \Big( \prod_{j=I-i}^{J-1} f_j - 1 \Big) = \hat\kappa^{CC} \, \Pi_i \, (1 - \beta_{I-i}),    (4.36)

and the claim follows from the unbiasedness of \hat\kappa^{CC}. Hence the Cape-Cod reserve for accident year i is given by

    \hat R_i^{CC} = (1 - \beta_{I-i}) \, \hat\kappa^{CC} \, \Pi_i.    (4.37)
Under Model Assumptions 4.9 the variance of the individual loss ratio estimate is given by

    Var(\hat\kappa_i) = \frac{1}{\Pi_i^2 \, \beta_{I-i}^2} \, Var(C_{i,I-i}).    (4.38)

According to the choice of the variance function of C_{i,j} this may also suggest that the robustification can be done in another way (with smaller variance), see also Lemma 3.4.
Example 4.13 (Cape-Cod method)
We revisit the data set given in Examples 2.7, 2.11 and 4.4. The overall loss ratio estimate is \hat\kappa^{CC} = 67.3%.

                                                                        estimated reserves
  i      \Pi_i     \hat\kappa_i  \hat C^{CC}_{i,I-i}  \hat C^{CC}_{i,J}   Cape-Cod        CL        BF
  0    15473558      72.0%         10411192             11148124               0           0         0
  1    14882436      71.7%          9999259             10662396           14204       15126     16124
  2    14456039      73.8%          9702614             10659704           23953       26257     26998
  3    14054917      69.4%          9423208              9757538           33469       34538     37575
  4    14525373      68.0%          9688771              9871362           84446       85302     95434
  5    15025923      67.2%          9953237             10092522          156769      156494    178024
  6    14832965      64.5%          9681735              9580464          298442      286121    341305
  7    14550359      59.8%          9284898              8761342          505131      449167    574089
  8    14461781      60.1%          8562549              8816611         1167882     1043242   1318646
  9    15210363      63.3%          6033871              9875801         4200233     3950815   4768384
 Total                                                                   6484530     6047061   7356580
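The Cape-Cod calculation above can be reproduced with a few lines of code. This is a minimal sketch (not from the original text); the premiums and diagonal values are the rounded figures from the examples, and the development pattern \beta_{I-i} is the rounded CL pattern, so the results match the table only approximately.

```python
# Sketch of the Cape-Cod estimator (4.31) and the Cape-Cod reserves
# (4.33)-(4.34), recomputed from rounded table values.

premiums = [15473558, 14882436, 14456039, 14054917, 14525373,
            15025923, 14832965, 14550359, 14461781, 15210363]
diagonal = [11148124, 10648192, 10635751, 9724068, 9786916,
            9935753, 9282022, 8256211, 7648729, 5675568]
beta     = [1.000, 0.999, 0.998, 0.996, 0.991,
            0.984, 0.970, 0.948, 0.880, 0.590]   # beta_{I-i}, rounded

# overall loss ratio kappa^CC = sum_i C_{i,I-i} / sum_i beta_{I-i} * Pi_i
kappa_cc = sum(diagonal) / sum(b * p for b, p in zip(beta, premiums))
print(round(kappa_cc, 3))          # about 0.673, i.e. 67.3%

# Cape-Cod reserve (1 - beta_{I-i}) * kappa^CC * Pi_i per accident year
reserves = [(1 - b) * kappa_cc * p for b, p in zip(beta, premiums)]
print(round(sum(reserves)))        # close to the table total 6484530
```

The small deviations from the tabulated reserves are due to the rounding of the \beta values.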
[Figure 4.1: individual loss ratios \hat\kappa_i and the overall Cape-Cod loss ratio \hat\kappa^{CC} per accident year; y-axis: loss ratio (50%-70%), x-axis: accident years.]
The a priori loss ratios \mu_i/\Pi_i are all above 75%, whereas the Cape-Cod method gives loss ratios \hat\kappa_i which are all below 75% (see Figure 4.1).
However, as Figure 4.1 shows, we have to be careful with the assumption of a constant loss ratio \kappa: the figure suggests that we have to consider underwriting cycles carefully. In hard markets loss ratios are rather low (we are able to charge rather high premiums); if there is keen competition (soft market) we expect low profit margins. If possible, we should adjust our premium with underwriting cycle information. For this reason one finds in practice modified versions of the Cape-Cod method, e.g. the smoothing of the last observed diagonal is only done over neighboring values.
4.1.4

We now specify an explicit stochastic model under which the optimal credibility weight of Theorem 4.8 can be calculated.

Model Assumptions 4.14
Different accident years i are independent, and there exists a claims development pattern (\beta_j)_{0 \le j \le J} with \beta_J = 1 such that for all j \in \{0, \ldots, J\}

    E[C_{i,j} \,|\, C_{i,J}] = \beta_j \, C_{i,J},    (4.39)
    Var(C_{i,j} \,|\, C_{i,J}) = \beta_j \, (1-\beta_j) \, \alpha^2(C_{i,J}),    (4.40)

for a suitable variance function \alpha^2(\cdot).
In view of Theorem 4.8 we have the following corollary (use definitions (4.21) and (4.20)):

Corollary 4.16 Under the assumption that U_i is an unbiased estimator for E[C_{i,J}] which is independent of C_{i,I-i} and C_{i,J}, and under Model Assumptions 4.14, the optimal credibility factor c_i which minimizes the (unconditional) mean square error of prediction (4.22) is given by

    c_i = \frac{\beta_{I-i}}{\beta_{I-i} + t_i},    (4.41)

with

    t_i = \frac{E[\alpha^2(C_{i,J})]}{Var(U_i) + Var(C_{i,J}) - E[\alpha^2(C_{i,J})]},    (4.42)

for i \in \{I-J+1, \ldots, I\}.

Proof. From Theorem 4.8 we have

    c_i = \frac{\beta_{I-i}}{1-\beta_{I-i}} \cdot \frac{Cov(C_{i,I-i}, C_{i,J} - C_{i,I-i}) + \beta_{I-i}\,(1-\beta_{I-i})\,Var(U_i)}{Var(C_{i,I-i}) + \beta_{I-i}^2\,Var(U_i)}.    (4.43)

Under Model Assumptions 4.14 we obtain

    Cov(C_{i,I-i}, C_{i,J}) = \beta_{I-i} \, Var(C_{i,J})    (4.44)

and

    Var(C_{i,I-i}) = \beta_{I-i}\,(1-\beta_{I-i})\,E[\alpha^2(C_{i,J})] + \beta_{I-i}^2\,Var(C_{i,J}).    (4.45)
Hence we obtain

    c_i = \frac{\beta_{I-i}}{1-\beta_{I-i}} \cdot \frac{\beta_{I-i}\,Var(C_{i,J}) - Var(C_{i,I-i}) + \beta_{I-i}\,(1-\beta_{I-i})\,Var(U_i)}{Var(C_{i,I-i}) + \beta_{I-i}^2\,Var(U_i)}    (4.46)
        = \frac{Var(C_{i,J}) - E[\alpha^2(C_{i,J})] + Var(U_i)}{\big( \beta_{I-i}^{-1} - 1 \big)\,E[\alpha^2(C_{i,J})] + Var(C_{i,J}) + Var(U_i)}    (4.47)
        = \frac{Var(C_{i,J}) - E[\alpha^2(C_{i,J})] + Var(U_i)}{\beta_{I-i}^{-1}\,E[\alpha^2(C_{i,J})] + Var(C_{i,J}) - E[\alpha^2(C_{i,J})] + Var(U_i)}
        = \frac{\beta_{I-i}}{\beta_{I-i} + t_i}.

This finishes the proof of the corollary.
Corollary 4.17 Under the assumptions of Corollary 4.16 we have, for i \in \{I-J+1, \ldots, I\},

    msep_{R_i}\big( \hat R_i(c) \big) = E[\alpha^2(C_{i,J})] \, \Big( \frac{c^2}{\beta_{I-i}} + \frac{(1-c)^2}{t_i} + \frac{1}{1-\beta_{I-i}} \Big) \, (1-\beta_{I-i})^2,

    msep_{R_i}\big( \hat R_i(0) \big) = E[\alpha^2(C_{i,J})] \, \Big( \frac{1}{t_i} + \frac{1}{1-\beta_{I-i}} \Big) \, (1-\beta_{I-i})^2,

    msep_{R_i}\big( \hat R_i(1) \big) = E[\alpha^2(C_{i,J})] \, \Big( \frac{1}{\beta_{I-i}} + \frac{1}{1-\beta_{I-i}} \Big) \, (1-\beta_{I-i})^2,

    msep_{R_i}\big( \hat R_i(c_i) \big) = E[\alpha^2(C_{i,J})] \, \Big( \frac{1}{\beta_{I-i}+t_i} + \frac{1}{1-\beta_{I-i}} \Big) \, (1-\beta_{I-i})^2.    (4.48)

Proof. Exercise.

Remarks 4.18
- The reserve \hat R_i(0) corresponds to the Bornhuetter-Ferguson reserve \hat R_i^{BF} and the reserve \hat R_i(1) corresponds to the chain-ladder reserve \hat R_i^{CL}. However, msep_{R_i}(\hat R_i(1)) and msep_{R_i}(\hat R_i^{CL}) from Section 3 are not comparable since a) we use a completely different model, which leads to different process error and prediction error terms, and b) in Corollary 4.17 we do not investigate the estimation error coming from the fact that we have to estimate f_j and \beta_j.
I.e. for accident years with small claims experience \beta_{I-i} one should take the BF estimate, whereas for older years one should take the CL estimate. Similar statements can be derived for the BH estimate.
Example 4.19 (Mack model, Model Assumptions 4.14)
An easy distributional example satisfying Model Assumptions 4.14 is the following. Assume that, conditionally given C_{i,J}, the ratio C_{i,j}/C_{i,J} has a Beta(\tau_i\,\beta_j, \tau_i\,(1-\beta_j)) distribution. Hence

    E[C_{i,j} \,|\, C_{i,J}] = C_{i,J} \, E\Big[ \frac{C_{i,j}}{C_{i,J}} \Big| C_{i,J} \Big] = \beta_j \, C_{i,J},    (4.50)

    Var(C_{i,j} \,|\, C_{i,J}) = C_{i,J}^2 \, Var\Big( \frac{C_{i,j}}{C_{i,J}} \Big| C_{i,J} \Big) = \frac{\beta_j \, (1-\beta_j)}{1+\tau_i} \, C_{i,J}^2    (4.51)

for all i = 0, \ldots, I and j = 0, \ldots, J, i.e. \alpha^2(c) = c^2/(1+\tau_i).
See appendix, Section B.2.4, for the definition of the Beta distribution and its moments.
We revisit the data set given in Examples 2.7, 2.11 and 4.4. Observe that

    E[\alpha^2(C_{i,J})] = \frac{E[C_{i,J}^2]}{1+\tau_i} = \frac{E[C_{i,J}]^2}{1+\tau_i} \, \big( Vco^2(C_{i,J}) + 1 \big).    (4.52)

For the coefficient of variation of the ultimate claim we set

    Vco(C_{i,J}) = \big( Vco^2(U_i) + r^2 \big)^{1/2},    (4.53)

where r = 6% corresponds to the pure process error. This leads to the results in Table 4.4.
We already see from the choices of our parameters \tau_i, r and Vco(U_i) that it is rather difficult to apply this method in practice, since we have not estimated these parameters from the data available.
  i    \tau_i   Vco(U_i)    r     Vco(C_{i,J})    t_i      c_i
  0     600      5.0%      6.0%      7.8%        24.2%    80.5%
  1     600      5.0%      6.0%      7.8%        24.2%    80.5%
  2     600      5.0%      6.0%      7.8%        24.2%    80.5%
  3     600      5.0%      6.0%      7.8%        24.2%    80.5%
  4     600      5.0%      6.0%      7.8%        24.2%    80.4%
  5     600      5.0%      6.0%      7.8%        24.2%    80.3%
  6     600      5.0%      6.0%      7.8%        24.2%    80.1%
  7     600      5.0%      6.0%      7.8%        24.2%    79.7%
  8     600      5.0%      6.0%      7.8%        24.2%    78.5%
  9     600      5.0%      6.0%      7.8%        24.2%    70.9%

Table 4.4: Parameter choices and optimal credibility weights c_i

         estimated reserves
  i        CL        BF      \hat R_i(c_i)
  0         0         0            0
  1     15126     16124        15320
  2     26257     26998        26401
  3     34538     37575        35131
  4     85302     95434        87288
  5    156494    178024       160738
  6    286121    341305       297128
  7    449167    574089       474538
  8   1043242   1318646      1102588
  9   3950815   4768384      4188531
Total 6047061   7356580      6387663

  i     E[U_i]    msep^{1/2}(\hat R_i(1))  msep^{1/2}(\hat R_i(0))  msep^{1/2}(\hat R_i(c_i))
  0   11653101             0                        0                        0
  1   11367306         17529                    17568                    17527
  2   10962965         22287                    22373                    22282
  3   10616762         25888                    26031                    25879
  4   11044881         42189                    42751                    42153
  5   11480700         58952                    60340                    58862
  6   11413572         81990                    85604                    81745
  7   11126527        106183                   113911                   105626
  8   10986548        166013                   190514                   163852
  9   11618437        396616                   500223                   372199
total                 457811                   560159                   435814

Table 4.5: Estimated reserves and prediction errors in Model 4.14
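The quantities t_i, c_i and the prediction error (4.48) can be verified numerically. The following sketch (not from the original text; all variable names are our own) reproduces the i = 9 row using the a priori parameters \tau = 600, Vco(U_i) = 5%, r = 6% and the rounded value \beta_{I-i} = 0.59.

```python
import math

# Sketch reproducing Table 4.4/4.5 quantities for accident year i = 9
# under the Beta example (4.50)-(4.52).

tau, vco_u, r, beta = 600.0, 0.05, 0.06, 0.59
mu = 11618437.0                     # a priori mean E[U_9] = E[C_{9,J}]

vco_c2 = vco_u**2 + r**2                        # Vco^2(C_{i,J}), cf. (4.53)
e_alpha2 = mu**2 * (vco_c2 + 1) / (1 + tau)     # E[alpha^2(C_{i,J})], cf. (4.52)
var_u, var_c = (vco_u * mu)**2, vco_c2 * mu**2

t = e_alpha2 / (var_u + var_c - e_alpha2)       # t_i, cf. (4.42)
c_star = beta / (beta + t)                      # optimal weight c_i, cf. (4.41)
msep = e_alpha2 * (1/(beta + t) + 1/(1 - beta)) * (1 - beta)**2  # cf. (4.48)

print(round(t, 3), round(c_star, 3), round(math.sqrt(msep)))
```

The output is close to the tabulated values 24.2%, 70.9% and 372199; small deviations come from the rounding of \beta_{I-i}.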
4.2
4.2.1
Bayesian methods for claims reserving are methods in which one combines a priori information or expert knowledge with the observations in the upper trapezoid D_I. The available information/knowledge is incorporated through an a priori distribution of a random quantity such as the ultimate claim (see Sections 4.2.2 and 4.2.3) or a risk parameter (see Section 4.2.4), which must be modeled by the actuary. This distribution is then connected with the likelihood function via Bayes' theorem. If we use a smart choice for the distribution of the observations and the a priori distribution, such as the exponential dispersion family (EDF) and its associated conjugates (see Section 4.2.4), we are able to derive an analytic expression for the a posteriori distribution of the ultimate claim. This means that we can compute the a posteriori expectation E[C_{i,J} | D_I] of the ultimate claim C_{i,J}, given the observations D_I, which is called the Bayesian estimator for the ultimate claim. The Bayesian method is called exact since the Bayesian estimator E[C_{i,J} | D_I] is optimal in the sense that it minimizes the squared loss function (MSEP) in the class L^2_{C_{i,J}}(D_I) of all estimators for C_{i,J} which are square integrable functions of the observations in D_I, i.e.

    E[C_{i,J} | D_I] = \underset{Y \in L^2_{C_{i,J}}(D_I)}{argmin} \; E\big[ (C_{i,J} - Y)^2 \,\big|\, D_I \big].    (4.54)

For a D_I-measurable estimator \hat C_{i,J} we set

    msep_{C_{i,J}|D_I}\big( \hat C_{i,J} \big) = E\big[ (C_{i,J} - \hat C_{i,J})^2 \,\big|\, D_I \big],    (4.55)

and if \hat C_{i,J} = \widehat{E[C_{i,J}|D_I]} is an estimate of the Bayesian estimator this decouples as

    msep_{C_{i,J}|D_I}\big( \widehat{E[C_{i,J}|D_I]} \big) = Var(C_{i,J} | D_I) + \big( \widehat{E[C_{i,J}|D_I]} - E[C_{i,J}|D_I] \big)^2,    (4.56)

and now we are in a similar situation as in the chain-ladder model, see (3.30).
We close this section with some remarks. For pricing and tariffication of insurance contracts, Bayesian ideas and techniques are well investigated and widely used in practice. For the claims reserving problem Bayesian methods are less used, although we believe that they are very useful for answering practical questions (this has e.g. already been mentioned in de Alba [2]).
In the literature exact Bayesian models have been studied e.g. in a series of papers by Verrall [79, 81, 82], de Alba [2, 4], de Alba-Corzo [3], Gogol [30], Haastrup-Arjas [32], Ntzoufras-Dellaportas [60] and the corresponding implementation by Scollnik [70]. Many of these results refer to explicit choices of distributions, e.g. the Poisson-gamma or the (log-)normal-normal cases are considered. Below we give an approach which suits rather general distributions (see Section 4.2.4).
4.2.2 Log-normal/Log-normal model
For given variance functions \alpha_j^2(\cdot) we define

    \sigma_j^2 = \sigma_j^2(C_{i,J}) = \log\Big( 1 + \frac{\alpha_j^2(C_{i,J})}{(\beta_j\,C_{i,J})^2} \Big),    (4.59)

    \mu_j = \mu_j(C_{i,J}) = \log(\beta_j\,C_{i,J}) - \frac{1}{2}\,\log\Big( 1 + \frac{\alpha_j^2(C_{i,J})}{(\beta_j\,C_{i,J})^2} \Big).    (4.60)

These are the parameters of a log-normal distribution with mean \beta_j\,C_{i,J} and variance \alpha_j^2(C_{i,J}). Under Model Assumptions 4.20 (C_{i,J} log-normally distributed with parameters \mu^{(i)} and \tau_i^2, and, conditionally given C_{i,J}, C_{i,j} log-normally distributed with parameters \mu_j(C_{i,J}) and \sigma_j^2(C_{i,J})) the joint density of (C_{i,j}, C_{i,J}) is given by

    f_{C_{i,j},C_{i,J}}(x,y) = \frac{1}{2\pi\,\tau_i\,\sigma_j(y)\,x\,y} \, \exp\Big\{ -\frac{1}{2}\Big( \frac{\log(x) - \mu_j(y)}{\sigma_j(y)} \Big)^2 - \frac{1}{2}\Big( \frac{\log(y) - \mu^{(i)}}{\tau_i} \Big)^2 \Big\}.    (4.61)
Lemma 4.21 The Model Assumptions 4.20 combined with Model Assumptions 4.14 with \alpha^2(c) = a^2\,c^2 for some a \in R satisfy the following equalities:

    \sigma_j^2(c) = \sigma_j^2 = \log\Big( 1 + \frac{1-\beta_j}{\beta_j}\,a^2 \Big),    (4.62)

    \mu_j(c) = \log(c) + \log(\beta_j) - \sigma_j^2/2.    (4.63)

Moreover, conditionally given C_{i,j}, the ultimate claim C_{i,J} is log-normally distributed with updated parameters

    \mu_{post(i,j)} = \frac{\tau_i^2}{\tau_i^2 + \sigma_j^2}\,\Big( \sigma_j^2/2 + \log(C_{i,j}/\beta_j) \Big) + \frac{\sigma_j^2}{\tau_i^2 + \sigma_j^2}\,\mu^{(i)},    (4.64)

    \tau_{post(i,j)}^2 = \frac{\tau_i^2\,\sigma_j^2}{\tau_i^2 + \sigma_j^2}.    (4.65)
Remarks 4.22
- This example shows a typical Bayesian and credibility result: i) In this example of conjugated distributions we can exactly calculate the a posteriori distribution of the ultimate claim C_{i,J} given the information C_{i,j} (cf. Section 4.2.4 and see also Bühlmann-Gisler [18]). ii) We see that we need to update the parameter \mu^{(i)} by choosing a credibility weighted average of the a priori parameter \mu^{(i)} and the transformed observation \sigma_j^2/2 + \log(C_{i,j}/\beta_j), where the credibility weight is given by

    \alpha_{i,j} = \tau_i^2 / (\tau_i^2 + \sigma_j^2).    (4.66)

This implies the updating of the a priori mean of the ultimate claim C_{i,J},

    E[C_{i,J}] = \exp\{ \mu^{(i)} + \tau_i^2/2 \},    (4.67)
to the a posteriori mean

    E[C_{i,J} \,|\, C_{i,j}] = \exp\Big\{ (1-\alpha_{i,j})\,\big( \mu^{(i)} + \tau_i^2/2 \big) + \alpha_{i,j}\,\big( \log(C_{i,j}/\beta_j) + \sigma_j^2/2 \big) \Big\}.    (4.68)

- If we condition on all observations C_{i,0}, \ldots, C_{i,j}, the a posteriori distribution of C_{i,J} is again log-normal with parameters

    \mu_{post(i,j)} = \alpha^*_{i,j}\,\sum_{k=0}^{j} \frac{\sigma_k^{-2}}{\sum_{l=0}^{j}\sigma_l^{-2}}\,\Big( \log(C_{i,k}/\beta_k) + \frac{\sigma_k^2}{2} \Big) + \big( 1 - \alpha^*_{i,j} \big)\,\mu^{(i)},    (4.69)

with credibility weight

    \alpha^*_{i,j} = \frac{\sum_{k=0}^{j}\sigma_k^{-2}}{\sum_{k=0}^{j}\sigma_k^{-2} + \tau_i^{-2}},    (4.70)

and variance

    \tau^2_{post(i,j)} = \Big[ \sum_{k=0}^{j}\sigma_k^{-2} + \tau_i^{-2} \Big]^{-1}.    (4.71)

Observe that this is again a credibility weighted average between the a priori estimate \mu^{(i)} and the observations C_{i,0}, \ldots, C_{i,j}; the credibility weights are given by \alpha^*_{i,j}. Moreover, observe that this model does not have the Markov property; this is in contrast to our chain-ladder assumptions.
Proof of Lemma 4.21. The equations (4.62)-(4.63) easily follow from (4.59)-(4.60). Hence we only need to calculate the conditional distribution of C_{i,J} given C_{i,j}. From (4.61) and (4.63) we see that the joint density of (C_{i,j}, C_{i,J}) is given by

    f_{C_{i,j},C_{i,J}}(x,y) = \frac{1}{2\pi\,\tau_i\,\sigma_j\,x\,y} \, \exp\Big\{ -\frac{1}{2}\Big( \frac{\log(x) - \log(y) - \log(\beta_j) + \sigma_j^2/2}{\sigma_j} \Big)^2 - \frac{1}{2}\Big( \frac{\log(y) - \mu^{(i)}}{\tau_i} \Big)^2 \Big\}.    (4.72)
For z \in R we use the identity (completing the square in z)

    -\frac{1}{2}\Big( \frac{z-c}{\sigma} \Big)^2 - \frac{1}{2}\Big( \frac{z-\mu}{\tau} \Big)^2 = -\frac{\sigma^2+\tau^2}{2\,\sigma^2\tau^2}\,\Big( z - \frac{\tau^2 c + \sigma^2 \mu}{\sigma^2+\tau^2} \Big)^2 - \frac{(\mu-c)^2}{2\,(\sigma^2+\tau^2)}.    (4.73)

Applied to z = \log(y), this yields

    f_{C_{i,j},C_{i,J}}(x,y) = \frac{1}{2\pi\,\tau_i\,\sigma_j\,x\,y} \, \exp\Big\{ -\frac{\tau_i^2+\sigma_j^2}{2\,\tau_i^2\sigma_j^2}\,\Big( \log(y) - \frac{\tau_i^2\,c(x) + \sigma_j^2\,\mu^{(i)}}{\tau_i^2+\sigma_j^2} \Big)^2 - \frac{\big( \mu^{(i)} - c(x) \big)^2}{2\,(\tau_i^2+\sigma_j^2)} \Big\},    (4.74)

where

    c(x) = \log(x) - \log(\beta_j) + \sigma_j^2/2.    (4.75)

Hence the conditional density

    f_{C_{i,J}|C_{i,j}}(y|x) = \frac{f_{C_{i,j},C_{i,J}}(x,y)}{f_{C_{i,j}}(x)} = \frac{f_{C_{i,j},C_{i,J}}(x,y)}{\int f_{C_{i,j},C_{i,J}}(x,y)\,dy}    (4.76)

is again a log-normal density with parameters

    \mu_{post(i,j)} = \frac{\tau_i^2\,c(C_{i,j}) + \sigma_j^2\,\mu^{(i)}}{\tau_i^2+\sigma_j^2},    (4.77)

    \tau_{post(i,j)}^2 = \frac{\tau_i^2\,\sigma_j^2}{\tau_i^2+\sigma_j^2},    (4.78)

i.e.

    \mu_{post(i,j)} = \frac{\tau_i^2\,\big( \log(C_{i,j}) - \log(\beta_j) + \sigma_j^2/2 \big) + \sigma_j^2\,\mu^{(i)}}{\tau_i^2+\sigma_j^2}.    (4.79)
These a posteriori parameters, evaluated at j = I-i for I-J+1 \le i \le I, motivate the following estimator:

Estimator 4.23 Under Model Assumptions 4.20 we have the following estimator for the ultimate claim (i > I-J):

    \hat C_{i,J}^{Go} = \exp\big\{ \mu_{post(i,I-i)} + \tau_{post(i,I-i)}^2/2 \big\}.    (4.80)

Observe that we only condition on the last observation C_{i,I-i}; see also Remarks 4.22 on the Markov property.
Remark. We could also consider

    \hat C_{i,J}^{Go,2} = C_{i,I-i} + (1-\beta_{I-i}) \, \hat C_{i,J}^{Go}.    (4.81)

From a practical point of view \hat C_{i,J}^{Go,2} is more useful if we have an outlier on the diagonal. However, both estimators are not easily obtained in practice, since there are too many parameters which are difficult to estimate.
The parameters \mu^{(i)} and \tau_i are obtained from moment matching,

    \tau_i^2 = \log\big( 1 + Vco^2(C_{i,J}) \big), \qquad \mu^{(i)} = \log\big( E[C_{i,J}] \big) - \tau_i^2/2.    (4.82)

  i   \mu_i = E[C_{i,J}]   Vco(C_{i,J})   \mu^{(i)}   \tau_i    \beta_{I-i}    a^2     \sigma_{I-i}
  0      11653101             7.8%         16.27      7.80%      100.0%       0.17%      0.0%
  1      11367306             7.8%         16.24      7.80%       99.9%       0.17%      0.2%
  2      10962965             7.8%         16.21      7.80%       99.8%       0.17%      0.2%
  3      10616762             7.8%         16.17      7.80%       99.6%       0.17%      0.2%
  4      11044881             7.8%         16.21      7.80%       99.1%       0.17%      0.4%
  5      11480700             7.8%         16.25      7.80%       98.4%       0.17%      0.5%
  6      11413572             7.8%         16.25      7.80%       97.0%       0.17%      0.7%
  7      11126527             7.8%         16.22      7.80%       94.8%       0.17%      1.0%
  8      10986548             7.8%         16.21      7.80%       88.0%       0.17%      1.5%
  9      11618437             7.8%         16.27      7.80%       59.0%       0.17%      3.4%

Table 4.6: Parameters in the Log-normal/Log-normal model

We obtain the credibility weights and estimates for the ultimates in Table 4.7.
Using (4.66), (4.57) and \hat C_{i,J}^{CL} = C_{i,I-i}/\beta_{I-i} we obtain for Estimator 4.23 the following representation:

    \hat C_{i,J}^{Go} = \exp\Big\{ (1-\alpha_{i,I-i})\,\big( \mu^{(i)} + \tau_i^2/2 \big) + \alpha_{i,I-i}\,\Big( \log\frac{C_{i,I-i}}{\beta_{I-i}} + \sigma_{I-i}^2/2 \Big) \Big\}
                     = \mu_i^{1-\alpha_{i,I-i}} \, \Big( \exp\big\{ \log \hat C_{i,J}^{CL} + \sigma_{I-i}^2/2 \big\} \Big)^{\alpha_{i,I-i}}.    (4.83)
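The posterior update (4.64)-(4.65) and the estimator (4.83) can be checked numerically. The following minimal sketch (not from the original text) reproduces accident year i = 9 from the rounded parameter values tabulated above (\mu^{(9)} = 16.27, \tau = 7.8%, \beta_{I-9} = 0.59, a^2 = 0.17%); variable names are our own.

```python
import math

# Sketch of the Log-normal/Log-normal posterior for accident year i = 9.

mu_i, tau2 = 16.27, 0.078**2        # prior parameters mu^(i), tau_i^2
beta, a2 = 0.59, 0.0017             # beta_{I-i} and a^2
c_obs = 5675568.0                   # last observed diagonal value C_{9,I-9}

sigma2 = math.log(1 + (1 - beta) / beta * a2)      # sigma_{I-i}^2, cf. (4.62)
alpha = tau2 / (tau2 + sigma2)                     # credibility weight (4.66)

# posterior parameters, cf. (4.64)-(4.65)
mu_post = alpha * (math.log(c_obs / beta) + sigma2 / 2) + (1 - alpha) * mu_i
tau2_post = tau2 * sigma2 / (tau2 + sigma2)

c_go = math.exp(mu_post + tau2_post / 2)           # ultimate estimate (4.83)
print(round(1 - alpha, 3), round(c_go))            # about 0.16 and 9925132
```

The result deviates slightly from the tabulated ultimate 9925132 because \mu^{(9)} is only given to two decimals.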
  i     C_{i,I-i}   1-\alpha_{i,I-i}   \mu_{post(i,I-i)}   \tau_{post(i,I-i)}   \hat C^{Go}_{i,J}     Go         CL         BF
  0    11148124         0.0%               16.23                0.00%             11148124              0          0          0
  1    10648192         0.0%               16.18                0.15%             10663595          15403      15126      16124
  2    10635751         0.1%               16.18                0.20%             10662230          26479      26257      26998
  3     9724068         0.1%               16.09                0.24%              9759434          35365      34538      37575
  4     9786916         0.2%               16.11                0.38%              9874925          88009      85302      95434
  5     9935753         0.4%               16.13                0.51%             10097962         162209     156494     178024
  6     9282022         0.8%               16.08                0.71%              9582510         300487     286121     341305
  7     8256211         1.5%               15.98                0.94%              8737154         480942     449167     574089
  8     7648729         3.6%               15.99                1.48%              8766487        1117758    1043242    1318646
  9     5675568        16.0%               16.11                3.12%              9925132        4249564    3950815    4768384
 Total                                                                                            6476218    6047061    7356580

Table 4.7: Credibility weights, posterior parameters and estimated reserves in the Log-normal/Log-normal model
Hence we obtain a weighted average between the a priori estimate \mu_i = E[C_{i,J}] and the chain-ladder estimate \hat C_{i,J}^{CL} on the log-scale. This leads (together with the bias correction) to a multiplicative credibility formula. In Table 4.7 we see that the weights 1-\alpha_{i,I-i} given to the a priori mean \mu_i are rather low.
For the conditional mean square error of prediction we have

    msep_{C_{i,J}|C_{i,I-i}}\big( \hat C_{i,J}^{Go} \big) = Var(C_{i,J} \,|\, C_{i,I-i})
     = \exp\big\{ 2\,\mu_{post(i,I-i)} + \tau_{post(i,I-i)}^2 \big\} \, \Big( \exp\big\{ \tau_{post(i,I-i)}^2 \big\} - 1 \Big)
     = E[C_{i,J} \,|\, C_{i,I-i}]^2 \, \Big( \exp\big\{ \tau_{post(i,I-i)}^2 \big\} - 1 \Big)    (4.84)
     = \big( \hat C_{i,J}^{Go} \big)^2 \, \Big( \exp\big\{ \tau_{post(i,I-i)}^2 \big\} - 1 \Big).

This holds under the assumption that the parameters \beta_j, \mu^{(i)}, \tau_i and a^2 are known. Hence it is not directly comparable to the mean square error of prediction obtained from the chain-ladder model, since we have no canonical model for the estimation of these parameters and hence we cannot quantify the estimation error.
If we want to compare this mean square error of prediction to the ones obtained in
Corollary 4.17 we need to calculate the unconditional version:
    msep_{C_{i,J}}\big( \hat C_{i,J}^{Go} \big) = E\Big[ \big( C_{i,J} - \hat C_{i,J}^{Go} \big)^2 \Big] = E\big[ Var(C_{i,J} \,|\, C_{i,I-i}) \big]
     = E\Big[ \big( \hat C_{i,J}^{Go} \big)^2 \Big] \, \Big( \exp\big\{ \tau_{post(i,I-i)}^2 \big\} - 1 \Big).    (4.85)

Hence we need the distribution of \hat C_{i,J}^{CL} = C_{i,I-i}/\beta_{I-i} (cf. (4.83)). Using (4.74)
we obtain

    f_{C_{i,I-i}}(x) = \int_{R_+} f_{C_{i,I-i},C_{i,J}}(x,y)\,dy
     = \frac{1}{\sqrt{2\pi\,(\tau_i^2+\sigma_{I-i}^2)}} \; \frac{1}{x} \, \exp\Big\{ -\frac{1}{2}\,\frac{\big( \log(x/\beta_{I-i}) + \sigma_{I-i}^2/2 - \mu^{(i)} \big)^2}{\tau_i^2+\sigma_{I-i}^2} \Big\},    (4.86)

since the remaining integrand in y is a log-normal density which integrates to 1. This shows that the estimator \hat C_{i,J}^{CL} = C_{i,I-i}/\beta_{I-i} is log-normally distributed with parameters \mu^{(i)} - \sigma_{I-i}^2/2 and \tau_i^2 + \sigma_{I-i}^2. Moreover, the multiplicative reproductiveness of the log-normal distribution implies that for \gamma > 0

    \big( \hat C_{i,J}^{CL} \big)^\gamma \; \overset{(d)}{\sim} \; LN\Big( \gamma\,\big( \mu^{(i)} - \sigma_{I-i}^2/2 \big), \; \gamma^2\,\big( \tau_i^2 + \sigma_{I-i}^2 \big) \Big).    (4.87)

Hence, with (4.83) and (4.85),

    msep_{C_{i,J}}\big( \hat C_{i,J}^{Go} \big) = \mu_i^{2(1-\alpha_{i,I-i})} \, \exp\big\{ \alpha_{i,I-i}\,\sigma_{I-i}^2 \big\} \, E\Big[ \big( \hat C_{i,J}^{CL} \big)^{2\alpha_{i,I-i}} \Big] \, \Big( \exp\big\{ \tau_{post(i,I-i)}^2 \big\} - 1 \Big)
     = \exp\Big\{ 2\,\mu^{(i)} + (1-\alpha_{i,I-i})\,\tau_i^2 + 2\,\alpha_{i,I-i}^2\,\big( \tau_i^2 + \sigma_{I-i}^2 \big) \Big\} \, \Big( \exp\big\{ \tau_{post(i,I-i)}^2 \big\} - 1 \Big).    (4.88)

Observe that

    \alpha_{i,I-i}\,\big( \tau_i^2 + \sigma_{I-i}^2 \big) = \tau_i^2,    (4.89)

so that (4.88) simplifies to

    msep_{C_{i,J}}\big( \hat C_{i,J}^{Go} \big) = \exp\Big\{ 2\,\mu^{(i)} + (1+\alpha_{i,I-i})\,\tau_i^2 \Big\} \, \Big( \exp\big\{ (1-\alpha_{i,I-i})\,\tau_i^2 \big\} - 1 \Big).
108
msepC
i,J |Ci,Ii
0
1
2
3
4
5
6
7
8
9
total
Go
d
C
i,J
16391
21602
23714
37561
51584
68339
82516
129667
309586
359869
1/2
msepC
i,J
Go
d
C
i,J
`
ci (c )
msep1/2 R
i
17526
22279
25875
42139
58825
81644
105397
162982
363331
427850
17527
22282
25879
42153
58862
81745
105626
163852
372199
435814
Table 4.8: Mean square errors of prediction under the assumptions of Lemma 4.21
and in Model 4.14
4.2.3
Model Assumptions 4.26 (Poisson-gamma model)
There exist a claims development pattern (\gamma_j)_{0 \le j \le J} with \sum_{j=0}^{J} \gamma_j = 1 and positive constants \phi_i such that:
- Conditionally, given \Theta_i, the Z_{i,j} are independent and Poisson distributed, and the incremental variables X_{i,j} = \phi_i\,Z_{i,j} satisfy

    E[X_{i,j} \,|\, \Theta_i] = \Theta_i\,\gamma_j \quad and \quad Var(X_{i,j} \,|\, \Theta_i) = \phi_i\,\Theta_i\,\gamma_j.    (4.91)

- The pairs (\Theta_i, (X_{i,0}, \ldots, X_{i,J})) (i = 0, \ldots, I) are independent and \Theta_i is Gamma distributed with shape parameter a_i and scale parameter b_i.

Remarks 4.27
- See appendix, Sections B.1.2 and B.2.3, for the definition of the Poisson and Gamma distribution.
- Observe that, given \Theta_i, the expectation and variance of Z_{i,j} satisfy

    E[Z_{i,j} \,|\, \Theta_i] = Var(Z_{i,j} \,|\, \Theta_i) = \frac{\Theta_i\,\gamma_j}{\phi_i}.    (4.92)

- For the ultimate claim we have

    C_{i,J} = \sum_{j=0}^{J} X_{i,j} = \phi_i \sum_{j=0}^{J} Z_{i,j}    (4.94)

and

    E[C_{i,J} \,|\, \Theta_i] = \Theta_i,    (4.95)

this means that \Theta_i plays the role of the (unknown) total expected claim amount of accident year i. The Bayesian approach chosen tells us how we should combine the a priori expectation E[C_{i,J}] = a_i/b_i and the information D_I.
- This model is sometimes problematic in practical applications, since it assumes that there are no negative increments X_{i,j}. If we count the number of reported claims this may hold true. However, if X_{i,j} denotes incremental payments, we can have negative values: e.g. in motor hull insurance in old development periods one gets more money (via subrogation and repayments of deductibles) than one spends.
Lemma 4.28 The a posteriori distribution of \Theta_i, given (X_{i,0}, \ldots, X_{i,j}), is a Gamma distribution with updated parameters

    a^{post}_{i,j} = a_i + C_{i,j}/\phi_i    (4.96)

and

    b^{post}_{i,j} = b_i + \sum_{k=0}^{j} \gamma_k/\phi_i = b_i + \beta_j/\phi_i,    (4.97)

where \beta_j = \sum_{k=0}^{j} \gamma_k.
Remarks 4.29
- Since accident years are independent, it suffices to consider (X_{i,0}, \ldots, X_{i,j}) for the calculation of the a posteriori distribution of \Theta_i.
- We assume that a priori all accident years are equal (the \Theta_i are i.i.d.). After we have a set of observations D_I, we obtain a posteriori risk characteristics which differ according to the observations.
- Model 4.26 belongs to the well-known class of exponential dispersion models with associated conjugates (see e.g. Bühlmann-Gisler [18], Subsection 2.5.1, and Subsection 4.2.4 below).
Using Lemma 4.28 we obtain for the a posteriori expectation

    E[\Theta_i \,|\, D_I] = \frac{a^{post}_{i,I-i}}{b^{post}_{i,I-i}} = \frac{a_i + C_{i,I-i}/\phi_i}{b_i + \beta_{I-i}/\phi_i}
     = \frac{b_i}{b_i + \beta_{I-i}/\phi_i} \cdot \frac{a_i}{b_i} + \Big( 1 - \frac{b_i}{b_i + \beta_{I-i}/\phi_i} \Big) \cdot \frac{C_{i,I-i}}{\beta_{I-i}}.    (4.98)
In fact we can specify the a posteriori distribution of (C_{i,J} - C_{i,I-i})/\phi_i, given D_I. It holds for k \in \{0, 1, \ldots\} that

    P\big[ (C_{i,J} - C_{i,I-i})/\phi_i = k \,\big|\, D_I \big]
     = \int_{R_+} e^{-(1-\beta_{I-i})\,\theta} \, \frac{\big( (1-\beta_{I-i})\,\theta \big)^k}{k!} \; \frac{\big( b^{post}_{i,I-i} \big)^{a^{post}_{i,I-i}}}{\Gamma\big( a^{post}_{i,I-i} \big)} \, \theta^{a^{post}_{i,I-i}-1} \, e^{-b^{post}_{i,I-i}\,\theta} \, d\theta
     = \frac{\Gamma\big( k + a^{post}_{i,I-i} \big)}{\Gamma\big( a^{post}_{i,I-i} \big)\,k!} \; \Big( \frac{b^{post}_{i,I-i}}{b^{post}_{i,I-i} + 1 - \beta_{I-i}} \Big)^{a^{post}_{i,I-i}} \, \Big( \frac{1-\beta_{I-i}}{b^{post}_{i,I-i} + 1 - \beta_{I-i}} \Big)^{k},    (4.99)

where in the middle step we have recognized, up to normalization, the density of a Gamma distribution with parameters k + a^{post}_{i,I-i} and b^{post}_{i,I-i} + 1 - \beta_{I-i}, which integrates to 1. This is a Negative binomial distribution with parameters r = a^{post}_{i,I-i} and p = b^{post}_{i,I-i} / \big( b^{post}_{i,I-i} + 1 - \beta_{I-i} \big) (see appendix, Section B.1.3).
Proof of Lemma 4.28. Using (4.92) we obtain for the conditional density of (X_{i,0}, \ldots, X_{i,j}), given \Theta_i = \theta, that

    f_{X_{i,0},\ldots,X_{i,j}|\Theta_i}(x_0, \ldots, x_j \,|\, \theta) = \prod_{k=0}^{j} \exp\{-\theta\,\gamma_k/\phi_i\} \, \frac{(\theta\,\gamma_k/\phi_i)^{x_k/\phi_i}}{(x_k/\phi_i)!}.    (4.100)

Hence the joint density of \Theta_i and the observations is proportional to

    \prod_{k=0}^{j} \exp\{-\theta\,\gamma_k/\phi_i\} \, \frac{(\theta\,\gamma_k/\phi_i)^{x_k/\phi_i}}{(x_k/\phi_i)!} \; \cdot \; \frac{b_i^{a_i}}{\Gamma(a_i)} \, \theta^{a_i-1} \, e^{-b_i\,\theta}.    (4.101)

This shows that the a posteriori distribution of \Theta_i, given (X_{i,0}, \ldots, X_{i,j}), is again a Gamma distribution with updated parameters

    a^{post}_{i,j} = a_i + C_{i,j}/\phi_i,    (4.102)

    b^{post}_{i,j} = b_i + \sum_{k=0}^{j} \gamma_k/\phi_i.    (4.103)
At the same time we obtain for the Bayesian estimator of the ultimate claim

    E[C_{i,J} \,|\, D_I] = C_{i,I-i} + (1-\beta_{I-i}) \, E[\Theta_i \,|\, D_I].    (4.104)

Together with (4.98) this motivates the following estimator:

Estimator 4.30 (Poisson-gamma model, Verrall [81, 82]) Under Model Assumptions 4.26 we have the following estimator for the ultimate claim E[C_{i,J} \,|\, D_I]:

    \hat C_{i,J}^{PoiGa} = C_{i,I-i} + (1-\beta_{I-i}) \, \Big[ \frac{b_i}{b_i + \beta_{I-i}/\phi_i} \cdot \frac{a_i}{b_i} + \Big( 1 - \frac{b_i}{b_i + \beta_{I-i}/\phi_i} \Big) \, \frac{C_{i,I-i}}{\beta_{I-i}} \Big]    (4.105)

for I-J+1 \le i \le I.
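The conjugate update (4.96)-(4.97) and the estimator (4.105) are easy to reproduce. The following minimal sketch (not from the original text; variable names are our own) uses the rounded a priori parameters tabulated below for accident year i = 9.

```python
# Sketch of the Poisson-gamma estimator (4.105) for accident year i = 9.

a, b = 400.0, 0.0000344      # a priori Gamma parameters a_9, b_9 (rounded)
phi = 41826.0                # dispersion phi_9
beta = 0.59                  # beta_{I-9}, rounded
c_obs = 5675568.0            # observed diagonal value C_{9,I-9}

# posterior parameters, cf. (4.96)-(4.97)
a_post = a + c_obs / phi
b_post = b + beta / phi
theta_post = a_post / b_post                 # E[Theta_9 | D_I], cf. (4.98)

c_poiga = c_obs + (1 - beta) * theta_post    # ultimate estimator (4.105)
print(round(theta_post), round(c_poiga))     # about 11039755 and 10206452
```

The small deviations from the tabulated values come from the rounding of b_9 and \beta_{I-9}.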
Example 4.31 (Poisson-gamma model)
We revisit the data set given in Example 2.7. For the a priori parameters we make the same choices as in Example 4.19 (see Table 4.4). Since \Theta_i is Gamma distributed with shape parameter a_i and scale parameter b_i we have

    E[\Theta_i] = \frac{a_i}{b_i},    (4.106)

    Vco(\Theta_i) = a_i^{-1/2};    (4.107)

i.e. Vco(\Theta_i) = 5% leads to a_i = 400, and b_i is then determined by the a priori mean E[\Theta_i] = \mu_i.    (4.108)
Moreover, the credibility weight given to the observation-based estimate C_{i,I-i}/\beta_{I-i} in (4.98) is

    \gamma_{i,I-i} = \frac{\beta_{I-i}/\phi_i}{b_i + \beta_{I-i}/\phi_i}.    (4.109)
  i      E[\Theta_i]    Vco(\Theta_i)   Vco(C_{i,J})    a_i       b_i         \phi_i
  0      11653101          5.00%           7.8%         400     0.00343%      41951
  1      11367306          5.00%           7.8%         400     0.00352%      40922
  2      10962965          5.00%           7.8%         400     0.00365%      39467
  3      10616762          5.00%           7.8%         400     0.00377%      38220
  4      11044881          5.00%           7.8%         400     0.00362%      39762
  5      11480700          5.00%           7.8%         400     0.00348%      41331
  6      11413572          5.00%           7.8%         400     0.00350%      41089
  7      11126527          5.00%           7.8%         400     0.00360%      40055
  8      10986548          5.00%           7.8%         400     0.00364%      39552
  9      11618437          5.00%           7.8%         400     0.00344%      41826

Table 4.9: A priori parameters in the Poisson-gamma model
The credibility weight \gamma_{i,I-i} can be rewritten as

    \gamma_{i,I-i} = \frac{\beta_{I-i}}{\beta_{I-i} + \phi_i\,b_i} = \frac{\beta_{I-i}}{\beta_{I-i} + \dfrac{E[Var(C_{i,I-i}|\Theta_i)]}{\beta_{I-i}\,Var(\Theta_i)}}.    (4.110)

The term E[Var(C_{i,I-i}|\Theta_i)] / \big( \beta_{I-i}\,Var(\Theta_i) \big) = \phi_i\,b_i plays the role of a credibility coefficient (see also (4.38)).
                                                                                            estimated reserves
  i     C_{i,I-i}    \beta_{I-i}   \gamma_{i,I-i}   a^{post}_{i,I-i}/b^{post}_{i,I-i}   \hat C^{PoiGa}_{i,J}    PoiGa        CL         BF
  0    11148124       100.0%          41.0%                  11446143                        11148124               0           0          0
  1    10648192        99.9%          40.9%                  11079028                        10663907           15715       15126      16124
  2    10635751        99.8%          40.9%                  10839802                        10662446           26695       26257      26998
  3     9724068        99.6%          40.9%                  10265794                         9760401           36333       34538      37575
  4     9786916        99.1%          40.8%                  10566741                         9878219           91303       85302      95434
  5     9935753        98.4%          40.6%                  10916902                        10105034          169281      156494     178024
  6     9282022        97.0%          40.3%                  10670762                         9601115          319093      286121     341305
  7     8256211        94.8%          39.7%                  10165120                         8780696          524484      449167     574089
  8     7648729        88.0%          37.9%                  10116206                         8862913         1214184     1043242    1318646
  9     5675568        59.0%          29.0%                  11039755                        10206452         4530884     3950815    4768384
 Total                                                                                                        6927973     6047061    7356580

Table 4.10: Estimated reserves in the Poisson-gamma model
For the conditional mean square error of prediction we have

    msep_{C_{i,J}|D_I}\big( \hat C_{i,J}^{PoiGa} \big) = E\Big[ \Big( \sum_{j=I-i+1}^{J} X_{i,j} - (1-\beta_{I-i})\,E[\Theta_i \,|\, D_I] \Big)^2 \,\Big|\, D_I \Big]
     = E\Big[ \Big( \sum_{j=I-i+1}^{J} X_{i,j} - \sum_{j=I-i+1}^{J} \gamma_j\,E[\Theta_i \,|\, D_I] \Big)^2 \,\Big|\, D_I \Big]    (4.111)

(cf. (4.104)-(4.105)). Since for j > I-i

    E[X_{i,j} \,|\, D_I] = E\big[ E[X_{i,j} \,|\, \Theta_i, D_I] \,\big|\, D_I \big] = E\big[ E[X_{i,j} \,|\, \Theta_i] \,\big|\, D_I \big] = \gamma_j\,E[\Theta_i \,|\, D_I],    (4.112)

we have that

    msep_{C_{i,J}|D_I}\big( \hat C_{i,J}^{PoiGa} \big) = Var\Big( \sum_{j=I-i+1}^{J} X_{i,j} \,\Big|\, D_I \Big)    (4.113)
     = E\Big[ Var\Big( \sum_{j=I-i+1}^{J} X_{i,j} \,\Big|\, \Theta_i \Big) \,\Big|\, D_I \Big] + Var\Big( E\Big[ \sum_{j=I-i+1}^{J} X_{i,j} \,\Big|\, \Theta_i \Big] \,\Big|\, D_I \Big)
     = \phi_i\,(1-\beta_{I-i})\,E[\Theta_i \,|\, D_I] + (1-\beta_{I-i})^2\,Var(\Theta_i \,|\, D_I).
With Lemma 4.28 this leads to the following corollary:

Corollary 4.32 Under Model Assumptions 4.26 the conditional mean square error of prediction is given by

    msep_{C_{i,J}|D_I}\big( \hat C_{i,J}^{PoiGa} \big) = \phi_i\,(1-\beta_{I-i})\,\frac{a^{post}_{i,I-i}}{b^{post}_{i,I-i}} + (1-\beta_{I-i})^2\,\frac{a^{post}_{i,I-i}}{\big( b^{post}_{i,I-i} \big)^2}    (4.115)

for I-J+1 \le i \le I.
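The two terms of (4.115) (conditional process variance and parameter uncertainty) can be evaluated directly. This minimal sketch (not from the original text; variable names are our own) reproduces the i = 9 entry of Table 4.11 from the rounded inputs.

```python
import math

# Sketch of the conditional MSEP formula (4.115) for accident year i = 9.

phi, beta = 41826.0, 0.59
a_post = 400.0 + 5675568.0 / phi          # a_9 + C_{9,I-9} / phi_9
b_post = 0.0000344 + beta / phi           # b_9 + beta_{I-9} / phi_9

process = phi * (1 - beta) * a_post / b_post        # phi (1-beta) E[Theta|D_I]
parameter = (1 - beta)**2 * a_post / b_post**2      # (1-beta)^2 Var(Theta|D_I)
msep_half = math.sqrt(process + parameter)
print(round(msep_half))          # close to the table value 477318
```

The process variance term clearly dominates the parameter uncertainty term here.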
Remark. Observe that we have assumed that the parameters a_i, b_i, \phi_i and \gamma_j are known. If these need to be estimated, we obtain an additional term in the MSEP calculation which corresponds to the parameter estimation error.
The unconditional mean square error of prediction can then easily be calculated. We have

    msep_{C_{i,J}}\big( \hat C_{i,J}^{PoiGa} \big) = E\Big[ msep_{C_{i,J}|D_I}\big( \hat C_{i,J}^{PoiGa} \big) \Big]    (4.116)
     = \phi_i\,(1-\beta_{I-i})\,\frac{E\big[ a^{post}_{i,I-i} \big]}{b^{post}_{i,I-i}} + (1-\beta_{I-i})^2\,\frac{E\big[ a^{post}_{i,I-i} \big]}{\big( b^{post}_{i,I-i} \big)^2},

and using E[C_{i,I-i}] = \beta_{I-i}\,a_i/b_i,

    msep_{C_{i,J}}\big( \hat C_{i,J}^{PoiGa} \big) = \phi_i\,(1-\beta_{I-i})\,\frac{a_i}{b_i} \cdot \frac{1 + \phi_i\,b_i}{\phi_i\,b_i + \beta_{I-i}}.    (4.117)

Hence we obtain Table 4.11 for the prediction errors.
              msep^{1/2}_{C_{i,J}|C_{i,I-i}}(\cdot)                 msep^{1/2}_{C_{i,J}}(\cdot)
  i     \hat C^{PoiGa}_{i,J}   \hat C^{Go}_{i,J}      \hat C^{PoiGa}_{i,J}   \hat C^{Go}_{i,J}    \hat R_i(c_i)
  0             0                     0                       0                     0                   0
  1         25367                 16391                   25695                 18832               17527
  2         32475                 21602                   32659                 23940               22282
  3         37292                 23714                   37924                 27804               25879
  4         60359                 37561                   61710                 45276               42153
  5         83912                 51584                   86052                 63200               58862
  6        115212                 68339                  119155                 87704               81745
  7        146500                 82516                  153272                113195              105626
  8        224738                129667                  234207                174906              163852
  9        477318                309586                  489668                388179              372199
total      571707                359869                  588809                457739              435814

Table 4.11: Mean square errors of prediction in the Poisson-gamma model, the Log-normal/Log-normal model and in Model 4.14
We have already seen in Table 4.10 that the Poisson-gamma reserves are closer to the Bornhuetter-Ferguson estimate (this stands in contrast to the other methods presented in this chapter). Table 4.11 shows that the prediction error is substantially larger in the Poisson-gamma model than in the other models (comparable to the estimation error in \hat R_i(0) in Table 4.5). This suggests that in the present case the Poisson-gamma method is not an appropriate method.
4.2.4

In the subsection above we have seen that in the Poisson-gamma model \Theta_i has as a posteriori distribution again a Gamma distribution with updated parameters. This indicates that, using a smart choice of distributions, we are able to calculate the a posteriori distribution explicitly. We now generalize the Poisson-gamma model to the exponential dispersion family (EDF), and we look for its associated conjugates. These are standard models in Bayesian inference; for literature we refer e.g. to Bernardo-Smith [9]. Similar ideas have been applied for tariffication and pricing (see Bühlmann-Gisler [18], Chapter 2); we transform these ideas to the reserving context (see also Wüthrich [89]).
Model Assumptions 4.33 (Exponential dispersion model)
- There exists a claims development pattern (\beta_j)_{0 \le j \le J} with \beta_J = 1, \gamma_0 = \beta_0 \ne 0 and \gamma_j = \beta_j - \beta_{j-1} \ne 0 for j \in \{1, \ldots, J\}.
- Conditionally, given \Theta_i, the X_{i,j} (0 \le i \le I, 0 \le j \le J) are independent and X_{i,j}/(\gamma_j\,\mu_i) \overset{(d)}{\sim} F^{(\Theta_i)}_{i,j} with

    dF^{(\Theta_i)}_{i,j}(x) = a\Big( x, \frac{\sigma^2}{w_{i,j}} \Big) \, \exp\Big\{ \frac{x\,\Theta_i - b(\Theta_i)}{\sigma^2 / w_{i,j}} \Big\} \, d\nu(x),    (4.118)

where \nu is a suitable \sigma-finite measure on R, b(\cdot) is some real-valued twice-differentiable function of \Theta_i, \mu_i > 0, \sigma^2 and w_{i,j} > 0 are some real-valued constants, and F^{(\Theta_i)}_{i,j} is a probability distribution on R.
- The random vectors (\Theta_i, (X_{i,0}, \ldots, X_{i,J})) (i = 0, \ldots, I) are independent and the \Theta_i are real-valued random variables with densities (w.r.t. the Lebesgue measure)

    u_{x_0,\tau^2}(\theta) = d(x_0, \tau^2) \, \exp\Big\{ \frac{x_0\,\theta - b(\theta)}{\tau^2} \Big\},    (4.119)

with x_0 = 1 and \tau^2 > 0.

Remarks 4.34
- In the following the measure \nu is given by the Lebesgue measure or by the counting measure.
Lemma 4.35 Under Model Assumptions 4.33 the a posteriori distribution of \Theta_i, given (X_{i,0}, \ldots, X_{i,j}), is again of the form (4.119) with updated parameters

    \tau^2_{post,j} = \sigma^2 \, \Big[ \frac{\sigma^2}{\tau^2} + \sum_{k=0}^{j} w_{i,k} \Big]^{-1},    (4.120)

    x^{(i)}_{post,j} = \frac{\tau^2_{post,j}}{\sigma^2} \, \Big[ \frac{\sigma^2}{\tau^2}\,x_0 + \sum_{k=0}^{j} w_{i,k}\,Y^{(j)}_i \Big],    (4.121)

where

    Y^{(j)}_i = \sum_{k=0}^{j} \frac{w_{i,k}}{\sum_{l=0}^{j} w_{i,l}} \cdot \frac{X_{i,k}}{\gamma_k\,\mu_i}.    (4.122)
Proof. Define Y_{i,j} = X_{i,j}/(\gamma_j\,\mu_i). The joint distribution of (\Theta_i, Y_{i,0}, \ldots, Y_{i,j}) is given by

    f_{\Theta_i,Y_{i,0},\ldots,Y_{i,j}}(\theta, y_0, \ldots, y_j) = f_{Y_{i,0},\ldots,Y_{i,j}|\Theta_i}(y_0, \ldots, y_j \,|\, \theta) \; u_{1,\tau^2}(\theta)
     = d(1, \tau^2) \, \exp\Big\{ \frac{\theta - b(\theta)}{\tau^2} \Big\} \; \prod_{k=0}^{j} a\Big( y_k, \frac{\sigma^2}{w_{i,k}} \Big) \, \exp\Big\{ \frac{y_k\,\theta - b(\theta)}{\sigma^2 / w_{i,k}} \Big\}.    (4.123)

Hence the conditional distribution of \Theta_i, given X_{i,0}, \ldots, X_{i,j}, is proportional to

    \exp\Big\{ \Big[ \frac{1}{\tau^2} + \sum_{k=0}^{j} \frac{w_{i,k}\,X_{i,k}}{\sigma^2\,\gamma_k\,\mu_i} \Big]\,\theta - \Big[ \frac{1}{\tau^2} + \sum_{k=0}^{j} \frac{w_{i,k}}{\sigma^2} \Big]\,b(\theta) \Big\}.    (4.124)

This finishes the proof of the lemma.

Remarks 4.36
- Lemma 4.35 states that the distribution defined by density (4.119) is a conjugated distribution to the distribution given by (4.118). This means that the a posteriori distribution of \Theta_i, given X_{i,0}, \ldots, X_{i,j}, is again of the type (4.119) with updated parameters x^{(i)}_{post,j} and \tau^2_{post,j}.
- From Lemma 4.35 we can calculate the distribution of (Y_{i,I-i+1}, \ldots, Y_{i,J}), given D_I. First we remark that different accident years are independent; hence we can restrict ourselves to the observations Y_{i,0}, \ldots, Y_{i,I-i}. Then the a posteriori distribution is given by

    \int \prod_{j=I-i+1}^{J} dF^{(\theta)}_{i,j}(y_j) \; u_{x^{(i)}_{post,I-i},\,\tau^2_{post,I-i}}(\theta) \, d\theta.    (4.125)
If \exp\{ (x\,\theta - b(\theta))/\sigma^2 \} disappears on the boundary of \Theta_i for all x and \sigma^2, then

    E[X_{i,j}] = \gamma_j\,\mu_i\,E[b'(\Theta_i)] = \gamma_j\,\mu_i,    (4.128)

    E\big[ b'(\Theta_i) \,\big|\, X_{i,0}, \ldots, X_{i,j} \big] = \alpha_{i,j}\,Y^{(j)}_i + (1-\alpha_{i,j}) \cdot 1,    (4.129)

where

    \alpha_{i,j} = \frac{\sum_{k=0}^{j} w_{i,k}}{\sum_{k=0}^{j} w_{i,k} + \sigma^2/\tau^2}.    (4.130)

For a proof of these well-known statements about the exponential dispersion family with associated conjugates we refer to Bühlmann-Gisler [18].
Estimator 4.39 Under Model Assumptions 4.33 we have the following estimators for the increments E[X_{i,I-i+k} \,|\, D_I] and the ultimate claims E[C_{i,J} \,|\, D_I]:

    \hat X_{i,I-i+k}^{EDF} = \gamma_{I-i+k}\,\mu_i\,\widehat{\mu(\Theta_i)},    (4.132)

    \hat C_{i,J}^{EDF} = C_{i,I-i} + (1-\beta_{I-i})\,\mu_i\,\widehat{\mu(\Theta_i)},    (4.133)

where \mu(\theta) = b'(\theta) and \widehat{\mu(\Theta_i)} = E[b'(\Theta_i) \,|\, D_I].    (4.134)
Theorem 4.40 (Bayesian estimator) Under Model Assumptions 4.33 the estimators \widehat{\mu(\Theta_i)}, \hat X_{i,I-i+k}^{EDF} and \hat C_{i,J}^{EDF} are D_I-measurable and minimize the conditional mean square errors msep_{\mu(\Theta_i)|D_I}(\cdot), msep_{X_{i,I-i+k}|D_I}(\cdot) and msep_{C_{i,J}|D_I}(\cdot), respectively, for I-J+1 \le i \le I. I.e. these estimators are Bayesian w.r.t. D_I and minimize the quadratic loss function (L^2(P)-norm).

Proof. The D_I-measurability is clear. But then the claim for \widehat{\mu(\Theta_i)} is clear, since the conditional expectation minimizes the mean square error given D_I (see Theorem 2.5 in [18]). Due to our independence assumptions we have

    E[X_{i,I-i+k} \,|\, D_I] = E\big[ E[X_{i,I-i+k} \,|\, \Theta_i] \,\big|\, D_I \big] = \gamma_{I-i+k}\,\mu_i\,\widehat{\mu(\Theta_i)},    (4.135)

    E[C_{i,J} \,|\, D_I] = C_{i,I-i} + (1-\beta_{I-i})\,\mu_i\,\widehat{\mu(\Theta_i)}.    (4.136)
In the sequel we use the normalization

    E[b'(\Theta_i)] = 1;    (4.137)

a general model can always be rescaled to this form, the rescaling satisfying

    Var(b'(\Theta_i)) = m_b\,\tau^2    (4.139)

for some constant m_b > 0. Since both \sigma^2 and \tau^2 are multiplied by m_b, the credibility weights \alpha_{i,j} do not change under this transformation. Hence we assume (4.137) for the rest of this work.
In Section 4.2.4 we have not specified the weights w_{i,j}. In Mack [47] there is a discussion of choosing appropriate weights (Assumption (A4) in Mack [47]). In fact we could choose a design matrix (w_{i,j}), which gives a whole family of models. We do not discuss this further here; we make a canonical choice (which is favoured in many applications) that has the nice side effect that we obtain a natural mixture between the chain-ladder estimate and the Bornhuetter-Ferguson estimate.

Model Assumptions 4.41
In addition to Model Assumptions 4.33 and (4.137) we assume that w_{i,j} = \gamma_j\,\mu_i for all i = 0, \ldots, I and j = 0, \ldots, J, and that \exp\{ (x_0\,\theta - b(\theta))/\tau^2 \} disappears on the boundary of \Theta_i for all x_0 and \tau^2.

Hence we have \sum_{k=0}^{j} w_{i,k} = \beta_j\,\mu_i. This immediately implies:

Corollary 4.42 Under Model Assumptions 4.41 we have for all i = 0, \ldots, I that

    \widehat{\mu(\Theta_i)} = \alpha_{i,I-i}\,\frac{C_{i,I-i}}{\beta_{I-i}\,\mu_i} + (1-\alpha_{i,I-i}),    (4.140)

where

    \alpha_{i,I-i} = \frac{\beta_{I-i}}{\beta_{I-i} + \dfrac{\sigma^2}{\mu_i\,\tau^2}}.    (4.141)
Remark. Compare the weight \alpha_{i,I-i} from (4.141) to \gamma_{i,I-i} from (4.110). In the notation of Subsection 4.2.3 we have the credibility coefficient

    \kappa_i = \phi_i\,b_i = \frac{E[Var(C_{i,I-i}|\Theta_i)]}{\beta_{I-i}\,Var(\Theta_i)},    (4.142)

and in the notation of this subsection we have

    \kappa_i = \frac{\sigma^2/\mu_i}{\tau^2}.    (4.143)

This shows that the estimators \hat C_{i,J}^{PoiGa} and \hat C_{i,J}^{EDF} give the same estimated reserve (the Poisson-gamma model is an example of the exponential dispersion family with associated conjugates).
For the choice of the parameters we calculate the variance of the ultimate claim. Under Model Assumptions 4.41 (with E[b''(\Theta_i)] = 1 and Var(b'(\Theta_i)) = \tau^2) we have

    Var(C_{i,J}) = E\big[ Var(C_{i,J}|\Theta_i) \big] + Var\big( E[C_{i,J}|\Theta_i] \big) = \sigma^2\,\mu_i + \tau^2\,\mu_i^2,    (4.144)

hence

    Vco^2(C_{i,J}) = \frac{\sigma^2}{\mu_i} + \tau^2.    (4.145)

This gives the credibility weight

    \alpha_{i,I-i} = \frac{\beta_{I-i}}{\beta_{I-i} + \dfrac{\sigma^2}{\mu_i\,\tau^2}} = \frac{\beta_{I-i}}{\beta_{I-i} + \dfrac{Vco^2(C_{i,J})}{\tau^2} - 1}.    (4.146)

With \tau = 5.00% (= Vco(U_i)) and (\sigma^2/\mu_i)^{1/2} = 6.00% we obtain \kappa_i = \sigma^2/(\mu_i\,\tau^2) = 1.4400 and Vco(C_{i,J}) = 7.8%, as in the examples above.
  i     \tau    (\sigma^2/\mu_i)^{1/2}   \kappa_i    \alpha_{i,I-i}   \widehat{\mu(\Theta_i)}   reserves EDF
  0    5.00%          6.00%               1.4400         41.0%              0.9822                     0
  1    5.00%          6.00%               1.4400         40.9%              0.9746                 15715
  2    5.00%          6.00%               1.4400         40.9%              0.9888                 26695
  3    5.00%          6.00%               1.4400         40.9%              0.9669                 36333
  4    5.00%          6.00%               1.4400         40.8%              0.9567                 91303
  5    5.00%          6.00%               1.4400         40.6%              0.9509                169281
  6    5.00%          6.00%               1.4400         40.3%              0.9349                319093
  7    5.00%          6.00%               1.4400         39.7%              0.9136                524484
  8    5.00%          6.00%               1.4400         37.9%              0.9208               1214184
  9    5.00%          6.00%               1.4400         29.0%              0.9502               4530884
 Total                                                                                           6927973

Table 4.12: Estimated reserves in the Exponential dispersion model with associate conjugate

The estimates in Table 4.10 and Table 4.12 lead to the same result.
Moreover, we see that the Bayesian estimate μ̂(Θ_i) is below 1 for all accident years i (see Table 4.12). This suggests (once more) that the choices of the a priori estimates μ_i for the ultimate claims were too conservative.
Conclusion 1. Corollary 4.42 implies that the estimator Ĉ^{EDF}_{i,J} gives the optimal mixture between the Bornhuetter-Ferguson and the chain-ladder estimates in the EDF with associate conjugate: Assume that γ_j and f_j are identified by (4.3) and set Ĉ^{CL}_{i,J} = C_{i,I−i}/β_{I−i}. Then we have that

    Ĉ^{EDF}_{i,J} = C_{i,I−i} + (1 − β_{I−i}) [ α_{i,I−i} C_{i,I−i}/β_{I−i} + (1 − α_{i,I−i}) μ_i ]
                  = C_{i,I−i} + (1 − β_{I−i}) [ α_{i,I−i} Ĉ^{CL}_{i,J} + (1 − α_{i,I−i}) μ_i ]      (4.147)
                  = C_{i,I−i} + (1 − β_{I−i}) S_i(α_{i,I−i}),

where S_i(·) is the function defined in (4.1). Hence we have the mixture

    Ĉ^{EDF}_{i,J} = α_{i,I−i} Ĉ^{CL}_{i,J} + (1 − α_{i,I−i}) Ĉ^{BF}_{i,J}      (4.148)
between the CL estimate and the BF estimate. Moreover, it minimizes the conditional MSEP in the exponential dispersion model with associate conjugate. Observe that

    α_{i,I−i} = β_{I−i} / ( β_{I−i} + κ_i ),      (4.149)

where the credibility coefficient κ_i was defined in (4.143). If we choose κ_i = 0 we obtain the chain-ladder estimate, and if we choose κ_i = ∞ we obtain the Bornhuetter-Ferguson reserve.
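The mixture (4.148)-(4.149) can be sketched in a few lines of code. This is a minimal illustration with made-up numbers, not data from the text:

```python
def edf_ultimate(c_obs, beta, mu, kappa):
    """Credibility mixture of the CL and BF ultimates.

    c_obs : latest observed cumulative claim C_{i,I-i}
    beta  : cumulative development pattern beta_{I-i} in (0, 1]
    mu    : a priori expected ultimate mu_i
    kappa : credibility coefficient kappa_i
    """
    alpha = beta / (beta + kappa)           # credibility weight (4.149)
    cl = c_obs / beta                       # chain-ladder ultimate
    bf = c_obs + (1.0 - beta) * mu          # Bornhuetter-Ferguson ultimate
    return alpha * cl + (1.0 - alpha) * bf  # mixture (4.148)

# kappa = 0 reproduces the chain-ladder estimate,
# kappa -> infinity the Bornhuetter-Ferguson estimate.
```

For kappa = 0 the function returns exactly C_{i,I-i}/beta, and for very large kappa it approaches the BF ultimate, as stated above.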
Conclusion 2. Using (4.135) we find for all I − i ≤ j < J that

    E[ C_{i,j+1} | C_{i,0}, ..., C_{i,I−i} ]
        = C_{i,I−i} + E[ Σ_{l=I−i+1}^{j+1} X_{i,l} | C_{i,0}, ..., C_{i,I−i} ]
        = C_{i,I−i} + Σ_{l=I−i+1}^{j+1} γ_l μ_i μ̂(Θ_i)      (4.150)
        = ( 1 + (β_{j+1} − β_{I−i})/β_{I−i} · α_{i,I−i} ) C_{i,I−i} + (β_{j+1} − β_{I−i}) (1 − α_{i,I−i}) μ_i.

In the second step we explicitly use that we have an exact Bayesian estimator. (4.150) does not hold true in the Bühlmann-Straub model (see Section 4.3 below). Formula (4.150) suggests that the EDF with associate conjugate is a linear mixture of the chain-ladder model and the Bornhuetter-Ferguson model. If we choose the credibility coefficient κ_i = 0, we obtain

    E[ C_{i,j+1} | C_{i,0}, ..., C_{i,j} ] = ( 1 + (β_{j+1} − β_j)/β_j ) C_{i,j} = f_j C_{i,j},      (4.151)

if we assume (4.3). This is exactly the chain-ladder assumption (2.1). If we choose κ_i = ∞, then α_{i,I−i} = 0 and we obtain

    E[ C_{i,J} | C_{i,0}, ..., C_{i,I−i} ] = C_{i,I−i} + (1 − β_{I−i}) μ_i,      (4.152)

which is Model 2.8 that we used to motivate the Bornhuetter-Ferguson estimate Ĉ^{BF}_{i,J}.
Under Model Assumptions 4.41 we obtain for the conditional mean square error of prediction

    msep_{μ(Θ_i)|D_I}( μ̂(Θ_i) ) = E[ ( μ̂(Θ_i) − μ(Θ_i) )² | D_I ] = Var( μ(Θ_i) | D_I ),      (4.153)

and for the unconditional mean square error of prediction

    msep_{μ(Θ_i)}( μ̂(Θ_i) ) = E[ Var( μ(Θ_i) | D_I ) ]      (4.154)
        = α_{i,I−i}² σ²/(β_{I−i} μ_i) + (1 − α_{i,I−i})² τ²
        = (1 − α_{i,I−i}) τ².      (4.155)

For the mean square error of prediction of the ultimate claim we therefore obtain

    msep_{C_{i,J}}( Ĉ^{EDF}_{i,J} ) = Σ_{k=I−i+1}^{J} E[ Var(X_{i,k} | Θ_i) ] + μ_i² (1 − β_{I−i})² msep( μ̂(Θ_i) )
        = μ_i² [ (1 − β_{I−i})² (1 − α_{i,I−i}) τ² + (1 − β_{I−i}) σ²/μ_i ]      (4.156)
        = (1 − β_{I−i}) μ_i σ² ( 1 + (1 − β_{I−i}) μ_i τ² / (β_{I−i} μ_i τ² + σ²) ).      (4.157)

This is the same value as in the Poisson-gamma case, see (4.116) and Table 4.11.
For the conditional mean square error of prediction of the estimate of C_{i,J}, one needs to calculate

    Var( μ(Θ_i) | D_I ) = Var( b′(Θ_i) | D_I ),      (4.158)

where Θ_i, given D_I, has the posterior distribution given by Lemma 4.35. We omit its further calculation.
Estimation of the claims development pattern γ_j. The pattern is estimated via the chain-ladder estimates, i.e.

    γ̂_j = β̂_j^{(CL)} − β̂_{j−1}^{(CL)}.      (4.159)

At the current stage we cannot say anything about the optimality of this estimator. However, observe that for the Poisson-gamma model this estimator is natural in the sense that it coincides with the MLE provided in the Poisson model (see Corollary 2.18). For more on this topic we refer to Subsection 4.2.5.
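The estimate (4.159) can be sketched as follows: compute the cumulative pattern β̂_j from the chain-ladder factors and take differences. The factors below are made-up values for illustration:

```python
def patterns_from_cl_factors(f):
    """Given CL factors f_0,...,f_{J-1}, return (beta, gamma) with
    beta_j = prod_{k=j}^{J-1} 1/f_k and gamma_j = beta_j - beta_{j-1}."""
    J = len(f)
    beta = [0.0] * (J + 1)
    beta[J] = 1.0
    for j in range(J - 1, -1, -1):
        beta[j] = beta[j + 1] / f[j]   # backward recursion through the factors
    gamma = [beta[0]] + [beta[j] - beta[j - 1] for j in range(1, J + 1)]
    return beta, gamma

beta, gamma = patterns_from_cl_factors([1.5, 1.1, 1.02])
# beta is increasing with beta_J = 1, and gamma sums to 1
```

By construction the increments gamma telescope back to beta_J = 1, matching the normalization of the claims development pattern.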
Estimation of μ_i. Usually one takes a plan value, a budget value or the value used for the premium calculation (as in the BF method).
Estimation of σ² and τ². For known γ_j and μ_i one can give unbiased estimators for these variance parameters. For the moment we omit their formulation, because in Section 4.3 we will see that the exponential dispersion model with its associate conjugates satisfies the assumptions of the Bühlmann-Straub model. Hence we can take the same estimators as in the Bühlmann-Straub model; these are provided in Subsection 4.3.1.
4.2.5
In Model Assumptions 4.26 and 4.33 we have assumed that the claims development pattern γ_j is known. Of course, in general this is not the case, and in practice one usually uses the estimate (4.159) for the claims development pattern. In Verrall [82] this is called the plug-in estimate (which leads to the CL and BF mixture).
However, in a full Bayesian approach one should also estimate this parameter in a Bayesian way (since usually it is not known). This means that we should also put an a priori distribution on the claims development pattern. For simplicity, we only treat the Poisson-gamma case (which was also considered in Verrall [82]). We have the following assumptions:
The incremental claims development pattern satisfies Σ_{j=0}^{J} γ_j = 1.
Conditionally, given Θ and γ, the X_{i,j} are independent and Poisson distributed with mean Θ_i γ_j.
Θ and γ are independent; the Θ_i are independent Gamma distributed with shape parameter a_i and scale parameter b_i, and γ is f_γ distributed.

As before, we can calculate the joint distribution of {X_{i,j}, i + j ≤ I}, Θ and γ, which is given by

    f( (x_{i,j})_{i+j≤I}, θ, γ ) = Π_{i+j≤I} e^{−θ_i γ_j} (θ_i γ_j)^{x_{i,j}} / x_{i,j}! · Π_{i=0}^{I} f_{a_i,b_i}(θ_i) f_γ(γ).      (4.160)

Hence the posterior distribution of (Θ, γ), given D_I, is proportional to

    Π_{i=0}^{I} [ f_{a^{post}_{i,I−i}, b^{post}_{i,I−i}}(θ_i) Π_{j=0}^{(I−i)∧J} γ_j^{x_{i,j}} ] f_γ(γ),      (4.161)

with

    a^{post}_{i,j} = a_i + Σ_{k=0}^{j∧J} X_{i,k}   and   b^{post}_{i,j} = b_i + Σ_{k=0}^{j∧J} γ_k,      (4.162)

see also Lemma 4.28. From this we immediately see that one cannot calculate the posterior distribution of (Θ, γ), given the observations D_I, analytically; this also implies that we cannot easily calculate the conditional distribution of X_{k,l}, k + l > I, given the observations D_I. Hence these Bayesian models can only be implemented with the help of numerical simulation, e.g. the Markov chain Monte Carlo (MCMC) approach. The implementation using a simulation-based MCMC is discussed in de Alba [2, 4] and Scollnik [70].
4.3 Bühlmann-Straub Credibility Model

In the last section we have seen an exact Bayesian approach to the claims reserving problem. The Bayesian estimator

    μ̂(Θ_i) = E[ μ(Θ_i) | X_{i,0}, ..., X_{i,I−i} ]      (4.163)
is the best estimator for μ(Θ_i) in the class of all estimators which are square integrable functions of the observations X_{i,0}, ..., X_{i,I−i}. The crucial point in the calculation was that for the EDF with its associate conjugates we were able to calculate the posterior distribution of μ(Θ_i) explicitly. Moreover, the parameters of the posterior distribution and the Bayesian estimator were linear in the observations. However, in most Bayesian models we are not in the situation where we are able to calculate the posterior distribution, and therefore the Bayesian estimator cannot be expressed in closed analytical form. That is, in general the Bayesian estimator does not meet the practical requirements of simplicity and intuitiveness and can only be calculated by numerical procedures such as Markov chain Monte Carlo (MCMC) methods.
In cases where we are not able to derive the Bayesian estimator we restrict the class of possible estimators to a smaller class, namely the linear functions of the observations X_{i,0}, ..., X_{i,I−i}. This means that we try to find the estimator which minimizes the quadratic loss function among all estimators which are linear combinations of the observations X_{i,0}, ..., X_{i,I−i}. The result is an estimator which is practicable and intuitive by definition. This approach is well known in actuarial science as credibility theory, and since "best" is again to be understood in the Bayesian sense, credibility estimators are linear Bayes estimators (see Bühlmann-Gisler [18]).
In claims reserving, credibility theory was used e.g. by De Vylder [84], Neuhaus [56] and Mack [51] in the Bühlmann-Straub context.
In the sequel we always assume that the incremental loss development pattern (γ_j)_{j=0,...,J}, given by

    γ_0 = β_0   and   γ_j = β_j − β_{j−1}   for j = 1, ..., J,      (4.164)

is known.
Model Assumptions 4.45 (Bühlmann-Straub model)
Different accident years are independent and the Θ_i are independent and identically distributed. Conditionally, given Θ_i, the increments X_{i,0}, ..., X_{i,J} are independent with

    E[ X_{i,j} | Θ_i ] = γ_j μ(Θ_i),      (4.165)
    Var( X_{i,j} | Θ_i ) = γ_j σ²(Θ_i).   (4.166)

This implies

    E[ C_{i,j} | Θ_i ] = β_j μ(Θ_i),      (4.167)
    Var( C_{i,j} | Θ_i ) = β_j σ²(Θ_i).   (4.168)
The latter equation shows that this model is different from Model 4.14: the term (1 − β_j) σ²(C_{i,J}) is replaced by σ²(Θ_i). On the other hand, the Bühlmann-Straub model is very much in the spirit of the EDF with its associate conjugates. The parameter Θ_i plays the role of the underlying risk characteristics, i.e. Θ_i is unknown and tells us whether we have a good or a bad accident year. For a more detailed explanation in the framework of tariffication and pricing we refer to Bühlmann-Gisler [18].
In linear credibility theory one looks for an estimate of μ(Θ_i) which minimizes the quadratic loss function among all estimators which are linear in the observations X_{i,j} (see also [18], Definition 3.8). That is, one has to solve the optimization problem

    μ̂(Θ_i)^{cred} = argmin_{μ̃ ∈ L(X,1)} E[ ( μ(Θ_i) − μ̃ )² ],      (4.169)

where

    L(X,1) = { μ̃ ; μ̃ = a_{i,0} + Σ_{i=0}^{I} Σ_{j=0}^{(I−i)∧J} a_{i,j} X_{i,j} with a_{i,j} ∈ R }.      (4.170)
Remarks 4.46
Observe that the credibility estimator μ̂(Θ_i)^{cred} is linear in the observations X_{i,j} by definition. We could also allow for general real-valued, square integrable functions of the observations X_{i,j}. In that case we simply obtain the Bayesian estimator, since the conditional posterior expectation minimizes the quadratic loss function among all estimators which are square integrable functions of the observations.
Credibility estimators can also be constructed using Hilbert space theory. Indeed, (4.169) asks for a minimization in an L²-sense, which corresponds to orthogonal projections in Hilbert spaces. For more on this topic we refer to Bühlmann-Gisler [18].
We define the structural parameters

    μ_0 = E[ μ(Θ_i) ],      (4.171)
    σ² = E[ σ²(Θ_i) ],      (4.172)
    τ² = Var( μ(Θ_i) ).     (4.173)
Theorem 4.47 (inhomogeneous Bühlmann-Straub estimator)
Under Model Assumptions 4.45 the optimal linear inhomogeneous estimator of μ(Θ_i), given the observations D_I, is given by

    μ̂(Θ_i)^{cred} = α_i Y_i + (1 − α_i) μ_0      (4.174)

for I − J + 1 ≤ i ≤ I, where

    α_i = β_{I−i} / ( β_{I−i} + σ²/τ² ),      (4.175)

    Y_i = Σ_{j=0}^{(I−i)∧J} (γ_j/β_{I−i}) (X_{i,j}/γ_j) = C_{i,(I−i)∧J} / β_{I−i}.      (4.176)
In credibility theory the a priori mean μ_0 can also be estimated from the data. This leads to the homogeneous credibility estimator.
Theorem 4.48 (homogeneous Bühlmann-Straub estimator)
Under Model Assumptions 4.45 the optimal linear homogeneous estimator of μ(Θ_i), given the observations D_I, is given by

    μ̂(Θ_i)^{hom} = α_i Y_i + (1 − α_i) μ̂_0      (4.177)

with

    μ̂_0 = Σ_{i=0}^{I} (α_i/α_•) Y_i,   where   α_• = Σ_{i=0}^{I} α_i.      (4.178)

Proof of Theorem 4.47 and Theorem 4.48. We refer to Theorems 4.2 and 4.4 in Bühlmann-Gisler [18]. □
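The two estimators (4.174) and (4.177)-(4.178) can be sketched directly; the inputs below are hypothetical, with Y_i standing for C_{i,(I-i)^J}/beta_{I-i} as in (4.176):

```python
def bs_credibility(Y, beta, sigma2, tau2, mu0=None):
    """Buhlmann-Straub credibility estimates alpha_i*Y_i + (1-alpha_i)*m.

    m is mu0 (inhomogeneous case, Theorem 4.47) or, if mu0 is None, the
    credibility-weighted mean of the Y_i (homogeneous case, Theorem 4.48).
    Returns (estimates, m).
    """
    alpha = [b / (b + sigma2 / tau2) for b in beta]   # weights (4.175)
    if mu0 is None:
        a_tot = sum(alpha)                            # alpha_bullet in (4.178)
        mu0 = sum(a * y for a, y in zip(alpha, Y)) / a_tot
    return [a * y + (1 - a) * mu0 for a, y in zip(alpha, Y)], mu0
```

With sigma2/tau2 large (strong within-year noise) all estimates shrink towards the a priori mean; with sigma2/tau2 small they stay close to the individual observations Y_i.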
Remarks 4.49
If the a priori mean μ_0 is known, we choose the inhomogeneous credibility estimator μ̂(Θ_i)^{cred} from Theorem 4.47. This estimator minimizes the quadratic loss function given in (4.169) among all estimators given in (4.170).
If the a priori mean μ_0 is unknown, we estimate its value from the data as well. This is done by switching to the homogeneous credibility estimator μ̂(Θ_i)^{hom} given in Theorem 4.48. The crucial part is that we have to slightly change the set of possible estimators given in (4.170) towards

    L_e(X) = { μ̃ ; μ̃ = Σ_{i=0}^{I} Σ_{j=0}^{(I−i)∧J} a_{i,j} X_{i,j} with a_{i,j} ∈ R, E[μ̃] = μ_0 }.      (4.179)

The crucial point in the credibility estimators (4.174) and (4.177) is that we take a weighted average between the individual observations Y_i of accident year i and the a priori mean μ_0 or its estimator μ̂_0, respectively. Observe that the weighted average Y_i only depends on the observations of accident year i. This is a consequence of the independence assumption between the accident years. However, the estimator μ̂_0 uses the observations of all accident years, since the a priori mean μ_0 holds for all accident years. The credibility weight α_i ∈ [0, 1] for the individual average Y_i becomes small when the expected fluctuations within the accident years, σ², are large, and becomes large if the fluctuations between the accident years, τ², are large.
The estimator (4.174) is exactly the same as the one from the exponential dispersion model with associate conjugates (Corollary 4.42) if we assume that all a priori means μ_i are equal.
Since the inhomogeneous estimator μ̂(Θ_i)^{cred} contains a constant, it is automatically an unbiased estimator for the a priori mean μ_0. In contrast to μ̂(Θ_i)^{cred}, the homogeneous estimator μ̂(Θ_i)^{hom} is unbiased for μ_0 by definition.
The weights γ_j in the model assumptions could be replaced by weights γ_{i,j}; then the Bühlmann-Straub result still holds true. Indeed, one could choose a design matrix γ_{i,j} = γ_i(j) to apply the Bühlmann-Straub model (see Taylor [75] and Mack [47]); the variance condition is then replaced by

    Var( X_{i,j}/γ_{i,j} | Θ_i ) = σ²(Θ_i) / ( V_i γ_{i,j} ).      (4.181)
In the generalized Bühlmann-Straub model with a priori differences μ_i (assumptions (4.182)-(4.183)) the inhomogeneous and homogeneous credibility estimators are given by

    μ̂(Θ_i)^{cred} = α_i Y_i + (1 − α_i) · 1      (4.184)

and

    μ̂(Θ_i)^{hom} = α_i Y_i + (1 − α_i) μ̂_0,      (4.185)

respectively, where

    Y_i = C_{i,(I−i)∧J} / (μ_i β_{I−i}),   α_i = β_{I−i} / ( β_{I−i} + κ_i )   with   κ_i = σ² / (μ_i τ²).      (4.186)

Observe that this now gives exactly the same estimator as in the exponential dispersion family with its associate conjugates (see Corollary 4.42).
This immediately gives the following estimators:
Estimator 4.50 (Bühlmann-Straub credibility reserving estimator)
In the Bühlmann-Straub Model 4.45 with generalized assumptions (4.182)-(4.183) we have the following estimators:

    Ĉ^{cred}_{i,J} = Ê[ C_{i,J} | D_I ] = C_{i,I−i} + (1 − β_{I−i}) μ_i μ̂(Θ_i)^{cred},      (4.187)

    Ĉ^{hom}_{i,J} = Ê[ C_{i,J} | D_I ] = C_{i,I−i} + (1 − β_{I−i}) μ_i μ̂(Θ_i)^{hom}      (4.188)

for I − J + 1 ≤ i ≤ I.
Lemma 4.51 In the Bühlmann-Straub Model 4.45 the quadratic losses of the credibility estimators are given by

    E[ ( μ̂(Θ_i)^{cred} − μ(Θ_i) )² ] = τ² (1 − α_i),      (4.189)

    E[ ( μ̂(Θ_i)^{hom} − μ(Θ_i) )² ] = τ² (1 − α_i) ( 1 + (1 − α_i)/α_• )      (4.190)

for I − J + 1 ≤ i ≤ I.
Proof. We refer to Theorems 4.3 and 4.6 in Bühlmann-Gisler [18]. □
Corollary 4.52 In the Bühlmann-Straub Model 4.45 with generalized assumptions (4.182)-(4.183) the mean square errors of prediction of the inhomogeneous and homogeneous credibility reserving estimators are given by

    msep_{C_{i,J}}( Ĉ^{cred}_{i,J} ) = μ_i² [ (1 − β_{I−i}) σ²/μ_i + (1 − β_{I−i})² τ² (1 − α_i) ]      (4.191)

and

    msep_{C_{i,J}}( Ĉ^{hom}_{i,J} ) = msep_{C_{i,J}}( Ĉ^{cred}_{i,J} ) + μ_i² (1 − β_{I−i})² τ² (1 − α_i)²/α_•,      (4.192)

respectively, for I − J + 1 ≤ i ≤ I.
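The MSEP formulas (4.191)-(4.192) translate into a short helper; all parameter values used below are illustrative assumptions:

```python
def msep_cred(mu, beta, sigma2, tau2, alpha, alpha_tot=None):
    """MSEP of the credibility reserving estimators.

    Returns the inhomogeneous value (4.191); if alpha_tot (the sum of
    the credibility weights) is given, returns a pair with the
    homogeneous value (4.192) as well.
    """
    inhom = mu**2 * ((1 - beta) * sigma2 / mu
                     + (1 - beta)**2 * tau2 * (1 - alpha))
    if alpha_tot is None:
        return inhom
    hom = inhom + mu**2 * (1 - beta)**2 * tau2 * (1 - alpha)**2 / alpha_tot
    return inhom, hom
```

The first summand is the process error and the second the parameter/prediction error, in line with Remarks 4.53 below; the homogeneous estimator always carries the larger MSEP.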
Remarks 4.53
The first term on the right-hand side of the above equalities stands again for the process error, whereas the second term stands for the parameter/prediction error (how well can an actuary predict the mean). Observe again that we assume that the incremental loss development pattern (γ_j)_{j=0,...,J} is known, and hence we do not estimate the estimation error in the claims development pattern.
Observe that the MSEP formula for the credibility estimator coincides with the one for the exponential dispersion family, see (4.156).
Proof. We separate the mean square error of prediction as follows:

    msep_{C_{i,J}}( Ĉ^{cred}_{i,J} ) = E[ ( (1 − β_{I−i}) μ_i μ̂(Θ_i)^{cred} − (C_{i,J} − C_{i,I−i}) )² ].      (4.193)

Conditionally, given Θ = (Θ_0, ..., Θ_I), the increments X_{i,j} are independent. But this immediately implies that the expression in (4.193) is equal to

    E[ (1 − β_{I−i})² μ_i² ( μ̂(Θ_i)^{cred} − μ(Θ_i) )² ]
        + E[ ( (1 − β_{I−i}) μ_i μ(Θ_i) − (C_{i,J} − C_{i,I−i}) )² ]      (4.194)
    = (1 − β_{I−i})² μ_i² msep_{μ(Θ_i)}( μ̂(Θ_i)^{cred} ) + E[ Var( C_{i,J} − C_{i,I−i} | Θ ) ].

But then the claim follows from Lemma 4.51 and

    Var( C_{i,J} − C_{i,I−i} | Θ ) = (1 − β_{I−i}) μ_i σ²(Θ_i).      (4.195)

□
4.3.1 Parameter estimation
So far (in the example) the choice of the variance parameters was rather artificial. In this subsection we provide estimators for σ² and τ². In practical applications it is often convenient to eliminate outliers for the estimation of σ² and τ², since these estimators are often not very robust.
Before we start with the parameter estimation we would like to mention that essentially the same remarks apply in this section as the ones mentioned on page 125.
We need to estimate γ_j, σ² and τ². For the weights γ_j we proceed as in (4.159): estimate the claims development pattern β_j from (2.25); the incremental loss development pattern γ_j is then estimated by (4.164).
We define

    S_i = 1/((I−i)∧J) · Σ_{j=0}^{(I−i)∧J} γ_j ( X_{i,j}/γ_j − Y_i )².      (4.196)

Then S_i is an unbiased estimator for σ² (see [18], (4.22)). Hence σ² is estimated by the following unbiased estimator:

    σ̂² = (1/I) Σ_{i=0}^{I−1} S_i.      (4.197)

For τ² we use

    τ̂² = c { T − I σ̂² / Σ_i β_{I−i} },      (4.198)

where

    T = Σ_{i=0}^{I} ( β_{I−i} / Σ_i β_{I−i} ) ( Y_i − Ȳ )²,      (4.199)

    Ȳ = Σ_i β_{I−i} Y_i / Σ_i β_{I−i} = Σ_i C_{i,(I−i)∧J} / Σ_i β_{I−i},      (4.200)

    c = { Σ_{i=0}^{I} ( β_{I−i} / Σ_i β_{I−i} ) ( 1 − β_{I−i} / Σ_i β_{I−i} ) }^{−1}.      (4.201)
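The estimators (4.196)-(4.201) can be sketched in code. The tiny triangle below is fabricated data and gamma is an assumed incremental pattern; the truncation of tau-hat squared at zero is a common practical convention, not part of the formulas above:

```python
def estimate_sigma2_tau2(X, gamma):
    """X[i] = observed increments X_{i,0},...,X_{i,(I-i)^J}; gamma = pattern."""
    I = len(X) - 1
    betas, Ys, Ss = [], [], []
    for row in X:
        n = len(row)
        beta = sum(gamma[:n])                 # beta_{I-i}
        Y = sum(row) / beta                   # Y_i as in (4.176)
        betas.append(beta)
        Ys.append(Y)
        if n > 1:                             # S_i needs at least two cells
            Ss.append(sum(g * (x / g - Y) ** 2
                          for g, x in zip(gamma[:n], row)) / (n - 1))
    sigma2 = sum(Ss) / len(Ss)                # cf. (4.197)
    b_tot = sum(betas)
    Ybar = sum(b * y for b, y in zip(betas, Ys)) / b_tot       # (4.200)
    T = sum((b / b_tot) * (y - Ybar) ** 2 for b, y in zip(betas, Ys))
    c = 1.0 / sum((b / b_tot) * (1.0 - b / b_tot) for b in betas)
    tau2 = c * (T - I * sigma2 / b_tot)       # (4.198)
    return sigma2, max(tau2, 0.0)             # truncate at 0 if negative
```

If the between-year spread of the Y_i is small relative to the within-year noise, the raw tau-hat squared can become negative, which the truncation handles.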
    i\j        0         1         2         3         4         5         6         7         8         9
    0    5946975   9668212  10563929  10771690  10978394  11040518  11106331  11121181  11132310  11148124
    1    6346756   9593162  10316383  10468180  10536004  10572608  10625360  10636546  10648192
    2    6269090   9245313  10092366  10355134  10507837  10573282  10626827  10635751
    3    5863015   8546239   9268771   9459424   9592399   9680740   9724068
    4    5778885   8524114   9178009   9451404   9681692   9786916
    5    6184793   9013132   9585897   9830796   9935753
    6    5600184   8493391   9056505   9282022
    7    5288066   7728169   8256211
    8    5290793   7648729
    9    5675568
    f̂_j   1.4925    1.0778    1.0229    1.0148    1.0070    1.0051    1.0011    1.0010    1.0014

Table 4.13: Observed historical cumulative payments C_{i,j} and estimated chain-ladder factors f̂_j, see Table 2.2

    i\j        0         1         2         3         4         5         6         7         8         9
    0   10086719  12814544  13090078   9577303  14357308   9048371  12901245  13793367  10658637  11148124
    1   10764791  11179404  10569215   6997504   4710946   5331290  10340861  10390677  11152804
    2   10633061  10248997  12378890  12113052  10606498   9531934  10496344   8289406
    3    9944313   9240019  10559131   8788679   9236259  12866767   8493606
    4    9801620   9453540   9556060  12602942  15995382  15325912
    5   10490085   9739738   8370426  11289335   7290128
    6    9498524   9963120   8229388  10395847
    7    8969136   8402801   7716853
    8    8973762   8119848
    9    9626383
    γ̂_j    59.0%     29.0%      6.8%      2.2%      1.4%      0.7%      0.5%      0.1%      0.1%      0.1%

Table 4.14: Observed scaled incremental payments X_{i,j}/γ̂_j and estimated incremental claims development pattern γ̂_j
For the data of Tables 4.13 and 4.14 this gives the estimates

    Ȳ = 9'911'975,      (4.202)
    σ̂ = 337'289,        (4.203)
    τ̂ = 734'887,        (4.204)
    μ̂_0 = 9'885'584.    (4.205)

This leads to σ̂²/τ̂² = 21.1%, to V̂co(μ(Θ_i)) = τ̂/μ̂_0 = 7.4% and to the estimated coefficient of variation V̂co(C_{i,J}) = (σ̂² + τ̂²)^{1/2}/μ̂_0.      (4.206)
This gives the following estimated reserves:

    i       α_i     Ĉ^{cred}_{i,J}   CL reserves   hom. cred. reserves
    0     82.6%       11148124               0               0
    1     82.6%       10663125           15126           14934
    2     82.6%       10661675           26257           25924
    3     82.5%        9758685           34538           34616
    4     82.5%        9872238           85302           85322
    5     82.4%       10091682          156494          155929
    6     82.2%        9569836          286121          287814
    7     81.8%        8716445          449167          460234
    8     80.7%        8719642         1043242         1070913
    9     73.7%        9654386         3950815         3978818
    total                              6047061         6114503

Table 4.15: Credibility weights α_i, credibility estimators and estimated reserves.
    i      msep^{1/2}(Ĉ^{cred}_{i,J})   msep^{1/2}(Ĉ^{hom}_{i,J})
    0                  0                          0
    1              12711                      12711
    2              16755                      16755
    3              20095                      20096
    4              31465                      31467
    5              42272                      42278
    6              59060                      59076
    7              78301                      78339
    8             123114                     123259
    9             265775                     267229
    total         314699                     315998

Table 4.16: Square roots of the MSEP of the inhomogeneous and homogeneous credibility reserving estimators.
           credibility weights α_i         CL reserves        credibility reserves
    i        0        1        2                                 0          1          2
    0     80.2%    80.6%    81.1%               0                0          0          0
    1     80.1%    80.2%    80.3%           15126            14943      14944      14944
    2     80.1%    79.6%    79.1%           26257            25766      25753      25740
    3     80.1%    79.1%    78.0%           34538            34253      34238      34222
    4     80.0%    79.7%    79.3%           85302            85056      85051      85046
    5     79.9%    80.2%    80.4%          156494           156562     156561     156559
    6     79.7%    79.8%    80.0%          286121           289078     289056     289035
    7     79.3%    79.0%    78.8%          449167           460871     461021     461180
    8     78.1%    77.6%    77.0%         1043242          1069227    1069815    1070427
    9     70.4%    71.0%    71.5%         3950815          4024687    4023270    4021903
    total                                 6047061          6160443    6159709    6159056
    μ̂_0   0.8810   0.8809   0.8809

Table 4.17: Credibility weights α_i, estimate μ̂_0 and credibility reserves for three different parameter choices (denoted 0, 1 and 2).
This describes the accuracy of the estimate of the true expected mean by the actuary; observe that we have chosen 5% in Example 4.54.
Moreover, we see (once more) that the a priori estimates μ_i seem to be rather pessimistic, since μ̂_0 is substantially smaller than 1 (for all parameter choices).
For the mean square error of prediction we obtain the values in Table 4.18.
4.4
In Section 4.3 we have assumed that the incremental payments have the following form:

    E[ X_{i,j} | Θ_i ] = γ_j μ(Θ_i).      (4.208)
    i      msep^{1/2}(Ĉ^{hom}_{i,J}):  0          1          2
    1                             12835      12771      12711
    2                             16317      16532      16755
    3                             18952      19511      20094
    4                             30871      31161      31464
    5                             43110      42682      42272
    6                             59876      59456      59059
    7                             77383      77819      78282
    8                            120119     121536     123008
    9                            273931     269926     266054
    total                        320377     317540     314889

Table 4.18: Square roots of the MSEP of the homogeneous credibility reserving estimator for the three parameter choices (0, 1 and 2).
The constant γ_j denotes the payment ratio in period j. If we rewrite this in vector form we obtain

    E[ X_i | Θ_i ] = γ μ(Θ_i),      (4.209)

where X_i = (X_{i,0}, ..., X_{i,J})′ and γ = (γ_0, ..., γ_J)′.
4.4.1
In the multidimensional credibility framework the scalar μ(Θ_i) is replaced by a vector μ(Θ_i) = (μ_0(Θ_i), ..., μ_{p−1}(Θ_i))′. The structural parameters are the covariance matrix

    T = Cov( μ(Θ_i), μ(Θ_i) )      (4.214)

and

    S_i = (S_{j,k,i})_{j,k=0,...,J},      (4.215)

the mean of the conditional covariance matrix of the observations (see (4.216)). The credibility matrix and the compressed observations are given by

    A_i = T ( T + ( Γ_i^{[I−i]′} S_i^{−1} Γ_i^{[I−i]} )^{−1} )^{−1},      (4.218)

    B_i = ( Γ_i^{[I−i]′} S_i^{−1} Γ_i^{[I−i]} )^{−1} Γ_i^{[I−i]′} S_i^{−1} X_i,      (4.219)

where the design matrix is truncated at the last observed development period,

    Γ_i^{[I−i]} = ( γ_0(i), ..., γ_{(I−i)∧J}(i), 0, ..., 0 )′.      (4.220)
The credibility reserving estimator is then given by

    Ĉ^{cred}_{i,J} = C_{i,I−i} + Σ_{j=I−i+1}^{J} γ_j(i)′ μ̂(Θ_i)^{cred}      (4.225)

for I − J + 1 ≤ i ≤ I with p ≤ I − i + 1.
Remarks 4.61
If μ is not known, then (4.217) can be replaced by the homogeneous credibility estimator for μ(Θ_i), using

    μ̂ = ( Σ_{i=0}^{I} A_i )^{−1} Σ_{i=0}^{I} A_i B_i.      (4.226)

The corresponding quadratic loss is given by

    ( 1 − A_i ) T ( 1 + ( Σ_{i=0}^{I} A′_i )^{−1} ( 1 − A′_i ) ).      (4.227)

Term (4.219) gives the formula for the data compression (see also Theorem 8.6 in Bühlmann-Gisler [18]). We already see from this that for p > 1 we have some difficulties with the youngest accident years, since the dimension of μ(Θ_i) is larger than the available number of observations if p > I − i + 1. Observe that

    E[ B_i | Θ_i ] = μ(Θ_i),      (4.228)

    E[ ( B_i − μ(Θ_i) ) ( B_i − μ(Θ_i) )′ ] = ( Γ_i^{[I−i]′} S_i^{−1} Γ_i^{[I−i]} )^{−1}.      (4.229)
Parameter estimation. It is rather difficult to get good parameter estimates in this model for p > 1. If we assume that the covariance matrix (Σ_{j,k,i}(Θ_i))_{j,k=0,...,J} is diagonal with mean S_i given by (4.216), we can estimate S_i with the help of the one-dimensional Bühlmann-Straub model (see Subsection 4.3.1). An unbiased estimator for the covariance matrix T is given by

    T̂ = 1/(I−p) Σ_{i=0}^{I−p} ( B_i − B̄ )( B_i − B̄ )′ − 1/(I−p+1) Σ_{i=0}^{I−p} ( Γ_i^{[I−i]′} S_i^{−1} Γ_i^{[I−i]} )^{−1},      (4.231)

with

    B̄ = 1/(I−p+1) Σ_{i=0}^{I−p} B_i.      (4.232)
4.4.2
In the Bühlmann-Straub credibility model we had a deterministic cashflow pattern γ_j and we estimated the exposure μ(Θ_i) of the accident years. We could also exchange the roles of these two parameters:
Model Assumptions 4.62
There exist scalars μ_i (i = 0, ..., I) such that the incremental payments satisfy assumptions (4.233)-(4.236) with a random cashflow pattern (γ_j(Θ_i))_{j=0,...,J}.
The difficulty in this model is that we have observations X_{i,0}, ..., X_{i,I−i} for γ_0(Θ_i), ..., γ_{I−i}(Θ_i), and we need to estimate γ_{I−i+1}(Θ_i), ..., γ_J(Θ_i). This is slightly different from classical one-dimensional credibility applications. From this it is clear that a crucial role is played by the covariance structure, which projects past observations onto the future.
For general covariance structures it is difficult to give nice formulas. Special cases were studied by Jewell [38] and Hesselager-Witting [35]. Hesselager-Witting [35] assume that the vectors

    ( γ_0(Θ_i), ..., γ_J(Θ_i) )      (4.237)

are i.i.d. Dirichlet distributed with parameters a_0, ..., a_J. Define a = Σ_{j=0}^{J} a_j; then we have (see Hesselager-Witting [35], formula (3))

    E[ γ_j(Θ_i) ] = γ_j = a_j / a,      (4.238)

    Cov( γ_j(Θ_i), γ_k(Θ_i) ) = T_{j,k} = 1/(1+a) ( 1_{{j=k}} γ_j − γ_j γ_k ).      (4.239)

If we then choose a specific form for the covariance structure Σ_{j,k,i}(Θ_i), we can work out a credibility formula for the expected ultimate claim.
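The Dirichlet moments (4.238)-(4.239) are easy to compute; the parameter vector below is an assumed example:

```python
def dirichlet_moments(a_vec):
    """Mean gamma_j = a_j/a and covariance T_{j,k} as in (4.238)-(4.239)."""
    a = float(sum(a_vec))
    mean = [aj / a for aj in a_vec]
    cov = [[((mean[j] if j == k else 0.0) - mean[j] * mean[k]) / (1.0 + a)
            for k in range(len(a_vec))] for j in range(len(a_vec))]
    return mean, cov

mean, cov = dirichlet_moments([2.0, 3.0, 5.0])
# every row of cov sums to 0, reflecting that the gamma_j(Theta_i) sum to 1
```

The vanishing row sums are a direct consequence of the constraint that the random pattern sums to one, so any increase in one component must be offset elsewhere.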
Of course, there is a large variety of other credibility models, e.g. hierarchical credibility models (Hesselager [36]). We do not discuss them further here.
4.5 Kalman filter
Kalman filters are an enhancement of credibility models. We treat only the one-dimensional case, since already in the multivariate credibility context we have seen that it becomes difficult to go to higher dimensions.
Kalman filters are evolutionary credibility models. If we take e.g. the Bühlmann-Straub model, then it is assumed that the Θ_i (i = 0, ..., I) are independent and identically distributed (see Model 4.45). If we go back to Example 4.54, we obtain the following picture for the observations Y_0, ..., Y_I and the estimate μ̂_0 for the a priori mean μ_0 (cf. (4.176) and (4.178), respectively):

[Figure: observations Y_i and the estimate μ̂_0, plotted against accident year i.]
In particular, Model Assumptions 4.64 imply that ( μ(Θ_i) )_{i≥0} is a martingale.
Remarks 4.65
The assumption (4.241) can be relaxed in the sense that we only need conditional uncorrelatedness on average (over Θ). Assumption (4.241) implies that we obtain an updating procedure which is recursive.
The martingale assumption implies that we have uncorrelated centered increments μ(Θ_{i+1}) − μ(Θ_i) (see also (1.25)), i.e.

    E[ μ(Θ_{i+1}) | μ(Θ_0), ..., μ(Θ_i) ] = μ(Θ_i).      (4.242)

In Hilbert space language this reads as follows: the projection of μ(Θ_{i+1}) onto the subspace of all square integrable functions of μ(Θ_0), ..., μ(Θ_i) is simply μ(Θ_i), i.e. the process ( μ(Θ_i) )_{i≥0} has centered orthogonal increments. This last assumption could be generalized to linear transformations (see Corollary 9.5 in Bühlmann-Gisler [18]).
We introduce the following notation (motivated by the usual terminology of state space models, see e.g. Abraham-Ledolter [1]):

    Y_i = ( X_{i,0}/γ_0, ..., X_{i,I−i}/γ_{I−i} ),      (4.243)

    μ_{i|i−1} = argmin_{μ̃ ∈ L(Y_0,...,Y_{i−1},1)} E[ ( μ(Θ_i) − μ̃ )² ],      (4.244)

    μ_{i|i} = argmin_{μ̃ ∈ L(Y_0,...,Y_i,1)} E[ ( μ(Θ_i) − μ̃ )² ]      (4.245)

(cf. (4.170)). μ_{i|i−1} is the best linear forecast for μ(Θ_i) based on the information Y_0, ..., Y_{i−1}, whereas μ_{i|i} is the best linear forecast for μ(Θ_i) which is also based on Y_i. Hence there are two updating procedures: 1) updating from μ_{i|i−1} to μ_{i|i} on the basis of the newest observation Y_i, and 2) updating from μ_{i|i} to μ_{i+1|i} due to the parameter movement from μ(Θ_i) to μ(Θ_{i+1}).
We define the following structural parameters:

    σ² = E[ σ²(Θ_i) ],      (4.246)

    ς_i² = Var( μ(Θ_i) − μ(Θ_{i−1}) ),      (4.247)

    q_{i|i−1} = E[ ( μ_{i|i−1} − μ(Θ_i) )² ],      (4.248)

    q_{i|i} = E[ ( μ_{i|i} − μ(Θ_i) )² ].      (4.249)
Theorem 4.66 (Kalman filter recursion formula, Theorem 9.6 in [18])
Under Model Assumptions 4.64 we have:
1. Anchoring (i = 0):

    μ_{0|−1} = μ_0 = E[ μ(Θ_0) ]   and   q_{0|−1} = ς_0² = Var( μ(Θ_0) ).      (4.250)

2. Recursion (i ≥ 0):
(a) Observation update:

    μ_{i|i} = α_i Y_i + (1 − α_i) μ_{i|i−1},      (4.251)

    q_{i|i} = (1 − α_i) q_{i|i−1},      (4.252)

with

    α_i = β_{I−i} / ( β_{I−i} + σ²/q_{i|i−1} ),      (4.254)

    Y_i = Σ_{j=0}^{(I−i)∧J} (γ_j/β_{I−i}) (X_{i,j}/γ_j) = C_{i,(I−i)∧J} / β_{I−i}.      (4.255)

(b) Parameter update:

    μ_{i+1|i} = μ_{i|i}   and   q_{i+1|i} = q_{i|i} + ς_{i+1}².      (4.256)

The corresponding reserving estimator is given by

    Ĉ^{Ka}_{i,J} = C_{i,I−i} + (1 − β_{I−i}) μ_{i|i}      (4.257)

for I − J + 1 ≤ i ≤ I.
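The recursion of Theorem 4.66 can be sketched directly. The observations and parameters below are made up, and beta[i] plays the role of β_{I−i}:

```python
def kalman_filter(Y, beta, sigma2, mu0, q0, rho2):
    """Run the recursion (4.251)-(4.256) with constant innovation
    variance rho2 (the special case varsigma_i^2 = rho^2)."""
    mu_pred, q_pred = mu0, q0                         # anchoring (4.250)
    mus = []
    for y, b in zip(Y, beta):
        alpha = b / (b + sigma2 / q_pred)             # credibility weight
        mu_filt = alpha * y + (1 - alpha) * mu_pred   # observation update
        q_filt = (1 - alpha) * q_pred
        mus.append(mu_filt)
        mu_pred, q_pred = mu_filt, q_filt + rho2      # parameter update
    return mus
```

With rho2 = 0 the recursion collapses to an ordinary credibility update in which each new observation pulls the forecast further towards the data.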
Remarks 4.68
In practice we face two difficulties: 1) we need to estimate all the parameters; 2) we need good estimates for the starting values μ_0 and ς_0² of the iteration.
Parameter estimation: For the estimation of σ² we choose σ̂² as in the Bühlmann-Straub model (see (4.197)). The estimation of ς_i² is less straightforward; in fact we need to consider a special case of Model Assumptions 4.64:

    μ(Θ_i) = μ(Θ_{i−1}) + ε_i      (4.258)

for all i ≥ 1, where the ε_i are i.i.d. with mean 0 and variance ρ², and μ(Θ_0) and ε_i are independent for all i ≥ 1.
Remark. In this model we have ς_i² = Var( μ(Θ_i) − μ(Θ_{i−1}) ) = Var(ε_i) = ρ².
Let us first calculate the variances and covariances of the Y_i defined in (4.255):

    Var(Y_i) = Var( E[Y_i | Θ] ) + E[ Var(Y_i | Θ) ]
             = Var( μ(Θ_i) ) + E[ Σ_{j=0}^{(I−i)∧J} (γ_j²/β_{I−i}²) Var( X_{i,j}/γ_j | Θ ) ]
             = Var( μ(Θ_0) ) + i ρ² + σ²/β_{I−i},      (4.259)

and for k ≠ l

    Cov(Y_k, Y_l) = Var( μ(Θ_0) ) + min{k, l} ρ².      (4.260)

We define Ȳ as in (4.199) with β_• = Σ_{i=0}^{I} β_{I−i}. Hence

    Σ_{i=0}^{I} (β_{I−i}/β_•) E[ ( Y_i − Ȳ )² ]
        = Σ_{i=0}^{I} (β_{I−i}/β_•) Var(Y_i) − Σ_{k,l=0}^{I} (β_{I−k} β_{I−l}/β_•²) Cov(Y_k, Y_l)
        = (I+1) σ²/β_• + ρ² ( Σ_{i=0}^{I} (β_{I−i}/β_•) i − Σ_{i,k=0}^{I} (β_{I−i} β_{I−k}/β_•²) min{i, k} )
        = (I+1) σ²/β_• + ρ² Σ_{i=0}^{I} Σ_{k=0}^{i−1} (i − k) β_{I−i} β_{I−k}/β_•².      (4.261)
This motivates the following unbiased estimator for ρ² (see also (4.198)):

    ρ̂² = c_ρ ( Σ_{i=0}^{I} (β_{I−i}/β_•) ( Y_i − Ȳ )² − (I+1) σ̂²/β_• ),      (4.262)

with

    c_ρ = ( Σ_{i,k=0}^{I} max{i − k, 0} β_{I−i} β_{I−k}/β_•² )^{−1}.      (4.263)
In Example 4.54 we obtain the estimates

    Ȳ = 9'911'975,      (4.264)
    σ̂ = 337'289,        (4.265)
    ρ̂ = 545'637,        (4.266)

and as starting values for the iteration we choose

    μ̂_0 = 9'885'584   and   q̂_{0|−1}^{1/2} = ρ̂ = 545'637.      (4.268)
    i    μ_{i|i−1}   q_{i|i−1}^{1/2}   α_i      Y_i        μ_{i|i}    q_{i|i}^{1/2}   μ_{i+1|i}   q_{i+1|i}^{1/2}
    0     9885584      545637          72.4%   11148123   10799066     286899         10799066      616466
    1    10799066      616466          76.9%   10663316   10694625     296057         10694625      620781
    2    10694625      620781          77.2%   10662005   10669454     296651         10669454      621064
    3    10669454      621064          77.2%    9758602    9966628     296805          9966628      621138
    4     9966628      621138          77.1%    9872213    9893857     297401          9893857      621423
    5     9893857      621423          77.0%   10092241   10046550     298230         10046550      621820
    6    10046550      621820          76.7%    9568136    9679468     299967          9679468      622655
    7     9679468      622655          76.4%    8705370    8935539     302670          8935539      623962
    8     8935539      623962          75.1%    8691961    8752681     311533          8752681      628309
    9     8752681      628309          67.2%    9626366    9339528     360009          9339528      653702

Table 4.19: Kalman filter iteration for Example 4.54.
[Figure: observations Y_i, the estimate μ̂_0 and the Kalman filter forecasts μ_{i|i−1}, plotted against accident year i.]
This gives the following estimated reserves:

    i      Ĉ^{Ka}_{i,J}   CL reserves   hom. cred.   Kalman
    0       11148123              0             0         0
    1       10663360          15126         14934     15170
    2       10662023          26257         25924     26275
    3        9759339          34538         34616     35274
    4        9872400          85302         85322     85489
    5       10091532         156494        155929    155785
    6        9571465         286121        287814    289450
    7        8717246         449167        460234    461042
    8        8699249        1043242       1070913   1050529
    9        9508643        3950815       3978818   3833085
    total                   6047061       6114503   5952100

Table 4.20: Estimated reserves from the chain-ladder method, the homogeneous credibility estimator and the Kalman filter.
Chapter 5
Outlook
Several topics on stochastic claims reserving methods still need to be added to the current version of this manuscript, e.g.:
explicit distributional models and methods, such as the log-normal model or Tweedie's compound Poisson model;
generalized linear model methods;
bootstrapping methods;
multivariate methods;
the Munich chain-ladder method;
etc.
Appendix A
Unallocated loss adjustment expenses
A.1 Motivation
In this section we describe the New York-method for the estimation of unallocated loss adjustment expenses (ULAE). The New York-method for estimating ULAE is, unfortunately, only poorly documented in the literature (e.g. as footnotes in Feldblum [26] and Foundation CAS [19]).
In non-life insurance there are usually two different kinds of claims handling costs, external ones and internal ones. External costs, like costs for external lawyers or for an external expertise, are usually allocated to single claims and are therefore contained in the usual claims payments and loss development figures. These payments are called allocated loss adjustment expenses (ALAE). Typically, internal loss adjustment expenses (income of the claims handling department, maintenance of the claims handling system, etc.) are not contained in the claims figures and therefore have to be estimated separately. These internal costs can usually not be allocated to single claims; we call these costs unallocated loss adjustment expenses (ULAE). From a regulatory point of view, we should also build reserves for these costs/expenses, because they are part of the claims handling process which guarantees that an insurance company is able to meet all its obligations. That is, ULAE reserves should guarantee the smooth run-off of the old insurance liabilities without pay-as-you-go financing from new business/premium for the internal claims handling processes.
A.2
Usually, claims development figures consist only of pure claims payments, not containing ULAE charges. They are usually studied in loss development triangles or trapezoids as above (see Section 1.3). The pure cumulative payments are given by

    C^{(pure)}_{i,j} = Σ_{k=0}^{j} X^{(pure)}_{i,k}.      (A.1)

For the New York-method we also need a second type of development trapezoid, namely a reporting trapezoid: for accident year i, Z^{(pure)}_{i,j} denotes the pure cumulative ultimate claim amount of all those claims which are reported up to (and including) development year j. Hence Z^{(pure)}_{i,0}, Z^{(pure)}_{i,1}, ... describes how the ultimate claim is reported over time, with Z^{(pure)}_{i,J} = C^{(pure)}_{i,J}.
A.3
153
ULAE charges
The cumulative ULAE payments for accident year i until development period j are
(U LAE)
denoted by Ci,j
. And finally, the total cumulative payments (pure and ULAE)
are denoted by
(pure)
(U LAE)
Ci,j = Ci,j
+ Ci,j
.
(A.2)
(U LAE)
Xi,j
(U LAE)
= Ci,j
(U LAE)
Ci,j1
(A.3)
need to be estimated: The main difficulty is that for each accounting year t I
we usually have only one aggregated observation
X (U LAE)
(U LAE)
=
(sum over t-diagonal).
(A.4)
Xt
Xi,j
i+j=t
0jJ
That is, ULAE payments are usually not available for single accident years; rather, we have one position "Total ULAE expenses" for each accounting year t (in general, ULAE charges are contained in the position "Administrative expenses" in the annual profit-and-loss statement). Hence, for the estimation of future ULAE payments, we first need to define an appropriate model in order to split the aggregated observations X_t^{(ULAE)} into the different accident years X_{i,j}^{(ULAE)}.
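The diagonal aggregation behind such accounting-year observations is easy to sketch in code; the triangle below is hypothetical dummy data, not taken from the text.

```python
# Sketch: aggregate incremental payments over accounting-year diagonals
# i + j = t, in the spirit of (A.4). The triangle X is hypothetical data.
def diagonal_sums(X):
    """X[i][j] = incremental payment of accident year i in development
    year j (only the observed upper triangle is stored)."""
    totals = {}
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            totals[i + j] = totals.get(i + j, 0.0) + x
    return totals

X = [[100.0, 50.0, 20.0],   # accident year 0
     [110.0, 55.0],         # accident year 1
     [120.0]]               # accident year 2
print(diagonal_sums(X))     # {0: 100.0, 1: 160.0, 2: 195.0}
```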
A.4 New York method
The New York method assumes that one part of the ULAE charge is proportional to the claims registration (denote this proportion by r \in [0,1]) and the other part is proportional to the settlement (payments) of the claims (proportion 1-r).

Assumption A.1 We assume that there are two development patterns (\beta_j)_{j=0,\ldots,J} and (\gamma_j)_{j=0,\ldots,J} with \beta_j \ge 0, \gamma_j \ge 0 for all j, and \sum_{j=0}^{J} \beta_j = \sum_{j=0}^{J} \gamma_j = 1, such that

X_{i,j}^{(pure)} = \gamma_j \, C_{i,J}^{(pure)}    (cashflow or payout pattern),    (A.5)

Z_{i,j}^{(pure)} = \sum_{l=0}^{j} \beta_l \, C_{i,J}^{(pure)}    (reporting pattern).    (A.6)
Remarks:

The payout pattern \gamma_j in (A.5) is easily estimated within the chain-ladder framework: 1/(\hat f_j \cdots \hat f_{J-1}) estimates the proportion of the ultimate claim paid up to development year j, so that

\hat\gamma_j^{CL} = \frac{1}{\hat f_j \cdots \hat f_{J-1}} - \frac{1}{\hat f_{j-1} \cdots \hat f_{J-1}}.    (A.7)

The estimation of the claims reporting pattern \beta_j in (A.6) is more delicate. As we have seen, there are not many claims reserving methods which give a reporting pattern \beta_j. Such a pattern can only be obtained if one separates the claims estimates for reported claims and IBNyR claims (incurred but not yet reported).
Model Assumptions A.2 Assume that there exists r \in [0,1] such that the incremental ULAE payments satisfy for all i and all j

X_{i,j}^{(ULAE)} = \left( r \, \beta_j + (1-r) \, \gamma_j \right) C_{i,J}^{(ULAE)}.    (A.8)

Henceforth, we assume that one part (r) of the ULAE charge is proportional to the reporting pattern (one has loss adjustment expenses at the registration of the claim), and the other part (1-r) of the ULAE charge is proportional to the claims settlement (measured by the payout pattern).
Definition A.3 (Paid-to-paid ratio) We define for all t

\pi_t = \frac{X_t^{(ULAE)}}{X_t^{(pure)}} = \frac{\sum_{i+j=t, \, 0 \le j \le J} X_{i,j}^{(ULAE)}}{\sum_{i+j=t, \, 0 \le j \le J} X_{i,j}^{(pure)}}.    (A.9)
The paid-to-paid ratio measures the ULAE payments relative to the pure claim
payments in each accounting year t.
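A minimal sketch of the paid-to-paid ratio for a single accounting year, using hypothetical diagonal payments:

```python
# Sketch: paid-to-paid ratio pi_t of (A.9) for one accounting year t,
# computed from the observed t-diagonals (hypothetical numbers).
def paid_to_paid(ulae_diag, pure_diag):
    """pi_t = total ULAE payments / total pure payments on the t-diagonal."""
    return sum(ulae_diag) / sum(pure_diag)

pure_t = [300.0, 200.0, 100.0]   # X_{i,j}^{(pure)} with i + j = t
ulae_t = [30.0, 20.0, 10.0]      # X_{i,j}^{(ULAE)} with i + j = t
print(paid_to_paid(ulae_t, pure_t))  # 0.1
```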
Lemma A.4 Assume there exists \pi > 0 such that for all accident years i we have

\frac{C_{i,J}^{(ULAE)}}{C_{i,J}^{(pure)}} = \pi.    (A.10)

Under Assumption A.1 and Model A.2 we have for all accounting years t

\pi_t = \pi,    (A.11)

whenever C_{i,J}^{(pure)} is constant in i.

© 2006 (M. Wüthrich, ETH Zürich & M. Merz, Uni Tübingen)
Proof of Lemma A.4. Using (A.4), (A.5) and (A.8) we obtain

\pi_t = \frac{\sum_{i+j=t, \, 0 \le j \le J} X_{i,j}^{(ULAE)}}{\sum_{i+j=t, \, 0 \le j \le J} X_{i,j}^{(pure)}} = \frac{\sum_{j=0}^{J} \left( r \beta_j + (1-r) \gamma_j \right) C_{t-j,J}^{(ULAE)}}{\sum_{j=0}^{J} \gamma_j \, C_{t-j,J}^{(pure)}} = \pi \, \frac{\sum_{j=0}^{J} \left( r \beta_j + (1-r) \gamma_j \right) C_{t-j,J}^{(pure)}}{\sum_{j=0}^{J} \gamma_j \, C_{t-j,J}^{(pure)}} = \pi,    (A.12)

where we have used (A.10) in the third step, and where the last step uses that C_{t-j,J}^{(pure)} is constant in j together with \sum_{j=0}^{J} \beta_j = \sum_{j=0}^{J} \gamma_j = 1.
For accident year i after development year j, the pure claims reserves and the reserves for IBNyR claims are given by

R_{i,j}^{(pure)} = \sum_{l>j} X_{i,l}^{(pure)} = \sum_{l>j} \gamma_l \, C_{i,J}^{(pure)},
R_{i,j}^{(IBNyR)} = C_{i,J}^{(pure)} - Z_{i,j}^{(pure)} = \sum_{l>j} \beta_l \, C_{i,J}^{(pure)},

and the reserves for reported claims are R_{i,j}^{(rep)} = R_{i,j}^{(pure)} - R_{i,j}^{(IBNyR)}, i.e.

R_{i,j}^{(pure)} = R_{i,j}^{(rep)} + R_{i,j}^{(IBNyR)}.    (A.13)

For the ULAE reserves we then obtain, using Model A.2 and (A.10),

R_{i,j}^{(ULAE)} = \sum_{l>j} X_{i,l}^{(ULAE)} = \sum_{l>j} \left( r \beta_l + (1-r) \gamma_l \right) C_{i,J}^{(ULAE)}    (A.14)
= \pi \sum_{l>j} \left( r \beta_l + (1-r) \gamma_l \right) C_{i,J}^{(pure)} = \pi \left( r \, R_{i,j}^{(IBNyR)} + (1-r) \, R_{i,j}^{(pure)} \right).
Remarks:

The total ULAE reserves at time t are then estimated via the observed paid-to-paid ratio \hat\pi by

\sum_{i+j=t} \hat R_{i,j}^{(ULAE)} = \hat\pi \left( r \sum_{i+j=t} R_{i,j}^{(IBNyR)} + (1-r) \sum_{i+j=t} R_{i,j}^{(pure)} \right),    (A.15)

i.e. all we need to know is how to split the total pure claims reserves into reserves for IBNyR claims and reserves for reported claims.
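A minimal sketch of this split, with hypothetical values for the paid-to-paid ratio, r, and the reserves:

```python
# Sketch of the split in (A.15): ULAE reserves from the pure reserves split
# into an IBNyR part and a reported part. pi, r and both reserve amounts
# are hypothetical illustration values.
def ulae_reserves(pi, r, R_ibnyr, R_pure):
    """R^(ULAE) = pi * (r * R^(IBNyR) + (1 - r) * R^(pure))."""
    return pi * (r * R_ibnyr + (1.0 - r) * R_pure)

pi, r = 0.10, 0.5
R_pure = 1000.0     # total pure claims reserves
R_ibnyr = 200.0     # part attributed to IBNyR claims (reported part: 800)
print(ulae_reserves(pi, r, R_ibnyr, R_pure))  # 60.0
```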
The assumptions for the New York method are rather restrictive in the sense that the pure cumulative ultimate claim C_{i,J}^{(pure)} must be constant in i (see Lemma A.4). Otherwise the paid-to-paid ratio \pi_t for accounting years is not the same as the ratio C_{i,J}^{(ULAE)}/C_{i,J}^{(pure)}, even if the latter is assumed to be constant. Of course, in practice the assumption of equal pure cumulative ultimate claims is never fulfilled. If we relax this condition we obtain the following lemma.
Lemma A.6 Assume there exists \pi > 0 such that for all accident years i we have

\frac{C_{i,J}^{(ULAE)}}{C_{i,J}^{(pure)}} = \pi \left( r \, \frac{\bar\beta_t}{\bar\gamma_t} + (1-r) \right)^{-1},    (A.16)

with

\bar\beta_t = \frac{\sum_{j=0}^{J} \beta_j \, C_{t-j,J}^{(pure)}}{\sum_{j=0}^{J} C_{t-j,J}^{(pure)}} \quad \text{and} \quad \bar\gamma_t = \frac{\sum_{j=0}^{J} \gamma_j \, C_{t-j,J}^{(pure)}}{\sum_{j=0}^{J} C_{t-j,J}^{(pure)}}.    (A.17)

Under Assumption A.1 and Model A.2 we have for all accounting years t

\pi_t = \pi.    (A.18)

Proof of Lemma A.6. As in Lemma A.4 we obtain

\pi_t = \pi \left( r \, \frac{\bar\beta_t}{\bar\gamma_t} + (1-r) \right)^{-1} \frac{\sum_{j=0}^{J} \left( r \beta_j + (1-r) \gamma_j \right) C_{t-j,J}^{(pure)}}{\sum_{j=0}^{J} \gamma_j \, C_{t-j,J}^{(pure)}} = \pi,    (A.19)

since the second factor equals r \, \bar\beta_t / \bar\gamma_t + (1-r).
Remarks:

If all pure cumulative ultimates are equal, then \bar\beta_t = \bar\gamma_t = 1/(J+1), the correction factor in (A.16) equals 1, and we can apply Lemma A.4.
Assume that there exists a constant i^{(p)} > 0 such that for all i \ge 0 we have C_{i+1,J}^{(pure)} = (1 + i^{(p)}) \, C_{i,J}^{(pure)}, i.e. constant growth i^{(p)}. If we blindly apply (A.11) of Lemma A.4 (i.e. we do not apply the correction factor in (A.16)) and estimate the incremental ULAE payments by (A.13) and (A.15), we obtain

\sum_{i+j=t} \hat X_{i,j}^{(ULAE)} = \frac{X_t^{(ULAE)}}{X_t^{(pure)}} \sum_{j=0}^{J} \left( r \beta_j + (1-r) \gamma_j \right) C_{t-j,J}^{(pure)}    (A.20)
= \left( r \, \frac{\sum_{j=0}^{J} \beta_j \, (1+i^{(p)})^{J-j}}{\sum_{j=0}^{J} \gamma_j \, (1+i^{(p)})^{J-j}} + (1-r) \right) \sum_{i+j=t} X_{i,j}^{(ULAE)} > \sum_{i+j=t} X_{i,j}^{(ULAE)},

where the last inequality in general holds true for i^{(p)} > 0, since usually the reporting pattern (\beta_j)_j is more concentrated than the payout pattern (\gamma_j)_j, i.e. we usually have

\sum_{l=0}^{j} \beta_l > \sum_{l=0}^{j} \gamma_l \quad \text{for } j = 0, \ldots, J-1.    (A.21)

This comes from the fact that claims are reported before they are paid. That is, if we blindly apply the New York method under constant positive growth, then the ULAE reserves are too high (for constant negative growth we obtain the opposite sign). This implies that we always have a positive loss experience on ULAE reserves under constant positive growth.
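This bias can be checked numerically; the patterns, growth rate, and r below are illustrative assumptions, not values from the text.

```python
# Numeric check of the bias under constant growth g > 0: the blind estimator
# scales the true diagonal ULAE payments by the bracket in (A.20). Because
# the reporting pattern beta is more concentrated in early development years
# than the payout pattern gamma (claims are reported before they are paid),
# the factor exceeds 1. All parameter values are illustrative.
beta = [0.9, 0.1, 0.0, 0.0, 0.0]    # reporting pattern
gamma = [0.3, 0.2, 0.2, 0.2, 0.1]   # payout pattern
J, g, r = 4, 0.05, 0.5              # 5% growth, weight r = 50%

w = [(1.0 + g) ** (J - j) for j in range(J + 1)]   # growth weights
factor = r * sum(b * wj for b, wj in zip(beta, w)) \
           / sum(c * wj for c, wj in zip(gamma, w)) + (1.0 - r)
print(factor > 1.0)  # True: blindly applied, the ULAE reserves are too high
```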
A.5 Example

We assume that the observations for \pi_t are generated by i.i.d. random variables X_t^{(ULAE)}/X_t^{(pure)}; hence we can estimate \pi from this sequence. Assume \pi = 10%. Moreover, i^{(p)} = 0, and set r = 50% (this is the usual choice, also done in the SST [73]).
Moreover, we assume that we have the following reporting and cashflow patterns (J = 4):

(\beta_0, \ldots, \beta_4) = (90\%, 10\%, 0\%, 0\%, 0\%),    (A.22)
(\gamma_0, \ldots, \gamma_4) = (30\%, 20\%, 20\%, 20\%, 10\%),    (A.23)

and a pure cumulative ultimate claim C_{i,4}^{(pure)} = 1000 for all accident years i, i.e.

X_{i,0}^{(pure)}, \ldots, X_{i,4}^{(pure)} = (300, 200, 200, 200, 100).    (A.24)

The estimated incremental ULAE payments (A.8) are then

\hat X_{i,0}^{(ULAE)}, \ldots, \hat X_{i,4}^{(ULAE)} = (60, 15, 10, 10, 5).    (A.25)

Hence for the total estimated payments \hat X_{i,j} = X_{i,j}^{(pure)} + \hat X_{i,j}^{(ULAE)} we have

\hat X_{i,0}, \ldots, \hat X_{i,4} = (360, 215, 210, 210, 105).    (A.26)
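These example figures can be reproduced in a few lines. The payout pattern and the pure ultimate claim of 1000 are values inferred from the stated totals, so treat them as assumptions of this sketch.

```python
# Reproducing the example: pi = 10%, r = 50%; the payout pattern gamma and
# the pure ultimate claim of 1000 are inferred/assumed values.
pi, r, C_ult = 0.10, 0.5, 1000.0
beta = [0.9, 0.1, 0.0, 0.0, 0.0]       # reporting pattern
gamma = [0.3, 0.2, 0.2, 0.2, 0.1]      # payout pattern (assumed)

X_pure = [g_j * C_ult for g_j in gamma]               # pure payments, (A.5)
X_ulae = [pi * (r * b_j + (1 - r) * g_j) * C_ult      # ULAE payments, (A.8)
          for b_j, g_j in zip(beta, gamma)]
X_total = [round(p + u) for p, u in zip(X_pure, X_ulae)]
print(X_total)  # [360, 215, 210, 210, 105]
```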
Appendix B

Distributions

B.1 Discrete distributions

B.1.1 Binomial distribution

For n \in \mathbb{N} and p \in (0,1), the binomial distribution Bin(n, p) is the discrete distribution with probability function

f_{n,p}(x) = \binom{n}{x} \, p^x \, (1-p)^{n-x} \quad \text{for } x = 0, \ldots, n.    (B.1)

E(X) = n p, \quad Var(X) = n p (1-p), \quad Vco(X) = \sqrt{\frac{1-p}{n p}}

Table B.1: Expectation, variance and variational coefficient of a Bin(n, p)-distributed random variable X
B.1.2 Poisson distribution

For \lambda > 0, the Poisson distribution Poisson(\lambda) is the discrete distribution with probability function

f_\lambda(x) = e^{-\lambda} \, \frac{\lambda^x}{x!} \quad \text{for } x \in \mathbb{N}_0.    (B.2)

E(X) = \lambda, \quad Var(X) = \lambda, \quad Vco(X) = 1/\sqrt{\lambda}

Table B.2: Expectation, variance and variational coefficient of a Poisson(\lambda)-distributed random variable X
B.1.3 Negative binomial distribution

For r \in (0, \infty) and p \in (0,1), the negative binomial distribution NB(r, p) is defined to be the discrete distribution with probability function

f_{r,p}(x) = \binom{r+x-1}{x} \, p^r \, (1-p)^x    (B.3)

for all x \in \mathbb{N}_0. For \alpha \in \mathbb{R} and n \in \mathbb{N}_0, the generalized binomial coefficient is defined to be

\binom{\alpha}{n} = \frac{\alpha (\alpha - 1) \cdots (\alpha - n + 1)}{n!} = \prod_{k=1}^{n} \frac{\alpha - k + 1}{k}.    (B.4)

E(X) = r \, \frac{1-p}{p}, \quad Var(X) = r \, \frac{1-p}{p^2}, \quad Vco(X) = \frac{1}{\sqrt{r(1-p)}}

Table B.3: Expectation, variance and variational coefficient of a NB(r, p)-distributed random variable X
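As a numerical sanity check of (B.3) against Table B.3, one can evaluate the probability function via log-Gamma functions (for numerical stability) and compare a truncated first moment with the closed form; the parameters r, p are illustrative.

```python
# Sanity check: negative binomial probability function (B.3), evaluated via
# log-Gamma functions, compared with the mean from Table B.3.
import math

def nb_pmf(x, r, p):
    # log binom(r + x - 1, x) = lgamma(r + x) - lgamma(r) - lgamma(x + 1)
    log_coef = math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
    return math.exp(log_coef + r * math.log(p) + x * math.log(1.0 - p))

r, p = 2.5, 0.4
mean = sum(x * nb_pmf(x, r, p) for x in range(400))  # truncated first moment
print(abs(mean - r * (1.0 - p) / p) < 1e-9)  # True
```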
B.2 Continuous distributions

B.2.1 Normal distribution

For \mu \in \mathbb{R} and \sigma^2 > 0, the Normal distribution N(\mu, \sigma^2) has density

f_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \quad \text{for } x \in \mathbb{R}.    (B.5)

E(X) = \mu, \quad Var(X) = \sigma^2, \quad Vco(X) = \sigma/\mu

Table B.4: Expectation, variance and variational coefficient of a N(\mu, \sigma^2)-distributed random variable X

B.2.2 Log-normal distribution

X is LN(\mu, \sigma^2)-distributed if \log X is N(\mu, \sigma^2)-distributed, i.e. X has density

f_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi} \, \sigma \, x} \exp\left( -\frac{(\log x - \mu)^2}{2\sigma^2} \right) 1_{(0,\infty)}(x).    (B.6)

E(X) = e^{\mu + \sigma^2/2}, \quad Var(X) = e^{2\mu + \sigma^2} \left( e^{\sigma^2} - 1 \right), \quad Vco(X) = \sqrt{e^{\sigma^2} - 1}

Table B.5: Expectation, variance and variational coefficient of a LN(\mu, \sigma^2)-distributed random variable X
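The closed forms of the log-normal moments can be checked for internal consistency (Vco(X) = \sqrt{Var(X)}/E(X)); the parameters below are illustrative.

```python
# Sketch: moments of a log-normal LN(mu, sigma^2) from the closed forms
# above, checked for internal consistency. Parameters are illustrative.
import math

def lognormal_moments(mu, sigma2):
    mean = math.exp(mu + sigma2 / 2.0)
    var = math.exp(2.0 * mu + sigma2) * (math.exp(sigma2) - 1.0)
    vco = math.sqrt(math.exp(sigma2) - 1.0)   # depends on sigma^2 only
    return mean, var, vco

m, v, c = lognormal_moments(mu=0.5, sigma2=1.0)
print(abs(math.sqrt(v) / m - c) < 1e-12)  # True: Vco = sqrt(Var)/E
```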
B.2.3 Gamma distribution

For \gamma > 0 and c > 0, the Gamma distribution \Gamma(\gamma, c) has density

f_{\gamma,c}(x) = \frac{c^\gamma}{\Gamma(\gamma)} \, x^{\gamma-1} \, e^{-cx} \, 1_{(0,\infty)}(x),    (B.7)

where

\Gamma(\gamma) = \int_0^\infty x^{\gamma-1} e^{-x} \, dx    (B.8)

is called the Gamma function. The parameters \gamma and c are called shape and scale, respectively. The Gamma function has the following properties:

1) \Gamma(1) = 1.
2) \Gamma(1/2) = \sqrt{\pi}.
3) \Gamma(\gamma + 1) = \gamma \, \Gamma(\gamma).

E(X) = \gamma/c, \quad Var(X) = \gamma/c^2, \quad Vco(X) = 1/\sqrt{\gamma}

Table B.6: Expectation, variance and variational coefficient of a \Gamma(\gamma, c)-distributed random variable X
B.2.4 Beta distribution

For a > 0 and b > 0, the Beta distribution Beta(a, b) has density

f_{a,b}(x) = \frac{1}{B(a, b)} \, x^{a-1} \, (1-x)^{b-1} \, 1_{(0,1)}(x),    (B.9)

where

B(a, b) = \frac{\Gamma(a) \, \Gamma(b)}{\Gamma(a + b)}.    (B.10)

E(X) = \frac{a}{a+b}, \quad Var(X) = \frac{ab}{(a+b)^2 (a+b+1)}, \quad Vco(X) = \sqrt{\frac{b}{a(a+b+1)}}

Table B.7: Expectation, variance and variational coefficient of a Beta(a, b)-distributed random variable X
Bibliography
[1] Abraham, B., Ledolter, J. (1983), Statistical Methods for Forecasting. John Wiley
and Sons, NY.
[2] Alba de, E. (2002), Bayesian estimation of outstanding claim reserves. North American Act. J. 6/4, 1-20.
[3] Alba de, E., Corzo, M.A.R. (2006), Bayesian claims reserving when there are negative values in the runoff triangle. Actuarial Research Clearing House, Jan 01, 2006.
[4] Alba de, E. (2006), Claims reserving when there are negative values in the runoff
triangle: Bayesian analysis using the three-parameter log-normal distribution. North
American Act. J. 10/3, 45-59.
[5] Arjas, E. (1989), The claims reserving problem in non-life insurance: some structural
ideas. ASTIN Bulletin 19/2, 139-152.
[6] Barnett, G., Zehnwirth, B. (1998), Best estimate reserves. CAS Forum, 1-54.
[7] Barnett, G., Zehnwirth, B. (2000), Best estimates for reserves. Proc. CAS, Vol.
LXXXII, 245-321.
[8] Benktander, G. (1976), An approach to credibility in calculating IBNR for casualty
excess reinsurance. The Actuarial Review, April 1976, p.7.
[9] Bernardo, J.M., Smith, A.F.M. (1994), Bayesian Theory. John Wiley and Sons, NY.
[10] Bornhuetter, R.L., Ferguson, R.E. (1972), The actuary and IBNR. Proc. CAS,
Vol. LIX, 181-195.
[11] Buchwalder, M., Bühlmann, H., Merz, M., Wüthrich, M.V. (2005), Legal valuation portfolio in non-life insurance. Conference paper, presented at the 36th International ASTIN Colloquium, 4-7 September 2005, ETH Zürich. www.astin2005.ch
[12] Buchwalder, M., Bühlmann, H., Merz, M., Wüthrich, M.V. (2006), Estimation of unallocated loss adjustment expenses. Bulletin SAA 2006/1, 43-53.
[13] Buchwalder, M., Bühlmann, H., Merz, M., Wüthrich, M.V. (2006), The mean square error of prediction in the chain ladder reserving method (Mack and Murphy revisited). To appear in ASTIN Bulletin 36/2.
[22] Efron, B., Tibshirani, R.J. (1995), An Introduction to the Bootstrap. Chapman &
Hall, NY.
[23] England, P.D., Verrall, R.J. (1999), Analytic and bootstrap estimates of prediction
errors in claims reserving. Insurance: Math. Econom. 25, 281-293.
[24] England, P.D., Verrall, R.J. (2001), A flexible framework for stochastic claims reserving. Proc. CAS, Vol. LXXXIII, 1-18.
[25] England, P.D., Verrall, R.J. (2002), Stochastic claims reserving in general insurance.
British Act. J. 8/3, 443-518.
[26] Feldblum, S. (2002), Completing and using schedule P. CAS Forum, 353-590.
[27] Finney, D.J. (1941), On the distribution of a variate whose logarithm is normally distributed. JRSS Suppl. 7, 155-161.
[28] Gerber, H.U., Jones, D.A. (1975), Credibility formulas of the updating type. In:
Credibility: Theory and Applications, P.M. Kahn (Ed.), Academic Press, NY.
[29] Gisler, A. (2006), The estimation error in the chain-ladder reserving method: a
Bayesian approach. To appear in ASTIN Bulletin 36/2.
[30] Gogol, D. (1993), Using expected loss ratios in reserving. Insurance: Math. Econom. 12, 297-299.
[31] Hachemeister, C.A. (1975), Credibility for regression models with application to
trend. In: Credibility: Theory and Applications, P.M. Kahn (Ed.), Academic Press,
NY.
[32] Haastrup, S., Arjas, E. (1996), Claims reserving in continuous time; a nonparametric
Bayesian approach. ASTIN Bulletin 26/2, 139-164.
[33] Herbst, T. (1999), An application of randomly truncated data models in reserving
IBNR claims. Insurance: Math. Econom. 25, 123-131.
[34] Hertig, J. (1985), A statistical approach to the IBNR-reserves in marine insurance.
ASTIN Bulletin 15, 171-183.
[35] Hesselager, O., Witting, T. (1988), A credibility model with random fluctuations in delay probabilities for the prediction of IBNR claims. ASTIN Bulletin 18, 79-90.
[36] Hesselager, O. (1991), Prediction of outstanding claims: A hierarchical credibility
approach. Scand. Act. J. 1991, 25-47.
[37] Hovinen, E. (1981), Additive and continuous IBNR. ASTIN Colloquium Loen, Norway.
[38] Jewell, W.S. (1976), Two classes of covariance matrices giving simple linear forecasts.
Scand. Act. J. 1976, 15-29.
[39] Jewell, W.S. (1989), Predicting IBNyR events and delays. ASTIN Bulletin 19/1,
25-55.
[40] Jones, A.R., Copeman, P.J., Gibson, E.R., Line, N.J.S., Lowe, J.A., Martin, P.,
Matthews, P.N., Powell, D.S. (2006), A change agenda for reserving. Report of
the general insurance reserving issues taskforce (GRIT). Institute of Actuaries and
Faculty of Actuaries.
[41] Jong de, P. (2006), Forecasting runoff triangles. North American Act. J. 10/2, 28-38.
[42] Jong de, P., Zehnwirth, B. (1983), Claims reserving, state-space models and the
Kalman filter. J.I.A. 110, 157-182.
[43] Jørgensen, B., de Souza, M.C.P. (1994), Fitting Tweedie's compound Poisson model to insurance claims data. Scand. Act. J. 1994, 69-93.
[44] Kremer, E. (1982), IBNR claims and the two way model of ANOVA.
Scand. Act. J. 1982, 47-55.
[45] Larsen, C.R. (2005), A dynamic claims reserving model. Conference paper, presented at the 36th International ASTIN Colloquium, 4-7 September 2005, ETH Zürich. www.astin2005.ch
[46] Lyons, G., Forster, W., Kedney, P., Warren, R., Wilkinson, H. (2002),
Claims reserving working party paper. General Insurance Convention 2002.
http://www.actuaries.org.uk
[47] Mack, T. (1990), Improved estimation of IBNR claims by credibility. Insurance:
Math. Econom. 9, 51-57.
[48] Mack, T. (1991), A simple parametric model for rating automobile insurance or
estimating IBNR claims reserves. ASTIN Bulletin 21/1, 93-109.
[49] Mack, T. (1993), Distribution-free calculation of the standard error of chain ladder
reserve estimates. ASTIN Bulletin 23/2, 213-225.
[50] Mack, T. (1994), Measuring the variability of chain ladder reserve estimates. CAS
Forum (Spring), 101-182.
[51] Mack, T. (2000), Credible claims reserves: The Benktander method. ASTIN Bulletin
30/2, 333-347.
[52] Mack, T., Quarg, G., Braun, C. (2006), The mean square error of prediction in the
chain ladder reserving method - a comment. To appear in ASTIN Bulletin 36/2.
[53] McCullagh, P., Nelder, J.A. (1989), Generalized Linear Models. 2nd edition, Chapman & Hall, London.
[54] Merz, M., Wüthrich, M.V. (2006), A credibility approach to the Munich chain-ladder method. To appear in Blätter DGVFM.
[55] Murphy, D.M. (1994), Unbiased loss development factors. Proc. CAS, Vol. LXXXI,
154-222.
[56] Neuhaus, W. (1992), Another pragmatic loss reserving method or Bornhuetter/Ferguson revisited. Scand. Act. J. 1992, 151-162.
[57] Neuhaus, W. (2004), On the estimation of outstanding claims. Conference paper,
presented at the 35th International ASTIN Colloquium 2004, Bergen, Norway.
[58] Norberg, R. (1993), Prediction of outstanding liabilities in non-life insurance. ASTIN
Bulletin 23/1, 95-115.
[59] Norberg, R. (1999), Prediction of outstanding liabilities II. Model variations and
extensions. ASTIN Bulletin 29/1, 5-25.
[60] Ntzoufras, I., Dellaportas, P. (2002), Bayesian modelling of outstanding liabilities
incorporating claim count uncertainty. North American Act. J. 6/1, 113-128.
[61] Partrat, C., Pey, N., Schilling, J. (2005), Delta method and reserving. Conference paper, presented at the 36th International ASTIN Colloquium, 4-7 September 2005, ETH Zürich. www.astin2005.ch
[62] Quarg, G., Mack, T. (2004), Munich chain ladder. Blätter DGVFM, Band XXVI, 597-630.
[63] Radtke, M., Schmidt, K.D. (2004), Handbuch zur Schadenreservierung. Verlag Versicherungswirtschaft, Karlsruhe.
[64] Renshaw, A.E. (1994), Claims reserving by joint modelling. Actuarial research paper
no. 72, Department of Actuarial Sciences and Statistics, City University, London.
[65] Renshaw, A.E., Verrall, R.J. (1998), A stochastic model underlying the chain ladder
technique. British Act. J. 4/4, 903-923.
[66] Ross, S.M. (1985), Introduction to Probability Models, 3rd edition. Academic Press, Orlando, Florida.
[67] Sandström, A. (2006), Solvency: Models, Assessment and Regulation. Chapman & Hall/CRC.
[68] Schmidt, K.D., Schaus, A. (1996), An extension of Mack's model for the chain-ladder method. ASTIN Bulletin 26, 247-262.
[69] Schnieper, R. (1991), Separating true IBNR and IBNER claims. ASTIN Bulletin 21,
111-127.
[70] Scollnik, D.P.M. (2002), Implementation of four models for outstanding liabilities in
WinBUGS: A discussion of a paper by Ntzoufras and Dellaportas. North American
Act. J. 6/1, 113-128.
[71] Smyth, G.K., Jørgensen, B. (2002), Fitting Tweedie's compound Poisson model to insurance claims data: dispersion modelling. ASTIN Bulletin 32, 143-157.
[72] Srivastava, V.K., Giles, D.E. (1987), Seemingly Unrelated Regression Equation Models: Estimation and Inference. Marcel Dekker, NY.
[73] Swiss Solvency Test (2005), BPV SST Technisches Dokument, Version 22.Juni 2005.
Available under www.sav-ausbildung.ch
[74] Taylor, G. (1987), Regression models in claims analysis I: theory. Proc. CAS,
Vol. XLVI, 354-383.
[75] Taylor, G. (2000), Loss Reserving: An Actuarial Perspective. Kluwer Academic
Publishers.
[76] Taylor, G., McGuire, G. (2005), Synchronous bootstrapping of seemingly unrelated regressions. Conference paper, presented at the 36th International ASTIN Colloquium, 4-7 September 2005, ETH Zürich. www.astin2005.ch
[77] Venter, G.G. (1998), Testing the assumptions of age-to-age factors. Proc. CAS,
Vol. LXXXV, 807-847.
[78] Venter, G.G. (2006), Discussion of MSEP in the CLRM (MMR). To appear in
ASTIN Bulletin 36/2.
[79] Verrall, R.J. (1990), Bayesian and empirical Bayes estimation for the chain ladder
model. ASTIN Bulletin 20/2, 217-238.
[80] Verrall, R.J. (1991), On the estimation of reserves from loglinear models. Insurance:
Math. Econom. 10, 75-80.
[81] Verrall, R.J. (2000), An investigation into stochastic claims reserving models and
the chain-ladder technique. Insurance: Math. Econom. 26, 91-99.
[82] Verrall, R.J. (2004), A Bayesian generalized linear model for the BornhuetterFerguson method of claims reserving. North American Act. J. 8/3, 67-89.
[83] Verdier, B., Klinger, A. (2005), JAB Chain: A model-based calculation of paid and incurred loss development factors. Conference paper, presented at the 36th International ASTIN Colloquium, 4-7 September 2005, ETH Zürich. www.astin2005.ch
[84] Vylder De, F. (1982), Estimation of IBNR claims by credibility theory. Insurance:
Math. Econom. 1, 35-40.
[85] Vylder De, F., Goovaerts, M.J. (1979), Proceedings of the first meeting of
the contact group Actuarial Sciences. KU Leuven, nl. 7904B wettelijk report:
D/1979/23761/5.
[86] Wright, T.S. (1990), A stochastic method for claims reserving in general insurance.
J.I.A. 117, 677-731.
[87] Wüthrich, M.V. (2003), Claims reserving using Tweedie's compound Poisson model. ASTIN Bulletin 33/2, 331-346.
[88] Wüthrich, M.V. (2006), Premium liability risks: modeling small claims. Bulletin SAA 2006/1, 27-38.
[89] Wüthrich, M.V. (2006), Using a Bayesian approach for claims reserving. Accepted for publication in CAS Journal.
[90] Wüthrich, M.V. (2006), Prediction error in the chain ladder method. Preprint.
[91] Wüthrich, M.V., Bühlmann, H., Furrer, H. (2006), Lecture notes on market consistent actuarial valuation. ETH Zürich, Summer Term 2006.
[92] Zehnwirth, B. (1998), ICRFS-Plus 8.3 Manual. Insureware Pty Ltd. St. Kilda, Australia.