
Exam 4/C Notes

0.1 Functions, Moments and Functions of Moments


The distribution function $F(x)$ is defined as $F(x) = \Pr(X \le x)$.

The survival function $S(x)$ is defined as $S(x) = 1 - F(x)$.

The hazard rate function $h(x)$ is defined as $h(x) = \dfrac{f(x)}{S(x)} = -\dfrac{d}{dx}\ln S(x)$.

The cumulative hazard rate function $H(x)$ is defined as $H(x) = \int_0^x h(t)\,dt = -\ln S(x)$.

The $n$th raw moment $\mu'_n$ is defined as $\mu'_n = E[X^n]$.

The $n$th central moment $\mu_n$ is defined as $\mu_n = E[(X - \mu)^n]$.

The variance $\mathrm{Var}[X]$ is defined as $\mathrm{Var}[X] = E[(X - \mu)^2] = \sigma^2$.

The standard deviation $\sigma$ is defined as $\sigma = \sqrt{\mathrm{Var}[X]} = \sqrt{E[(X - \mu)^2]}$.

The skewness $\gamma_1$ is defined as $\gamma_1 = \dfrac{\mu_3}{\sigma^3} = \dfrac{E[(X - \mu)^3]}{E[(X - \mu)^2]^{3/2}}$.

The kurtosis $\gamma_2$ is defined as $\gamma_2 = \dfrac{\mu_4}{\sigma^4} = \dfrac{E[(X - \mu)^4]}{E[(X - \mu)^2]^2}$.

The coefficient of variation is $\sigma/\mu$.

The covariance is $\mathrm{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - E[X]E[Y]$.

The correlation coefficient $\rho_{XY}$ is defined as $\rho_{XY} = \dfrac{\mathrm{Cov}[X, Y]}{\sigma_X \sigma_Y}$.

The 100$p$th percentile $\pi_p$ is any number such that $F(\pi_p) = p$. Another name for $\pi_p$ is the Value-at-Risk at security level $p$, also denoted $\mathrm{VaR}_p(X)$.

The median is $\pi_{0.5}$.

The mode is the $x$ which maximizes $f(x)$.

The moment generating function $M_X(t)$ is defined as $M_X(t) = E[e^{tX}]$.

The raw moments are the derivatives of the moment generating function at the origin: $\mu'_n = E[X^n] = M_X^{(n)}(0)$.

The probability generating function $P_X(t)$ is defined as $P_X(t) = E[t^X] = M_X(\ln t)$.

The Taylor coefficients of $P_X(t)$ are the probabilities of $X$: $p_n = \Pr(X = n) = \dfrac{1}{n!} P_X^{(n)}(0)$.

A sample is a set of observations from n independent and identically distributed random variables.

A sample mean is the mean of the sample: sum of the observations divided by number of observations.

The covariance matrix of $n$ random variables $X_1, X_2, \ldots, X_n$ is the matrix $[a_{ij}]_{i,j} = [\mathrm{Cov}[X_i, X_j]]_{i,j}$. The covariance matrix is symmetric and positive semi-definite.

0.2 Useful Results from Probability Theory
Bayes Theorem

Discrete: $\Pr(A|B)\Pr(B) = \Pr(B|A)\Pr(A)$.


Continuous: $f_X(x|y)\,f_Y(y) = f_Y(y|x)\,f_X(x)$.

Law of Total Probability


Discrete: If the $B_i$'s are a partition of the space, then $\Pr(A) = \sum_i \Pr(B_i)\Pr(A|B_i)$.

Continuous: $\Pr(A) = \int \Pr(A|x)\,f(x)\,dx$.

Conditional Mean Formula: $E_X[g(X)] = E_I\left[E_X[g(X)|I]\right]$.

Conditional Variance Formula: $\mathrm{Var}[X] = E_I\left[\mathrm{Var}_X[X|I]\right] + \mathrm{Var}_I\left[E_X[X|I]\right]$.

Compound Mean and Variance Formula: If the $X_i$'s are independent and identically distributed, $N$ is independent of each $X_i$, and $S = X_1 + X_2 + \cdots + X_N$, then

$P_S(z) = P_N(P_X(z)), \qquad E\left[\sum_{i=1}^{N} X_i\right] = E[N]E[X], \qquad \mathrm{Var}\left[\sum_{i=1}^{N} X_i\right] = E[N]\mathrm{Var}[X] + \mathrm{Var}[N]E[X]^2.$

(The mean and variance formulas follow from differentiating the pgf.)
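As a quick numerical check of the compound mean and variance formulas, a minimal simulation sketch (the Poisson frequency and normal severity parameters are assumptions chosen for illustration):

    import numpy as np

    # Illustrative check of E[S] = E[N]E[X] and
    # Var[S] = E[N]Var[X] + Var[N]E[X]^2 for a compound sum.
    rng = np.random.default_rng(42)
    lam, mu, sigma = 3.0, 10.0, 4.0   # assumed Poisson mean, normal severity params

    n = rng.poisson(lam, size=100_000)                         # frequency N per trial
    s = np.array([rng.normal(mu, sigma, k).sum() for k in n])  # S = X_1 + ... + X_N
    print(s.mean(), lam * mu)                                  # E[S] = E[N]E[X]
    # For Poisson N, E[N] = Var[N] = lam:
    print(s.var(), lam * sigma**2 + lam * mu**2)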
Variance of Sample Mean: $\mathrm{Var}[\bar X] = \dfrac{1}{n}\mathrm{Var}[X]$.
Bernoulli Shortcut: If $X$ is Bernoulli with values $a$ or $b$, with probabilities $q$ or $1 - q$, then $\mathrm{Var}[X] = (a - b)^2\, q(1 - q)$.

Continuity Correction: If $X$ is discrete and we want to approximate $X$ using a continuous variable $Y$, then we make the following continuity correction: If $a$ and $b$ are two consecutive values of $X$ and $c \in (a, b)$, then

$\Pr[X \le a] = \Pr[X < c] = \Pr[X < b] \approx F_Y\!\left(\tfrac{1}{2}(a + b)\right).$

$\Pr[a < X] = \Pr[c < X] = \Pr[b \le X] \approx 1 - F_Y\!\left(\tfrac{1}{2}(a + b)\right).$
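For example, approximating $X \sim$ Binomial(100, 0.5) by a normal $Y$, $\Pr[X \le 45]$ becomes $F_Y(45.5)$. A minimal sketch (the parameters are assumed):

    from scipy.stats import binom, norm

    # Assumed example: X ~ Binomial(n=100, p=0.5), approximated by
    # Y ~ Normal(np, np(1-p)). Continuity correction evaluates F_Y at 45.5.
    n, p = 100, 0.5
    exact = binom.cdf(45, n, p)
    approx = norm.cdf(45.5, loc=n * p, scale=(n * p * (1 - p)) ** 0.5)
    print(exact, approx)   # both close to 0.184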

0.3 The Linear Exponential Family


A distribution is said to be in the linear exponential family if its density function has the form $f(x) = \dfrac{p(x)\,e^{x\,r(\theta)}}{q(\theta)}$.

If $X$ is a member of the linear exponential family, then $E[X|\theta] = \mu(\theta) = \dfrac{q'(\theta)}{r'(\theta)\,q(\theta)}$ and $\mathrm{Var}[X|\theta] = \dfrac{\mu'(\theta)}{r'(\theta)}$.

0.4 Modified Variables for Insurance


In the following, let X denote the loss variable.

The limited loss variable of $X$ (with limit $u$) is $X \wedge u = \min\{X, u\}$.

The mean of the limited loss variable is called the limited expected value and is given by $E[X \wedge u] = \int_0^u S(x)\,dx$.

If $Y$ is the payment variable with an ordinary deductible $d$, then $Y = \max\{0, X - d\}$. This variable is called the payment per loss variable and is denoted by $Y^L = (X - d)_+$. We have $F_{Y^L}(x) = F_X(x + d)$.

$X = (X \wedge d) + (X - d)_+ = (X \wedge d) + Y^L.$

The payment per payment variable $Y^P$ is defined as $Y^P = (X - d)_+ \,|\, X > d$. We have $F_{Y^P}(x) = \dfrac{F_X(x + d) - F_X(d)}{1 - F_X(d)}$.

The mean of $Y^P$ is denoted by $e_X(d)$ and is called the mean excess loss.

$E[X] = E[X \wedge d] + e(d)\,S(d).$

The loss elimination ratio is $\dfrac{E[X \wedge d]}{E[X]}$.

The Tail-Value-at-Risk at security level $p$ is $\mathrm{TVaR}_p(X) = E[X \mid X > \pi_p] = \pi_p + e(\pi_p)$.
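As a worked illustration of these quantities, the sketch below computes $E[X \wedge d]$, the mean excess loss, and the loss elimination ratio numerically; the exponential loss and its parameter are assumptions for the example:

    import numpy as np
    from scipy.integrate import quad

    # Illustrative example: exponential loss with mean theta, deductible d.
    theta, d = 1000.0, 250.0
    S = lambda x: np.exp(-x / theta)      # survival function
    EX = quad(S, 0, np.inf)[0]            # E[X] = integral of S over [0, inf)
    EXd = quad(S, 0, d)[0]                # E[X ^ d] = integral of S over [0, d]
    e_d = (EX - EXd) / S(d)               # from E[X] = E[X ^ d] + e(d) S(d)
    print("E[X ^ d] =", EXd)              # theta(1 - e^{-d/theta}) ~ 221.2
    print("e(d)     =", e_d)              # memoryless: equals theta = 1000
    print("LER      =", EXd / EX)         # loss elimination ratio ~ 0.2212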

0.5 Coherent Risk Measures


A real-valued function $\rho$ of random variables is said to be a coherent risk measure if it has the following 4 properties:

Translation Invariance: $\rho(X + c) = \rho(X) + c$.

Positive Homogeneity: $c > 0 \Rightarrow \rho(cX) = c\,\rho(X)$.

Subadditivity: $\rho(X + Y) \le \rho(X) + \rho(Y)$.

Monotonicity: $\Pr(X \le Y) = 1 \Rightarrow \rho(X) \le \rho(Y)$.

0.6 The (a, b, 0) and (a, b, 1) Classes


Any discrete distribution satisfying $p_k = \left(a + \dfrac{b}{k}\right) p_{k-1}$, $k \ge 1$, is said to be in the $(a, b, 0)$ class.

Any discrete distribution satisfying $p_k = \left(a + \dfrac{b}{k}\right) p_{k-1}$, $k \ge 2$, is said to be in the $(a, b, 1)$ class.

The constant $a$ is 0 for Poisson, negative for binomial and positive for negative binomial.

If $p_k$ and $\tilde p_k$ are two distributions in the $(a, b, 1)$ class (with the same $a$ and $b$) with pgf $P(z)$ and $\tilde P(z)$ respectively, then $\dfrac{P(z) - p_0}{1 - p_0} = \dfrac{\tilde P(z) - \tilde p_0}{1 - \tilde p_0}$.

If $p_k$ and $\tilde p_k$ are two distributions in the $(a, b, 1)$ class (with the same $a$ and $b$), then $\dfrac{p_k}{1 - p_0} = \dfrac{\tilde p_k}{1 - \tilde p_0}$ for $k \ge 1$.

A member of the $(a, b, 1)$ class is said to be zero-truncated if $p_0 = 0$; otherwise it is said to be zero-modified.
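A minimal sketch of the $(a, b, 0)$ recursion, using the Poisson case $a = 0$, $b = \lambda$, $p_0 = e^{-\lambda}$ as the assumed example:

    import math

    def ab0_pmf(a, b, p0, kmax):
        """Generate p_0..p_kmax via the (a,b,0) recursion p_k = (a + b/k) p_{k-1}."""
        p = [p0]
        for k in range(1, kmax + 1):
            p.append((a + b / k) * p[-1])
        return p

    # Poisson(lambda=2): a = 0, b = lambda, p0 = exp(-lambda)
    lam = 2.0
    p = ab0_pmf(a=0.0, b=lam, p0=math.exp(-lam), kmax=5)
    # matches the Poisson pmf e^{-lam} lam^k / k!
    print(p[3], math.exp(-lam) * lam**3 / math.factorial(3))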

0.7 Poisson/Gamma
If loss frequency is Poisson with parameter $\lambda$ and $\lambda$ is gamma with parameters $(\alpha, \theta)$, then the unconditional loss frequency for an insured is negative binomial with parameters $r = \alpha$ and $\beta = \theta$. (Follows from looking at the pgf.)

0.8 Coverage Modifications


If loss frequency has pgf of the form $B(\beta(z - 1))$, and the coverage is modified so that the probability of a payment is $v$, then the payment frequency has pgf $B(v\beta(z - 1))$.

If a per-loss deductible $d$ is introduced, then we can find the new expected annual aggregate payments by modifying the loss frequency distribution to the payment frequency, changing loss severity to payment per payment, and using

E[annual aggregate payments] = E[number of payments] $\cdot$ E[payment per payment].

0.9 Aggregate Claims when Severity is Discrete
If $S$ jumps by $h \ne 1$, then we can study $S' = S/h$, and $S'$ has integer jumps. We have $E[S] = hE[S']$. Thus assume $S$ jumps by integers.

Recursive Formula: Let frequency $N$ be in the $(a, b, 1)$ class, let $X$ denote the (discrete) severity and let $S$ denote the aggregate loss. Let $f_k = \Pr(X = k)$, $p_k = \Pr(N = k)$, and $g_k = \Pr(S = k)$. Then $g_0 = P_N(f_0)$ and

$g_k = \dfrac{1}{1 - a f_0}\left[(p_1 - (a + b)p_0)\, f_k + \sum_{j=1}^{k}\left(a + \dfrac{bj}{k}\right) f_j\, g_{k-j}\right].$

If $d$ is the aggregate deductible and $S$ is the aggregate losses, then the net stop-loss premium is defined as $E[(S - d)_+] = E[S] - E[S \wedge d]$.

To calculate the net stop-loss premium, we can first calculate $E[S \wedge d]$ using

$E[S \wedge d] = \sum_{k=0}^{\lceil d\rceil - 1} k\, g_k + d\,\Pr(S \ge d) = d - \sum_{k=0}^{\lceil d\rceil - 1} (d - k)\, g_k.$

The net stop-loss premium $E[(S - x)_+]$ is a linear function of $x$ between possible values of $S$: if $S$ assumes no values between $d$ and $u$ and $x \in (d, u)$, then

$E[(S - x)_+] = \dfrac{u - x}{u - d}\,E[(S - d)_+] + \dfrac{x - d}{u - d}\,E[(S - u)_+].$
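A sketch of the recursion in the $(a, b, 0)$ special case with Poisson frequency, where $p_1 - (a+b)p_0 = 0$ and $1 - af_0 = 1$, followed by a stop-loss premium computed from the resulting $g_k$; the frequency mean and severity pmf are assumed for illustration:

    import math

    # Panjer recursion, (a,b,0) Poisson case: a = 0, b = lam, so the
    # (p_1 - (a+b)p_0) term vanishes and the denominator 1 - a f_0 = 1.
    # Assumed inputs: lam = 2, severity uniform on {1, 2, 3}.
    lam = 2.0
    f = {1: 1/3, 2: 1/3, 3: 1/3}                 # severity pmf f_j = Pr(X = j)

    g = [math.exp(lam * (f.get(0, 0.0) - 1.0))]  # g_0 = P_N(f_0) = e^{lam(f_0 - 1)}
    for k in range(1, 21):
        g.append(sum((lam * j / k) * f.get(j, 0.0) * g[k - j]
                     for j in range(1, k + 1)))

    # Net stop-loss premium E[(S - d)_+] = E[S] - E[S ^ d] for integer d:
    d = 5
    ES = lam * sum(j * fj for j, fj in f.items())    # E[S] = E[N]E[X] = 2 * 2 = 4
    ESd = d - sum((d - k) * g[k] for k in range(d))  # E[S ^ d] = d - sum (d-k) g_k
    print(ES - ESd)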

0.10 Interval Estimators


Interval estimators are estimators whose values are intervals.

Bias: $\mathrm{bias}_{\hat\theta}(\theta) = E[\hat\theta\,|\,\theta] - \theta$.

Unbiased estimator: $\hat\theta$ is unbiased $\iff \mathrm{bias}_{\hat\theta}(\theta) = 0 \iff E[\hat\theta\,|\,\theta] = \theta$.

The sample mean and sample variance are unbiased estimators of population mean and variance.
Asymptotically unbiased estimator: $\hat\theta$ is asymptotically unbiased $\iff \lim_{n\to\infty} \mathrm{bias}_{\hat\theta_n}(\theta) = 0 \iff \lim_{n\to\infty} E[\hat\theta_n\,|\,\theta] = \theta$.

Consistent (weakly consistent): $\hat\theta$ is consistent (AKA weakly consistent) $\iff$ for all $\epsilon > 0$, $\lim_{n\to\infty} \Pr(|\hat\theta_n - \theta| < \epsilon) = 1$.

A sufficient condition for consistency: $\mathrm{bias}_{\hat\theta_n}(\theta) \to 0$ and $\mathrm{Var}[\hat\theta_n] \to 0$ $\Rightarrow$ $\hat\theta$ is consistent.

Mean square error: $MSE_{\hat\theta}(\theta) = E[(\hat\theta - \theta)^2\,|\,\theta] = \mathrm{Var}[\hat\theta] + \left(\mathrm{bias}_{\hat\theta}(\theta)\right)^2$.

Confidence Interval: the 100$p$% confidence interval of an estimator $\hat\theta$ of a parameter $\theta$ (using the normal approximation) is the set of solutions $\theta$ of $\dfrac{|\hat\theta - \theta|}{\sqrt{v(\theta)}} \le \Phi^{-1}\!\left(\dfrac{1+p}{2}\right)$, where $v(\theta)$ is the variance of $\hat\theta$.
Proof that the sample variance $S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar x)^2$, where $\bar x = \dfrac{1}{n}\sum_{j=1}^{n} X_j$, is an unbiased estimator of the population variance:

$E[(n-1)S^2] = E\left[\sum_{i=1}^{n}(X_i - \bar x)^2\right] = E\left[\sum_{i=1}^{n}(X_i^2 - 2\bar x X_i + \bar x^2)\right] = \sum_{i=1}^{n} E[X_i^2] - 2E\left[\bar x \sum_{i=1}^{n} X_i\right] + \sum_{i=1}^{n} E[\bar x^2]$

$= nE[X^2] - 2E[n\bar x^2] + nE[\bar x^2] = nE[X^2] - nE[\bar x^2] = nE[X^2] - n\left(\mathrm{Var}[\bar x] + E[\bar x]^2\right)$

$= nE[X^2] - n\left(\dfrac{1}{n}\mathrm{Var}[X] + E[X]^2\right) = nE[X^2] - \mathrm{Var}[X] - nE[X]^2 = (n-1)\mathrm{Var}[X].$
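A quick simulation check of this result (the normal distribution, sample size, and trial count are assumptions for the demo): dividing by $n - 1$ is right on average, dividing by $n$ is biased low by the factor $(n-1)/n$.

    import numpy as np

    # Check unbiasedness: average the sample variance over many samples.
    rng = np.random.default_rng(0)
    true_var, n, trials = 4.0, 10, 100_000
    samples = rng.normal(0.0, true_var ** 0.5, size=(trials, n))
    print(samples.var(axis=1, ddof=1).mean())   # ~4.0  (divide by n-1, unbiased)
    print(samples.var(axis=1, ddof=0).mean())   # ~3.6  (divide by n: (n-1)/n * 4)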

0.11 Variance of Empirical Estimators: Complete individual data
For a sample of size $n$ and any interval $I$, the variance of the empirical estimator of $\Pr(X \in I)$ is given by

$\widehat{\mathrm{Var}}\left(\widehat{\Pr}(X \in I)\right) = \dfrac{(\text{in prob})(\text{out prob})}{n} = \dfrac{(\#\text{ in})(\#\text{ out})}{n^3}.$

0.12 Variance of Empirical Estimators: Complete grouped data


Let $I$ be an interval with endpoints $a$ and $b$.

Case 1: $a$ and $b$ are endpoints of groups: we can find the variance just the way we do for complete individual data: $\widehat{\mathrm{Var}}\left(\widehat{\Pr}(X \in I)\right) = \dfrac{1}{n^3}(\#\text{ in } I)(\#\text{ out } I)$.

Case 2: $a$ is an endpoint of a group but $b$ is not: Let $J = (c_i, c_{i+1})$ be the group containing $b$. Let $K = (a, c_i)$ and $\lambda = \dfrac{b - c_i}{c_{i+1} - c_i}$. Then, using linear approximation, we have

$\widehat{\mathrm{Var}}[\widehat{\Pr}(X \in I)] = \mathrm{Var}[\Pr(X_n \in K) + \lambda\Pr(X_n \in J)]$
$= \mathrm{Var}[\Pr(X_n \in K)] + 2\lambda\,\mathrm{Cov}[\Pr(X_n \in K), \Pr(X_n \in J)] + \lambda^2\,\mathrm{Var}[\Pr(X_n \in J)]$
$= \dfrac{1}{n^3}\left[(\#\text{ in } K)(\#\text{ out } K) - 2\lambda(\#\text{ in } J)(\#\text{ in } K) + \lambda^2(\#\text{ in } J)(\#\text{ out } J)\right]$

Case 3: $b$ is an endpoint of a group but $a$ is not: similar to Case 2.

Case 4: Neither $a$ nor $b$ is an endpoint of a group: complicated to write, but the same idea.

Case 5: $a = b$: Let $I = (c_i, c_{i+1})$ be the group containing $a$. Then

$\widehat{\mathrm{Var}}[\widehat{\Pr}(X = a)] = \widehat{\mathrm{Var}}\left[\dfrac{\widehat{\Pr}(X_n \in I)}{|I|}\right] = \dfrac{1}{|I|^2}\,\widehat{\mathrm{Var}}\left[\widehat{\Pr}(X_n \in I)\right] = \dfrac{(\#\text{ in } I)(\#\text{ out } I)}{|I|^2\, n^3}$

0.13 Kaplan-Meier Product Limit Estimator


Assumption: New entries tied to an event time do not count but withdrawals and deaths tied to an event time do count.
Survival function: $S_n(t) = \prod_{i=1}^{\min\{j\,:\,t < y_{j+1}\}} \left(1 - \dfrac{s_i}{r_i}\right)$, where the $y_i$ are the event times, $s_i$ the deaths at $y_i$, and $r_i$ the risk set at $y_i$.

Variance (Greenwood's Formula): $\widehat{\mathrm{Var}}(S_n(t)) = S_n(t)^2 \sum_{i=1}^{\min\{j\,:\,t < y_{j+1}\}} \dfrac{s_i}{r_i (r_i - s_i)}$.

Log-transformed confidence interval: $\left(S_n(t)^{1/U},\, S_n(t)^{U}\right)$, where $U = \exp\left(\dfrac{z_{(1+p)/2}\,\sqrt{\widehat{\mathrm{Var}}(S_n(t))}}{S_n(t)\,\ln S_n(t)}\right)$.
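A minimal sketch of the product-limit estimate and Greenwood's formula from (event time $y_i$, deaths $s_i$, risk set $r_i$) triples; the data are made up for illustration:

    # Kaplan-Meier product-limit estimate with Greenwood's variance.
    # Illustrative data, sorted by event time: (y_i, deaths s_i, risk set r_i).
    data = [(1.0, 1, 10), (3.0, 2, 8), (5.0, 1, 5)]

    def km(t, data):
        s_hat, green = 1.0, 0.0
        for y, s, r in data:       # data assumed sorted by y
            if y > t:
                break
            s_hat *= 1.0 - s / r
            green += s / (r * (r - s))      # Greenwood sum
        return s_hat, s_hat**2 * green      # S_n(t), Var(S_n(t))

    S, V = km(4.0, data)
    print(S, V)   # S_n(4) = (1 - 1/10)(1 - 2/8) = 0.675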

0.14 Nelson-Aalen Estimator


Assumption: New entries tied to an event time do not count but withdrawals and deaths tied to an event time do count.
Cumulative hazard function: $H_n(t) = \sum_{i=1}^{\min\{j\,:\,t < y_{j+1}\}} \dfrac{s_i}{r_i}$.

Variance of the chf: $\widehat{\mathrm{Var}}(H_n(t)) = \sum_{i=1}^{\min\{j\,:\,t < y_{j+1}\}} \dfrac{s_i}{r_i^2}$.

Log-transformed confidence interval: $\left(H_n(t)/U,\, H_n(t)\,U\right)$, where $U = \exp\left(\dfrac{z_{(1+p)/2}\,\sqrt{\widehat{\mathrm{Var}}(H_n(t))}}{H_n(t)}\right)$.

0.15 Kernel-Smoothed Distributions: Uniform Kernel
If $Y$ is the empirical distribution and $X$ is the kernel-smoothed distribution using the uniform kernel with bandwidth $b$, then

$\mathrm{Var}[X] = \mathrm{Var}[Y] + \dfrac{b^2}{3}.$

0.16 Kernel-Smoothed Distributions: Triangular Kernel


If $Y$ is the empirical distribution and $X$ is the kernel-smoothed distribution using the triangular kernel with bandwidth $b$, then

$\mathrm{Var}[X] = \mathrm{Var}[Y] + \dfrac{b^2}{6}.$

0.17 Approximations for Large Data Sets using Kaplan-Meier Estimator


Notation:
$P_j$ = population at the start of the $j$th interval.
$d_j$ = new entrants in the $j$th interval.
$u_j$ = withdrawals in the $j$th interval.
$x_j$ = deaths in the $j$th interval.
$r_j$ = risk set in the $j$th interval.
$q'_j$ = decrement rate in the $j$th interval $= \dfrac{x_j}{r_j}$.

All entries and withdrawals at interval endpoints: $P_{j+1} = P_j + d_j - u_j - x_j$, $\quad r_j = P_j + d_j - u_j$.

All entries and withdrawals uniformly distributed: $P_{j+1} = P_j + d_j - u_j - x_j$, $\quad r_j = P_j + 0.5(d_j - u_j)$.

0.18 Fitting a Distribution: Method of Moments


Pareto: We can fit a Pareto using the first and second moments only if $E[X^2] > 2E[X]^2$. (For a Pareto with $\alpha > 2$, $E[X^2]/E[X]^2 = \frac{2(\alpha-1)}{\alpha-2} > 2$, so the sample moments must satisfy the same inequality.)

0.19 Fitting a Distribution: Percentile Matching


Smoothed empirical percentile: The 100$p$th smoothed empirical percentile of $n$ observations is $x_{(n+1)p}$, where $x_k$ is the $k$th order statistic. If $(n+1)p$ is not an integer, then we interpolate linearly between the two adjacent order statistics.

0.20 Fitting a Distribution: Maximum Likelihood


Idea: Maximize the probability (likelihood) of the observed data to find the parameter(s).

Exponential: $\hat\theta_{MLE} = \dfrac{\text{exact exposure}}{\text{number of uncensored observations}}$.

Uniform on $[0, \theta]$: $\hat\theta_{MLE} = \max\{x_i\}$.
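A sketch of the exponential shortcut with right-censored data, ignoring truncation; the observations and censoring flags are made up:

    # Exponential MLE with right censoring:
    # theta_hat = total exposure / number of uncensored observations.
    # Illustrative data: (observed value, censored flag).
    obs = [(3.0, False), (5.0, False), (7.0, True), (2.0, False), (10.0, True)]

    exposure = sum(x for x, _ in obs)                   # 27.0: censored count fully
    uncensored = sum(1 for _, cens in obs if not cens)  # 3 uncensored observations
    theta_hat = exposure / uncensored
    print(theta_hat)   # 9.0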

0.21 Variance of MLE


Information matrix for one variable when the density function is given: Suppose we have $n$ random observations from a distribution with density function $f(x) = f(x|\theta)$. Then the information is

$I(\theta) = -nE_X\left[\dfrac{\partial^2}{\partial\theta^2}\ln f(x|\theta)\right] = -n\int \left(\dfrac{\partial^2}{\partial\theta^2}\ln f(x|\theta)\right) f(x|\theta)\,dx$

and the asymptotic variance of the maximum likelihood estimator of $\theta$ is $I(\theta)^{-1}$.

Information matrix for two variables: If $l(\theta_1, \theta_2)$ is the log of the likelihood of the observations, then the information matrix $I(\theta_1, \theta_2)$ is defined as

$I(\theta_1, \theta_2) = \left[-E_X\left[\dfrac{\partial^2 l}{\partial\theta_i\,\partial\theta_j}\right]\right]_{i,j} = \left[E_X\left[\dfrac{\partial l}{\partial\theta_i}\,\dfrac{\partial l}{\partial\theta_j}\right]\right]_{i,j}$

Asymptotic covariance matrix of the MLE for two variables: the inverse of the information matrix.

Covariance matrix = (Information matrix)$^{-1}$.

Delta Method for the variance of a function: If $\theta_1$ and $\theta_2$ are estimated using MLE and $g(\theta_1, \theta_2)$ is some function of $\theta_1$ and $\theta_2$, then an estimate of $\mathrm{Var}[g(\theta_1, \theta_2)]$ is

$\mathrm{Var}[g(\theta_1, \theta_2)] \approx [g_1,\, g_2]\,[\text{Covariance Matrix}]\begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = g_1^2\,\mathrm{Var}[\theta_1] + 2g_1 g_2\,\mathrm{Cov}[\theta_1, \theta_2] + g_2^2\,\mathrm{Var}[\theta_2]$

where $g_i = \partial g/\partial\theta_i$ and all the values are evaluated at the estimated values of $\theta_1$ and $\theta_2$.

Confidence interval of the MLE: The confidence interval for a function $g(\theta_1, \theta_2)$, where $\theta_1$ and $\theta_2$ are estimated using MLE, is $g(\hat\theta_1, \hat\theta_2) \pm z_p \sqrt{\mathrm{Var}[g(\theta_1, \theta_2)]}$. The values are evaluated at the estimates.

True information vs. observed information: the true information takes the expectation over $X$; the observed information drops the expectation and uses the negative second derivative of the log-likelihood evaluated at the observed data and the estimate.
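A numeric sketch of the delta method; the estimates, covariance matrix, and function $g$ are all assumed for illustration:

    import numpy as np

    # Delta method: Var[g] ~ grad(g)^T Cov grad(g), evaluated at the estimates.
    # Assumed MLE output: theta = (theta1, theta2) with the covariance below.
    theta = np.array([2.0, 0.5])
    cov = np.array([[0.04, 0.01],
                    [0.01, 0.09]])

    g = lambda t: t[0] * t[1]                # example function g = theta1 * theta2
    grad = np.array([theta[1], theta[0]])    # gradient of g at theta: (theta2, theta1)
    var_g = grad @ cov @ grad
    print(var_g)   # 0.5^2*0.04 + 2*0.5*2*0.01 + 2^2*0.09 = 0.39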

0.22 Graphic comparison of Fitted distribution to Empirical


$D(x)$ plot: the graph of $D(x) = F_n(x) - F(x)$, where $F_n$ is the empirical distribution and $F$ is the fitted distribution.

$p$-$p$ plot: the graph of the points $\left(\dfrac{j}{n+1},\, F(x_j)\right)$, $j = 1, 2, \ldots, n$.

0.23 Kolmogorov-Smirnov Statistic


Kolmogorov-Smirnov statistic: $\max_x |F_n(x) - F(x)|$, where $F_n$ is the empirical distribution and $F$ is the fitted distribution.

Hypothesis test: reject the fit when the statistic exceeds the critical value; do not reject when it falls below.

0.24 Anderson-Darling Statistic


Anderson-Darling Statistic:

$A^2 = n\int_t^u \dfrac{(F_n(x) - F^*(x))^2}{F^*(x)\,S^*(x)}\, f^*(x)\,dx$

$= -nF^*(u) + n\sum_{j=0}^{k} S_n(y_j)^2\left[\ln S^*(y_j) - \ln S^*(y_{j+1})\right] + n\sum_{j=1}^{k} F_n(y_j)^2\left[\ln F^*(y_{j+1}) - \ln F^*(y_j)\right],$

where $y_1 < \cdots < y_k$ are the distinct observations, $y_0 = t$, and $y_{k+1} = u$.

0.25 Chi-square Goodness of Fit Test
Chi-square Statistic:

$Q = \sum_{j=1}^{k} \dfrac{(O_j - E_j)^2}{E_j} = \sum_{j=1}^{k} \dfrac{O_j^2}{E_j} - n = \sum_{j=1}^{k} \dfrac{(O_j - E_j)^2}{V_j}$

where $O_j$ = number of observed observations in group $j$, $E_j$ = number of expected observations in group $j$, $k$ = number of groups, $n$ = number of observations, and $V_j$ = expected variance in group $j$ (with $V_j = E_j$ in the usual case).
Read the question carefully to find the degrees of freedom.
If the data is not given in grouped form, group it so that each group has at least 5 expected observations.
Degrees of freedom = #(groups) $-$ #(estimated parameters) $-$ #(restrictions).
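A small sketch of the statistic and its $p$-value; the counts and fitted expectations are made up, and one estimated parameter is assumed:

    from scipy.stats import chi2

    # Illustrative chi-square goodness-of-fit computation.
    observed = [30, 45, 15, 10]            # O_j, n = 100
    expected = [28.0, 42.0, 20.0, 10.0]    # E_j from the fitted model (sum to n)

    Q = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # df = #(groups) - #(estimated parameters) - #(restrictions); assume one
    # parameter was estimated plus the totals-match restriction:
    df = len(observed) - 1 - 1
    print(Q, 1 - chi2.cdf(Q, df))          # reject if Q exceeds the critical value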

0.26 Likelihood Ratio Algorithm for Choosing a Model


A model with $r + s$ degrees of freedom is chosen over a (nested) model with $r$ degrees of freedom only if

$2\left(\text{loglikelihood}(r{+}s\text{ model}) - \text{loglikelihood}(r\text{ model})\right)$ exceeds the chi-square critical value with $s$ degrees of freedom.

0.27 Schwarz-Bayesian Algorithm for Choosing a Model


If the sample has $n$ observations, then choose the model which maximizes

$\text{loglikelihood} - (\text{degrees of freedom})\,\dfrac{\ln n}{2}.$

0.28 Limited Fluctuation Credibility


Exposure needed so that average aggregate claims/losses are within 100$r$% of expected aggregate claims/losses 100$p$% of the time is $e_F = \left(\dfrac{y_p}{r}\right)^2 CV^2 = n_0\,CV^2$, where $y_p = z_{(1+p)/2}$, $n_0 = (y_p/r)^2$, and $CV$ is the coefficient of variation of the aggregate claims.

Exposure needed (claims needed) so that average claim size is within 100$r$% of expected claim size 100$p$% of the time is $e_F = n_0\,CV^2$, where $CV$ is the coefficient of variation of the severity distribution.

Exposure needed so that average number of claims is within 100$r$% of expected number of claims 100$p$% of the time is $e_F = n_0\,CV^2$, where $CV$ is the coefficient of variation of the frequency distribution.

Number of claims = (exposure)(mean of frequency).


If frequency is Poisson with mean $\lambda$ and severity $S$ has coefficient of variation $CV_S$, then the coefficient of variation of the aggregate distribution $Y$ satisfies

$CV_Y^2 = \dfrac{1}{\lambda}\left(1 + CV_S^2\right) = \dfrac{1}{\lambda}\,\dfrac{E[S^2]}{E[S]^2}.$

0.29 Limited Fluctuation Credibility: Partial Credibility


The partial credibility is given by $P_C = M + Z(\bar x - M) = M + \sqrt{\dfrac{n}{n_F}}\,(\bar x - M)$, where $n$ is the number of new observations, $n_F$ is the number of expected claims needed for full credibility, and $M$ is the prior (manual) estimate.
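A small sketch combining 0.28 and 0.29 for Poisson claim counts; $p$, $r$, and the experience are assumed:

    from scipy.stats import norm

    # Full credibility standard for Poisson claim counts, then partial credibility.
    p, r = 0.90, 0.05
    y = norm.ppf((1 + p) / 2)            # z_{(1+p)/2} = 1.645
    n0 = (y / r) ** 2                    # ~1082 expected claims for full credibility

    n, xbar, M = 400, 0.25, 0.20         # observed claims, experience mean, manual
    Z = min(1.0, (n / n0) ** 0.5)        # square-root rule, capped at 1
    print(Z, M + Z * (xbar - M))         # credibility-weighted estimate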

0.30 Bayesian Estimation and Credibility: Continuous Prior
$X$ has a parameter $\theta$, and $\theta$ has density $\pi(\theta)$, called the prior. After making $n$ observations $x_1, x_2, \ldots, x_n$ of $X$, the distribution of $\theta$ changes (called the posterior) and has density
$\text{posterior}(\theta) = c\,\pi(\theta)\,(\text{likelihood of } x_1, x_2, \ldots, x_n \text{ given } \theta)$
where the constant $c$ is chosen so that the total integral is 1.
The expectation of the next observation is called the Bayesian premium and is found using $E[X_{n+1}] = E_\Theta\left[E_X[X|\Theta]\right]$, where the outer expectation is taken over the posterior.

0.31 Bayesian Credibility: Poisson/Gamma


Suppose $X$ is Poisson with parameter $\lambda$ and $\lambda$ is gamma with parameters $\alpha$ and $\theta$; write $\gamma = 1/\theta$. Let $n$ denote the number of exposures and $m$ the number of claims. Then the posterior distribution of $\lambda$ is also gamma, with parameters $\gamma^* = \gamma + n$ and $\alpha^* = \alpha + m$. (Alpha increases with Claims, gamma increases with exposure: "AC".)
The mean of the gamma distribution is $\alpha\theta$ and the variance is $\alpha\theta^2$.
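A sketch of the conjugate update and the resulting Bayesian premium; the prior parameters and experience are assumed:

    # Poisson/Gamma conjugate update: gamma(alpha, theta) prior on lambda.
    alpha, theta = 3.0, 0.5          # assumed prior: mean alpha*theta = 1.5
    n, m = 4, 9                      # 4 exposures observed, 9 claims in total

    gamma_rate = 1.0 / theta         # gamma = 1/theta
    alpha_post = alpha + m           # alpha increases with claims
    gamma_post = gamma_rate + n      # gamma increases with exposure
    theta_post = 1.0 / gamma_post

    bayes_premium = alpha_post * theta_post   # posterior mean of lambda
    print(bayes_premium)             # (3 + 9) / (2 + 4) = 2.0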

0.32 Bayesian Credibility: Normal/Normal


Suppose $X$ is normal with mean $\theta$ and fixed variance $v$, and $\theta$ is normal with mean $\mu$ and variance $a$. Let $x_1, x_2, \ldots, x_n$ be the experience. Then the posterior is proportional to

$\exp\left(-\dfrac{(\theta - \mu)^2}{2a}\right)\prod_{i=1}^{n}\exp\left(-\dfrac{(x_i - \theta)^2}{2v}\right) = (\text{constant})\,\exp\left[-\dfrac{\theta^2}{2}\left(\dfrac{1}{a} + \dfrac{n}{v}\right) + \theta\left(\dfrac{\mu}{a} + \dfrac{n\bar x}{v}\right)\right].$

Thus the posterior is normal with mean $\dfrac{\frac{\mu}{a} + \frac{n\bar x}{v}}{\frac{1}{a} + \frac{n}{v}} = \dfrac{v\mu + na\bar x}{v + na}$ and variance $\dfrac{1}{\frac{1}{a} + \frac{n}{v}} = \dfrac{va}{v + na}$.

0.33 Bayesian Credibility: Exponential/Inverse Gamma


Suppose the model $X$ is exponential with parameter $\theta$ and $\theta$ is inverse gamma with parameters $\alpha$ and $\beta$. Let $x_1, x_2, \ldots, x_n$ be an experience of $X$. Then the posterior is proportional to

$\left(\prod_{i=1}^{n} \theta^{-1} e^{-x_i/\theta}\right)\theta^{-(\alpha+1)}\, e^{-\beta/\theta} = \theta^{-(\alpha+n+1)}\, e^{-(\beta + n\bar x)/\theta}.$

Thus the posterior is inverse gamma with parameters $\alpha^* = \alpha + n$ and $\beta^* = \beta + n\bar x$.

0.34 Bayesian Credibility: Bernoulli/Beta


Suppose the model $X$ is Bernoulli with $q$ the probability of a success, and $q$ is beta with parameters $a$, $b$. If the experience is $k$ successes in $n$ trials, then the posterior is proportional to

$q^k (1 - q)^{n-k}\cdot q^{a-1}(1 - q)^{b-1} = q^{a+k-1}(1 - q)^{b+n-k-1}.$

Thus the posterior is beta with parameters $a^* = a + k$ and $b^* = b + n - k$.

0.35 Bühlmann Credibility

If the model $X$ has parameter $\Theta$, then the Bühlmann $k$ is defined as $k = \dfrac{E[\mathrm{Var}[X|\Theta]]}{\mathrm{Var}[E[X|\Theta]]}$, and the Bühlmann credibility factor is $Z = n/(n + k)$.
The Bühlmann credibility estimate is $\mu + Z(\bar x - \mu)$, where $\mu = E[E[X|\Theta]]$ is the overall mean.
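A tiny worked example with the conditional moments assumed: for a Poisson model whose mean is gamma($\alpha$, $\theta$), $v = E[\lambda] = \alpha\theta$ and $a = \mathrm{Var}[\lambda] = \alpha\theta^2$. With the same prior and experience as the 0.31 sketch, the estimate reproduces the exact Bayesian premium, as it must for Poisson/gamma:

    # Buhlmann credibility for a Poisson model with gamma(alpha, theta) parameter.
    alpha, theta = 3.0, 0.5
    mu = alpha * theta            # overall mean, 1.5
    v = alpha * theta             # expected process variance E[Var[X|lambda]]
    a = alpha * theta ** 2        # variance of hypothetical means Var[E[X|lambda]]
    k = v / a                     # = 1/theta = 2.0

    n, xbar = 4, 2.25             # assumed experience: 4 periods, mean 9/4
    Z = n / (n + k)
    print(Z, mu + Z * (xbar - mu))   # Z = 2/3, estimate = 2.0 (matches 0.31)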

0.36 Bühlmann-Straub Credibility
This is the same as Bühlmann credibility, except that the exposure per period is not 1.

0.37 Empirical Bayes Non-parametric Methods (Uniform Exposures)



$\hat\mu$ = mean of the sample means,
$\hat v$ = average of the sample variances,
$\hat a$ = (estimated variance of the sample means) $-\ \hat v/n$,
$\hat k = \dfrac{\hat v}{\hat a}$, $\qquad Z = \dfrac{n}{n + \hat k}$.

0.38 Empirical Bayes Non-parametric Methods (Non-uniform Exposures)


$r$ = number of groups
$n_i$ = number of years group $i$ is observed
$m_{ij}$ = number of policyholders in group $i$ in year $j$
$m_i = \sum_j m_{ij}$ = number of exposure-years in group $i$
$m = \sum_i m_i$ = total number of exposure-years
$X_{ij}$ = average per policyholder in group $i$ for year $j$
$\bar X_i$ = average per policyholder for group $i$ over all years
$\bar X$ = average per policyholder over all groups

$\hat\mu = \bar X = \dfrac{\sum_{i,j} m_{ij} X_{ij}}{m}$

$\hat v = \dfrac{\sum_{i,j} m_{ij}\,(X_{ij} - \bar X_i)^2}{\sum_i n_i - r}$

$\hat a = \dfrac{\sum_i m_i\,(\bar X_i - \bar X)^2 - \hat v(r - 1)}{m - \frac{1}{m}\sum_i m_i^2}.$
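A sketch of these estimators on a made-up two-group, two-year dataset, with Bühlmann-Straub credibility $Z_i = m_i/(m_i + \hat k)$ applied at the end:

    import numpy as np

    # Empirical Bayes nonparametric estimators, non-uniform exposures.
    # Made-up data: m[i][j] = exposures, x[i][j] = average claim per policyholder.
    m = np.array([[50.0, 60.0],
                  [80.0, 70.0]])
    x = np.array([[1.2, 1.4],
                  [0.9, 1.1]])

    r, n_i = m.shape[0], m.shape[1] * np.ones(m.shape[0])
    m_i = m.sum(axis=1)
    m_tot = m_i.sum()
    xbar_i = (m * x).sum(axis=1) / m_i          # group means
    xbar = (m * x).sum() / m_tot                # overall mean = mu_hat

    v_hat = (m * (x - xbar_i[:, None]) ** 2).sum() / (n_i.sum() - r)
    a_hat = (((m_i * (xbar_i - xbar) ** 2).sum() - v_hat * (r - 1))
             / (m_tot - (m_i ** 2).sum() / m_tot))

    k_hat = v_hat / a_hat
    Z_i = m_i / (m_i + k_hat)                   # credibility per group
    print(xbar, v_hat, a_hat)
    print(xbar + Z_i * (xbar_i - xbar))         # credibility estimates per group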

0.39 Empirical Bayes Semi-Parametric: Poisson


If the model is Poisson with only one period of observations ($n = 1$), then the Bühlmann credibility is calculated as follows: $r$ = number of observations,

$\hat\mu = \hat v = \bar x, \qquad \hat a = \dfrac{r}{r-1}\,\hat\sigma^2 - \hat v, \qquad Z = \dfrac{\hat a}{\hat v + \hat a},$

where $\hat\sigma^2 = \frac{1}{r}\sum_i (x_i - \bar x)^2$ is the empirical variance.

0.40 Empirical Bayes Semi-Parametric: Negative Binomial


If the model is negative binomial with fixed $\beta$, then $\hat\mu = \bar x$, $\quad \hat v = (1 + \beta)\bar x$, $\quad \hat a = \dfrac{r}{r-1}\,\hat\sigma^2 - \hat v$, $\quad Z = \dfrac{\hat a}{\hat v + \hat a}$.

0.41 Empirical Bayes Semi-Parametric: Gamma


If the model is gamma with fixed $\theta$, then $\hat\mu = \bar x$, $\quad \hat v = \theta\bar x$, $\quad \hat a = \dfrac{r}{r-1}\,\hat\sigma^2 - \hat v$, $\quad Z = \dfrac{\hat a}{\hat v + \hat a}$.
