via Simulation
Abstract
We show how, from a single simulation run, to estimate the ruin probabilities
and their sensitivities (derivatives) in a classic insurance risk model under various
distributions of the number of claims and the claim size. A similar analysis is given for
the tail probabilities of the accumulated claims during a fixed period. We perform
sensitivity analysis with respect to both distributional and structural parameters of
the underlying risk model. In the former case we use the score function method, and in
the latter a combination of the push-out method and the score function method. We finally
show how, from the same sample path, to derive a consistent estimator of the optimal
solution in an optimization problem associated with excess-of-loss reinsurance.
* Department of Mathematical Statistics, University of Lund, Box 118, 221 00 Lund, Sweden. E-mail: asmus@maths.lth.se
† William Davidson Faculty of Industrial Engineering and Management, Technion, Haifa, Israel
1 Introduction
This paper deals with sensitivity analysis and stochastic optimization of performance measures associated with insurance risk models. We assume that the claims arrive according to a Poisson process $\{N(t) : t \ge 0\}$ with rate $\lambda > 0$ and that the claim sizes are i.i.d. non-negative random variables $U_i$, $i = 1, 2, \ldots$ with cumulative distribution function $B(\cdot)$ and mean $\mu_B$. Assume further that the sequence $\{U_i\}$ and the Poisson process $\{N(t)\}$ are independent, and consider the following compound Poisson risk process with state-dependent premium,
$$R(t) = u - A(t) + \int_0^t p(R(s))\,ds, \qquad (1.1)$$
where
$$A(t) = \sum_{i=1}^{N(t)} U_i. \qquad (1.2)$$
Here $u$, $N(t)$, $A(t)$ and $p(x)$ are called the initial reserve, the number of claims in $(0, t)$, the accumulated (total) claim and the premium rate at level $x$, respectively.
Standard performance measures are the ruin probability
$$\psi = \mathbb{P}\Bigl(\inf_{t \ge 0} R(t) < 0\Bigr), \qquad (1.3)$$
the expected utility
$$v = \mathbb{E}\{V(R(t))\} \qquad (1.4)$$
after a time period of $t$ ($V$ being some concave utility function), and the tail probability
$$\ell = \mathbb{P}(A(t) > x) \qquad (1.5)$$
of the accumulated claims.
Sensitivity analysis is concerned with evaluating derivatives (gradients, Hessians, etc.) of performance measures with respect to parameters of interest. It provides insight and guidance for the decision maker and plays a pivotal role in identifying the most significant system parameters. For example, the sensitivities w.r.t. (with respect to) the Poisson intensity $\lambda$ of the performance measures $\psi$, $v$ and $\ell$ are defined as the partial derivatives
$$\psi_\lambda = \frac{\partial\psi}{\partial\lambda}, \qquad v_\lambda = \frac{\partial v}{\partial\lambda}, \qquad \ell_\lambda = \frac{\partial\ell}{\partial\lambda}, \qquad (1.6)$$
respectively. In addition to Poisson rates, one might be interested in sensitivities w.r.t.
1. Parameters of the claim size distribution, like Gamma, Pareto or inverse Gaussian.
2. Parameters of claim arrival point processes more complicated than the Poisson process, like Markov-modulated Poisson processes (Asmussen, 1989) or randomly fluctuating portfolios (see Section 5 below).
3. Parameters of the premium rule $p(x)$, for example the vector $(p_1, p_2, v)$ in the two-step rule
$$p(x) = \begin{cases} p_1, & x < v \\ p_2, & x \ge v \end{cases} \qquad (1.7)$$
(say the company increases the premium if the reserve becomes dangerously low), or $p$, $\delta$ for the case $p(x) = p + \delta x$, where $p$ is the net premium and $\delta$ the interest rate.
4. The parameter $a$, called the retention limit in excess-of-loss reinsurance (the claim carried by the insurer is $\min(a, U)$ rather than $U$; see (3.15) below).
Sensitivities are also of obvious relevance if the parameter is only partially known, say estimated from data; see Heidelberger & Towsley (1989) and Rubinstein & Shapiro (1993), pp. 96–100.
While in areas like inventories, queues, teletraffic systems and computer networks the topic has recently received considerable attention, e.g. Devetsikiotis & Townsend (1993), Glasserman & Kou (1995), Heidelberger (1995), Heidelberger & Towsley (1989), Kovalenko (1995), Kriman & Rubinstein (1997), Pflug & Rubinstein (1996), Rubinstein (1992), Rubinstein & Shapiro (1993), and Shapiro (1996), examples of sensitivity analysis and optimization in actuarial mathematics are few. Exceptions are Asmussen (1999), who computes Cramér–Lundberg type approximations for the sensitivities of ruin probabilities, and Van Wouve, De Vylder & Goovaerts (1983), who study some aspects connected with reinsurance.
Due to the complexity of the systems of interest in these areas, analytical results are usually
not feasible and one must resort to Monte Carlo (MC) simulation. Simulation is also the
vehicle of this paper, and we proceed to explain some basic elements of our approach.
The crude Monte Carlo (CMC) estimator of a performance $\ell(x) = \mathbb{E}X$ is the sample mean $\bar X = n^{-1}\sum_{i=1}^n X_i$, where $X_1, \ldots, X_n$ are i.i.d. replicates of $X$, and similarly for the estimator of the sensitivity $\ell_\theta(x)$ w.r.t. some parameter $\theta$. Usually $\bar X$ is accompanied by a confidence interval based on the sample variance of the $X_i$.
The approach we adopt here for estimating sensitivities is the score function (SF) method (e.g. Rubinstein & Shapiro (1993)); namely, it involves expressions using the score function and assumes interchangeability of the expectation and differentiation operators, which can be readily checked (in each case below) by using Lebesgue's dominated convergence theorem, cf. Rubinstein & Shapiro (1993).
We illustrate this by a simple example, the sensitivity of the expected utility $v$ w.r.t. $\lambda$. Assume for simplicity that $t = 1$ and that a premium of $p$ is charged in $[0, 1]$, so that
$$v = \sum_{n=0}^{\infty} e^{-\lambda}\frac{\lambda^n}{n!}\int_0^\infty V(u + p - x)\,B^{*n}(dx),$$
where $u = R(0)$ and $B^{*n}$ denotes the $n$th convolution power. Applying dominated convergence and using straightforward differentiation we obtain
$$v_\lambda = \frac{dv}{d\lambda} = \sum_{n=0}^{\infty} e^{-\lambda}\left[\frac{n\lambda^{n-1}}{n!} - \frac{\lambda^n}{n!}\right]\int_0^\infty V(u + p - x)\,B^{*n}(dx) = \mathbb{E}\left[\left(\frac{N}{\lambda} - 1\right)V(R(1))\right].$$
Thus, the crude Monte Carlo (CMC) estimator of $v_\lambda$ is $X = S\,V(R(1))$, where $S = N/\lambda - 1$ is the SF.
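For concreteness, the derivation above can be sketched in code. This is a minimal sketch: the exponential claim distribution, the utility $V(r) = 1 - e^{-r}$ and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sf_estimate(lam, u, p, beta, n_runs=100_000):
    """Estimate v = E[V(R(1))] and the sensitivity v_lam = E[(N/lam - 1) V(R(1))]
    from the same sample, using the score S = N/lam - 1 of the Poisson(lam) count."""
    N = rng.poisson(lam, size=n_runs)                                # claims in [0, 1]
    A = np.array([rng.exponential(1.0 / beta, n).sum() for n in N])  # accumulated claims
    V = 1.0 - np.exp(-(u + p - A))                                   # utility of terminal reserve
    S = N / lam - 1.0                                                # score function
    return V.mean(), (S * V).mean()

v_hat, v_lam_hat = sf_estimate(lam=1.0, u=5.0, p=1.5, beta=2.0)
```

Both the performance and its derivative come from the same simulation run, which is the point of the SF method.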
In typical applications, the parameter of interest is a vector rather than a scalar ($\lambda$) as here, and the sensitivity is therefore a gradient vector (examples are given later in the paper). The method also applies to higher-order derivatives, but we do not discuss this here.
There is, however, one crucial assumption for the SF method: likelihood ratios must exist, which requires absolute continuity (roughly, the supports must be the same). Some of the examples above do not exhibit this behavior. For example, in excess-of-loss reinsurance, involving random variables of the type $Y = \min(a, U)$, the distribution is not absolutely continuous at $y = a$ (similar problems occur in the setting (1.7)).
To overcome such difficulties, a method called push-out (see Rubinstein, 1992) has been developed. In addition to push-out, we shall use a conditional Monte Carlo technique similar to that suggested by Pflug & Rubinstein (1996). Both methods will be introduced via worked-out examples in the paper.
The rest of the paper is organized as follows. In Section 3, we apply the SF method for estimating the tail probability $\ell(x)$ in (1.5) and the associated sensitivities (w.r.t. a variety of parameters such as $\lambda$, the parameters of the claim size distribution, $x$, and the retention limit $a$ in excess-of-loss reinsurance). Section 4 deals with sensitivity analysis of the ruin probabilities in (1.3). Section 5 treats the case where the portfolio size fluctuates at random; namely, we assume that the size of the portfolio at time $t$ is given by the state $M(t)$ of a birth–death process. Section 6 shows how to estimate the parameter vector in the IS density in an optimal way. Finally, Section 7 deals with an optimization problem involving a constant premium rule in an excess-of-loss reinsurance model.
2 Rare events and asymptotically efficient simulation

For two functions $k(x), \ell(x) \ge 0$ tending to 0 as $x \to \infty$, we will write $k \asymp_{\log} \ell$ if $\log k(x)/\log \ell(x) \to 1$. An estimator $X(x)$ of $\ell(x)$ is called logarithmically efficient if
$$\mathbb{E}X(x)^2 \asymp_{\log} \ell(x)^2, \qquad (2.3)$$
and in practice (2.3) is always met in this form. We refer to Heidelberger (1995) or Asmussen & Rubinstein (1995) for surveys of these and other aspects of rare events simulation.
To derive fast estimators such as in (2.1)–(2.3), the most standard tools are IS and the ECM. We illustrate this via a trivial example. Consider a single r.v. $C$ with distribution $F$. Assume that we want to estimate
$$\ell(x) = \mathbb{P}(C > x) = \mathbb{E}\{I(C > x)\}. \qquad (2.4)$$
A CMC estimator of $\ell(x)$ is $X(x) = I(C > x)$. According to IS, we simulate from a different distribution, say $G$, making the rare event $\{C > x\}$ more likely, and represent the performance $\ell(x)$ and its estimator as
$$\ell(x) = \mathbb{E}_G\{I(C > x)W\} \quad \text{and} \quad X(x) = I(C > x)W, \qquad (2.5)$$
respectively. Here $W = \frac{dF}{dG}(C)$ is the likelihood ratio (LR). For example, assume that $C$ has been sampled from the exponentially tilted distribution $F_\theta(dx) = e^{\theta x}F(dx)/\hat F[\theta]$, so that
$$W = W(\theta) = \frac{dF}{dF_\theta}(C) = e^{-\theta C}\hat F[\theta],$$
where $\hat F[\theta] = \mathbb{E}\,e^{\theta C}$ is the moment generating function (MGF), and $F_\theta$ and $\theta$ are called the exponential change of measure (ECM) and the reference parameter, respectively. When emphasis is on the dependence of the LR on the parameter $\theta$, we write
$$W(\theta_0 \mid \theta) = \frac{dF_{\theta_0}}{dF_\theta}.$$
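A minimal sketch of (2.4)–(2.5) in code, assuming for illustration that $C$ is exponential with rate 1, for which $F_\theta$ is again exponential, with rate $1 - \theta$, and $\hat F[\theta] = 1/(1-\theta)$:

```python
import numpy as np

rng = np.random.default_rng(1)

def is_tail_estimate(x, theta, n_runs=100_000):
    """IS estimator of l(x) = P(C > x) for C ~ Exp(1), sampling from the
    ECM F_theta = Exp(1 - theta) and correcting by W = exp(-theta*C)/(1 - theta)."""
    C = rng.exponential(1.0 / (1.0 - theta), size=n_runs)  # sample from F_theta
    W = np.exp(-theta * C) / (1.0 - theta)                 # likelihood ratio dF/dF_theta
    return np.mean((C > x) * W)

x = 20.0
est = is_tail_estimate(x, theta=1.0 - 1.0 / x)  # tilt so that E_theta[C] = x
```

The exact value is $e^{-x} \approx 2.1\cdot 10^{-9}$ here; CMC with the same sample size would almost never see the event, while the tilted estimator has small relative error.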
Now consider the sensitivity of $\ell(x)$ in (2.4) w.r.t. some parameter $\theta$ (say $\theta = \lambda$, the Poisson rate). We have $\ell_\theta(x) = \mathbb{E}[S;\, C > x]$, where
$$S = \frac{d}{d\theta}\log f(C; \theta)$$
is the score function, $f(\cdot\,; \theta)$ is the probability density function of $F$, and $\mathbb{E}[S;\, C > x]$ means $\mathbb{E}[S\,I(C > x)]$.
The following result states that, under some regularity conditions, sensitivities of the form $\ell_\theta(x)$ typically converge to 0 (in the logarithmic sense) at least as fast as $\ell(x)$; see also Nakayama (1996) for a related discussion.
Proposition 2.1 Let $\ell(x) = \mathbb{P}(C > x)$ for some r.v. $C$, let $S$ be the score w.r.t. $\theta$ and assume that $\mathbb{E}|S|^q < \infty$ for all $q < \infty$. Then $\ell_\theta(x)$ tends to 0 at least as fast as $\ell(x)$ in the logarithmic sense.

Proof. Let $1 = 1/p + 1/q$. Then by Hölder's inequality, $\ell_\theta(x) \le \ell(x)^{1/p}\|S\|_q$, which implies that $\liminf \log\ell_\theta(x)/\log\ell(x) \ge 1/p$. Let $p \downarrow 1$, $q \uparrow \infty$. $\Box$
It follows that the estimation of the sensitivity $\ell_\theta(x)$ is subject to the same problems concerning relative error as the performance $\ell(x)$ itself. Thus IS may be considered, and it has turned out in many special cases (Kriman & Rubinstein, 1996) that for many queueing models an asymptotically efficient change of measure for the performance is typically also asymptotically efficient for the sensitivities. We verify in Corollary 3.1 and Theorem 4.2 below that this is also the case in some of the main examples considered in this paper. We also expect the same to be the case for (3.22), (4.7), (4.9) and (4.11), but have not carried out the proof; of the remaining estimators, we expect (5.1), (5.3) to be good but not optimal, whereas (3.12)–(3.14) need to be combined with variance reduction techniques.
We assume here that the random variables $N, U_1, U_2, \ldots$ are independent and that the $U_i$ have a common light-tailed distribution $B$, in the sense that $\mathbb{E}e^{sU_i} = \hat B[s]$ exists for sufficiently many $s > 0$. In addition we assume that $N$ is Poisson($\lambda$). E.g., the portfolio could consist of $M$ policy holders, each generating claims at rate $\lambda/M$, such that a claim of the $i$th policy holder always has size $x_i$ ($B$ being the uniform distribution on $\{x_1, \ldots, x_M\}$).
In the following subsections we present fast, in fact logarithmically efficient, estimators (see (2.3)) of $\ell(x)$ and associated sensitivity estimators w.r.t. a number of distributional and structural parameters, provided $x$ is large. In particular, the distributional parameters are $\lambda$ in Poisson($\lambda$) and a parameter $\eta$ governing the claim size distribution $B$, while the structural parameters are $x$ in $\ell(x)$ and the parameter $a$ in the excess-of-loss reinsurance model (3.16) below.
which yields $\beta - \theta = \sqrt{\lambda\beta/x}$, so that the pair $(\lambda_\theta, \beta_\theta)$ (the Poisson rate and the exponential claim rate under the ECM) becomes
$$(\lambda_\theta, \beta_\theta) = \left(\sqrt{\lambda\beta x},\ \sqrt{\lambda\beta/x}\right).$$
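The tilted pair is easy to use in practice. The following sketch (with illustrative values $\lambda = \beta = 1$) simulates the compound sum under $(\lambda_\theta, \beta_\theta)$ and corrects by the likelihood ratio $W = \exp\{-\theta C_N + \lambda(\hat B[\theta] - 1)\}$:

```python
import numpy as np

rng = np.random.default_rng(2)

def is_compound_tail(x, lam, beta, n_runs=100_000):
    """IS estimator of l(x) = P(C_N > x), C_N = U_1 + ... + U_N,
    N ~ Poisson(lam), U_i ~ Exp(beta), simulating under the tilted
    pair (lam_t, beta_t) = (sqrt(lam*beta*x), sqrt(lam*beta/x))."""
    lam_t = np.sqrt(lam * beta * x)
    beta_t = np.sqrt(lam * beta / x)
    theta = beta - beta_t                                       # reference parameter
    N = rng.poisson(lam_t, size=n_runs)
    C = np.array([rng.exponential(1.0 / beta_t, n).sum() for n in N])
    # W = exp(-theta*C_N + lam*(B_hat[theta] - 1)), with B_hat[theta] = beta/(beta - theta)
    W = np.exp(-theta * C + lam * (beta / (beta - theta) - 1.0))
    return np.mean((C > x) * W)

est = is_compound_tail(x=50.0, lam=1.0, beta=1.0)
```

For $x = 50$ the target probability is of order $10^{-18}$, far out of reach of CMC, yet roughly half of the tilted sample paths hit the event.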
We shall now establish an optimality result somewhat related to a result of Bucklew, Ney & Sadowsky (1990), who consider optimal simulation of $\mathbb{P}(C_n > n(\mu_B + \epsilon))$ (but note that they consider the limit $n \to \infty$ where $n$ is non-random). We first recall the classical Esscher approximation for $\ell(x)$ (Esscher, 1932, Embrechts et al., 1985, Jensen, 1988, 1991, 1995):
$$\ell(x) \sim \frac{e^{-\theta x + \varphi(\theta)}}{\theta\sqrt{2\pi\lambda\hat B''[\theta]}}, \qquad \varphi(\theta) = \lambda(\hat B[\theta] - 1). \qquad (3.4)$$
The conditions for (3.4) require some regularity of the density $b(x)$ of the claims. In particular, either of the following is sufficient:

A. $b$ is gamma-like, i.e. bounded with
$$b(x) \sim c_1 x^{\alpha - 1}e^{-\delta x}. \qquad (3.5)$$

B. $b$ is log-concave, or, more generally, $b(x) = q(x)e^{-h(x)}$, where $q(x)$ is bounded away from 0 and $\infty$ and $h(x)$ is convex on an interval of the form $[x_0, x^*)$, where $x^* = \sup\{x : b(x) > 0\}$. Furthermore, $\int_0^\infty b(x)^\gamma\,dx < \infty$ for some $\gamma \in (1, 2)$.

For example, A covers the exponential distribution and phase-type distributions, while B covers distributions with finite support or with a density not too far from $e^{-x^\alpha}$ with $\alpha > 1$.
Theorem 3.1 Assume that either of A, B holds and that $\theta = \theta(x)$ is chosen according to (3.3). Then the estimator $X(\theta; x)$ in (3.2) for $\ell(x)$ satisfies (2.3), i.e. is logarithmically efficient.
Proof. A key ingredient of the proof of (3.4) in the quoted references is to note that the assumptions imply sufficient regularity of $\hat B[s]$ up to $s^* = \sup\{s : \hat B[s] < \infty\}$, and thereby a CLT in $\mathbb{P}_\theta$-distribution for $C_N$, with variance constant $\varphi''(\theta) = \lambda\hat B''[\theta]$ and mean $x$. This motivates heuristically the following proof of (3.4):
$$\ell(x) = \mathbb{E}_\theta[W(0 \mid \theta);\ C_N > x] = e^{-\theta x + \varphi(\theta)}\,\mathbb{E}_\theta\bigl[e^{-\theta(C_N - x)};\ C_N > x\bigr]$$
$$\approx e^{-\theta x + \varphi(\theta)}\int_0^\infty e^{-\theta\sqrt{\lambda\hat B''[\theta]}\,y}\,\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy$$
$$= \frac{e^{-\theta x + \varphi(\theta)}}{\theta\sqrt{2\pi\lambda\hat B''[\theta]}}\int_0^\infty e^{-z}e^{-z^2/(2\theta^2\lambda\hat B''[\theta])}\,dz$$
$$\approx \frac{e^{-\theta x + \varphi(\theta)}}{\theta\sqrt{2\pi\lambda\hat B''[\theta]}}\int_0^\infty e^{-z}\,dz = \frac{e^{-\theta x + \varphi(\theta)}}{\theta\sqrt{2\pi\lambda\hat B''[\theta]}}.$$
This argument is made precise in, e.g., the references given for (3.4). In just the same way as in these references, one can rigorously verify the heuristics
$$\mathbb{E}_\theta\bigl[e^{-2\theta(C_N - x)};\ C_N > x\bigr] \approx \int_0^\infty e^{-2\theta\sqrt{\lambda\hat B''[\theta]}\,y}\,\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy$$
$$= \frac{1}{2\theta\sqrt{2\pi\lambda\hat B''[\theta]}}\int_0^\infty e^{-z}e^{-z^2/(8\theta^2\lambda\hat B''[\theta])}\,dz \approx \frac{1}{2\theta\sqrt{2\pi\lambda\hat B''[\theta]}}\int_0^\infty e^{-z}\,dz = \frac{1}{2\theta\sqrt{2\pi\lambda\hat B''[\theta]}}.$$
Thus we get
$$\mathbb{E}X(\theta; x)^2 = e^{-2\theta x + 2\varphi(\theta)}\,\mathbb{E}_\theta\bigl[e^{-2\theta(C_N - x)};\ C_N > x\bigr] \approx \frac{e^{-2\theta x + 2\varphi(\theta)}}{2\theta\sqrt{2\pi\lambda\hat B''[\theta]}}.$$
Comparing this expression with (3.4) shows that all that remains to be verified is
$$\log\bigl(\theta\sqrt{\lambda\hat B''[\theta]}\bigr) = o(\theta x - \varphi(\theta)), \qquad (3.7)$$
where we have used the association inequality $\mathbb{E}[f(X)g(X)] \ge \mathbb{E}f(X)\,\mathbb{E}g(X)$ for increasing functions (here $f(s) = s$, $g(s) = \varphi''(s)$ and $X$ is uniform on $(0, \theta)$). By equation (4.11) of Jensen (1991), $\hat B''[\theta]$ is of the order of magnitude $\varphi'(\theta)^2\varphi(\theta)$, which is $O(\varphi'(\theta)^3)$.
3.2 ECM for sensitivities of $\ell(x) = \mathbb{P}(C_N > x)$ w.r.t. distributional parameters

We consider separately the sensitivities (i) $\ell_\lambda(x)$ and (ii) $\ell_\eta(x)$, where $\eta$ is a parameter defined in (3.9) below.
(i) The sensitivity $\ell_\lambda(x)$. Writing
$$\ell(x) = \sum_{n=0}^\infty e^{-\lambda}\frac{\lambda^n}{n!}\,\bar B^{*n}(x),$$
where $\bar B^{*n} = 1 - B^{*n}$ and $B^{*n}$ denotes the distribution of the random variable $C_n = \sum_{i=1}^n U_i$, and differentiating w.r.t. $\lambda$ gives $\ell_\lambda(x) = \mathbb{E}[(N/\lambda - 1)\,I(C_N > x)]$; combining this score with the ECM of Theorem 3.1 yields the LR sensitivity estimator $X_\lambda(x)$ of (3.8).
Corollary 3.1 Assume that either of A, B holds and that $\theta = \theta(x)$ is chosen according to (3.3). Then the estimator $X_\lambda(x)$ in (3.8) satisfies (2.3), i.e. is logarithmically efficient.

Proof. It is implicit in the proof of (3.4), cf. the references above, that the contribution from $N \le n$ is negligible for any fixed $n < \infty$. Taking $n > 2\lambda$ (say), we get
$$\ell_\lambda(x) = \mathbb{E}\left[\left(\frac{N}{\lambda} - 1\right);\ C_N > x,\ N > n\right] + o(\ell(x))$$
$$\ge \mathbb{P}(C_N > x,\ N > n) + o(\ell(x)) = \ell(x) + o(\ell(x)).$$
Hence, by Proposition 2.1 (note that the Poisson distribution has moments of all orders), we obtain $\ell_\lambda(x) \asymp_{\log} \ell(x)$. Further, from the last part of the proof of Theorem 3.1 we have $\ell(x) \asymp_{\log} e^{-\theta x + \varphi(\theta)}$.
We now get
$$\mathbb{E}X_\lambda(x)^2 \le e^{-2\theta x + 2\varphi(\theta)}\,\mathbb{E}_\theta\left[\left(\frac{N}{\lambda} - 1\right)^2\right] \le c\,e^{-2\theta x + 2\varphi(\theta)}\,\frac{\lambda_\theta^2}{\lambda^2}.$$
Thus, it only remains to check that $\log\lambda_\theta = o(\theta x - \varphi(\theta))$. But $\lambda_\theta = \lambda\hat B[\theta]$ goes to $\infty$ no faster than $\lambda\hat B''[\theta]$, and from the proof of Theorem 3.1, $\log\hat B''[\theta] = o(\theta x - \varphi(\theta))$. $\Box$
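The estimator can be sketched by extending the exponential example above: the same tilted pair and likelihood ratio, with the Poisson score $N/\lambda - 1$ multiplied in (parameter values again illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def is_tail_and_sensitivity(x, lam, beta, n_runs=100_000):
    """LR estimators of l(x) = P(C_N > x) and l_lam(x) for Exp(beta) claims,
    simulating under the tilted pair (sqrt(lam*beta*x), sqrt(lam*beta/x))."""
    lam_t, beta_t = np.sqrt(lam * beta * x), np.sqrt(lam * beta / x)
    theta = beta - beta_t
    N = rng.poisson(lam_t, size=n_runs)
    C = np.array([rng.exponential(1.0 / beta_t, n).sum() for n in N])
    W = np.exp(-theta * C + lam * (beta / (beta - theta) - 1.0))  # likelihood ratio
    I = (C > x).astype(float)
    S = N / lam - 1.0                                             # Poisson score
    return np.mean(I * W), np.mean(S * I * W)

ell, ell_lam = is_tail_and_sensitivity(x=50.0, lam=1.0, beta=1.0)
```

Both quantities come from the same tilted sample, in line with the single-run theme of the paper.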
(ii) The sensitivity $\ell_\eta(x)$. We consider the particular example where the claim size distribution $B$ belongs to the following two-parameter exponential family:
$$B_{\zeta,\eta}(dx) = \exp\{\zeta x + \eta t(x) - \omega(\zeta, \eta)\}\,\mu(dx), \qquad x > 0. \qquad (3.9)$$
This family covers a number of important cases like Gamma and inverse Gaussian $B$, as well as general multi-parameter exponential families (see Remark 3.1 below). We assume that the given claim size distribution $B = B_{0,0}$ corresponds to $\zeta = \eta = 0$ (this can be achieved by changing $\mu(dx)$ to $B(dx)$; then $\omega(0, 0) = 0$) and consider the sensitivity w.r.t. $\eta$ only.

For the relevant LR representation, we first write the performance $\ell(x)$ when the claim size distribution is $B_{0,\eta}$ as
$$\mathbb{P}_{0,\eta}(C_N > x) = \mathbb{E}_{0,0}\left[e^{\eta T_N - N\omega(0,\eta)}\,I(C_N > x)\right], \qquad T_N = \sum_{i=1}^N t(U_i).$$
Differentiating at $\eta = 0$ and combining with the ECM gives
$$\ell_\eta(x) = \mathbb{E}_\theta\left[(T_N - N\omega_\eta(0, 0))\exp\left\{-\theta C_N + \lambda\bigl(e^{\omega(\theta, 0)} - 1\bigr)\right\}I(C_N > x)\right].$$
In this case the rule (3.3) can still be applied, and a "good" $\theta$ corresponds to $\lambda e^{\omega(\theta,0)}\omega_\zeta(\theta, 0) = x$.
As an example, assume that both the IS and the true claim size random variables are gamma distributed, with parameters $(\alpha, \beta)$ and $(\alpha_0, \beta_0)$, respectively. In this case the IS density can be written as
$$b(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}\,x^{\alpha - 1}e^{-\beta x} = \exp\bigl\{-\beta x + \alpha\log x - (\log\Gamma(\alpha) - \alpha\log\beta)\bigr\}\,x^{-1},$$
and similarly for the density of the true claim size random variable $U$. Here (3.9) holds with $t(x) = \log x$ and $\mu(dx) = B(dx)$, the correspondence being $\zeta = \beta_0 - \beta$, $\eta = \alpha - \alpha_0$, $B_{\zeta,\eta} = \text{Gamma}(\alpha_0 + \eta,\, \beta_0 - \zeta)$ and
$$\omega(\zeta, \eta) = \log\frac{\Gamma(\alpha_0 + \eta)}{\Gamma(\alpha_0)} + \alpha_0\log\beta_0 - (\alpha_0 + \eta)\log(\beta_0 - \zeta).$$
Remark 3.1 In a multiparameter exponential family, the exponent of the density has the form
$$\zeta_1 t_1(x) + \cdots + \zeta_k t_k(x).$$
Thus, (3.9) assumes $k = 2$ and $t_1(x) = x$. That it is no restriction to assume one of the $t_i(x)$ to be linear follows since the whole set-up requires exponential moments to be finite (thus we can always extend the family if necessary by adding a term $\zeta x$). That it is no restriction to assume $k \le 2$ follows since if $k > 2$, we can just fix $k - 2$ of the parameters. Finally, if $k = 1$, the exponent is either $\zeta x$, in which case we can just let $t_2(x) = 0$, or $\eta t(x)$, in which case we can extend the family by adding a linear term $\zeta x$ as above.

Note that a parameter of interest will typically be of the form $\vartheta = \vartheta(\zeta_1, \ldots, \zeta_k)$, in which case the sensitivities w.r.t. $\vartheta$ follow from those w.r.t. $\zeta_1, \ldots, \zeta_k$ by the chain rule.
3.3 Sensitivities of $\ell(x) = \mathbb{P}(C_N > x)$ w.r.t. structural parameters

We consider separately the sensitivities (i) $\ell_x(x)$ and (ii) $\ell_a(x)$, where $a$ is the retention limit in the excess-of-loss reinsurance model (3.16) below.

(i) The sensitivity $\ell_x(x)$. Here we derive an estimator for the derivative $\ell_x(x)$. Note that in this case the sample performance $I(C_N > x)$ is not differentiable w.r.t. $x$, since $x$ enters through the indicator function. To overcome this difficulty, we present two approaches, one based on the push-out method and the second on conditioning.
To apply the push-out method to the parameter $x$, we first transfer (push out) $x$ from the sample performance $\ell(x)$ to an auxiliary pdf, say $\tilde b(y; x)$ (see (3.11) below), by a suitable transformation, and then use the SF method to estimate $\ell_x(x)$. More specifically, we write $\ell(x)$ as
$$\tilde\ell(x) = \mathbb{P}(\tilde U_1 + U_2 + \cdots + U_N > 0), \qquad (3.10)$$
where $\tilde U_1 = U_1 - x$, which has the corresponding auxiliary pdf
$$\tilde b(y; x) = b(y + x). \qquad (3.11)$$
By doing so, the parameter $x$ is "pushed out" from the sample performance $\ell(x)$ to the auxiliary pdf $\tilde b(y; x)$. As a result, $x$ occurs in the new sample performance $\tilde\ell(x)$ as a distributional and not a structural parameter.
Note that if $b(0) > 0$, as in the exponential case, $\tilde b(y; x)$ will have a discontinuity at the point $y = -x$, which moves with the new parameter $x$. This implies that a modification of the SF method is needed; this is carried out in Asmussen & Signahl (1999). It follows from there that the derivative $\ell_x(x)$ and the associated estimator can be written as
$$\ell_x(x) = -b(0)\,\mathbb{P}(U_2 + \cdots + U_N > x) + \mathbb{E}\bigl[S_{\tilde U_1};\ \tilde U_1 + U_2 + \cdots + U_N > 0\bigr]$$
and
$$X_x(x) = -b(0)\,I(U_2 + \cdots + U_N > x) + \frac{b'(U_1)}{b(U_1)}\,I(U_1 + U_2 + \cdots + U_N > x), \qquad (3.12)$$
respectively, where $S_{\tilde U_1} = b'(U_1)/b(U_1)$ is the score of the auxiliary pdf. Here (3.12) can be combined with the ECM in the obvious way.
Consider next the conditioning approach. Denote by $b^N$ and $B^N$ the pdf and the cdf of the random variable $C_N$, respectively. Noting next that for fixed $N = n \ge 1$
$$\bar B^{*n}(x) = \mathbb{P}\Bigl(\sum_{i=1}^n U_i > x\Bigr) = \mathbb{E}\Bigl\{\bar B\Bigl(x - \sum_{i=2}^n U_i\Bigr)\Bigr\}$$
and
$$\frac{d}{dx}\,\mathbb{E}\Bigl\{\bar B\Bigl(x - \sum_{i=2}^n U_i\Bigr)\Bigr\} = -\mathbb{E}\Bigl\{b\Bigl(x - \sum_{i=2}^n U_i\Bigr)\Bigr\},$$
we obtain
$$\ell(x) = \mathbb{E}_N\bigl\{\mathbb{P}(C_N > x \mid N)\,I(N \ge 1)\bigr\} = \mathbb{E}_N\Bigl[\bar B\Bigl(x - \sum_{i=2}^N U_i\Bigr);\ N \ge 1\Bigr],$$
$$\ell_x(x) = -\mathbb{E}_N\Bigl[b\Bigl(x - \sum_{i=2}^N U_i\Bigr);\ N \ge 1\Bigr].$$
Here the subscript $N$ in $\mathbb{E}_N$ denotes expectation with respect to the random variable $N$, $b$ is the probability density function of the random variable $U$, $\bar B = 1 - B$, and (by convention) $\sum_{i=2}^n U_i = 0$ when $n = 1$. As estimators of $\ell(x)$ and $\ell_x(x)$ we can take
$$X(x) = \bar B\Bigl(x - \sum_{i=2}^N U_i\Bigr)\,I(N \ge 1) \qquad (3.13)$$
and
$$X_x(x) = -b\Bigl(x - \sum_{i=2}^N U_i\Bigr)\,I(N \ge 1), \qquad (3.14)$$
respectively. Note that an important variant of (3.13), where $B$ has a heavy-tailed distribution, is considered in Asmussen & Binswanger (1997). Note also that, although conditioning always leads to variance reduction relative to the CMC method, one may be able to further improve the accuracy of the estimators of $\ell(x)$ and $\ell_x(x)$ by using IS, and in particular the ECM. Here and in examples below, the simple formula (3.3) for identifying a "good" reference parameter fails. To overcome this difficulty, we present in Section 6 a rather general approach for estimating the optimal reference parameter vector in the IS.
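A minimal sketch of the conditional estimators (3.13)–(3.14), assuming exponential claims with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)

def cond_mc(x, lam, beta, n_runs=200_000):
    """Conditional MC estimators of l(x) = P(C_N > x) and l_x(x):
    the first claim U_1 is integrated out analytically, only U_2, ..., U_N are sampled."""
    B_bar = lambda y: np.where(y > 0, np.exp(-beta * y), 1.0)     # tail of Exp(beta)
    b = lambda y: np.where(y > 0, beta * np.exp(-beta * y), 0.0)  # density of Exp(beta)
    N = rng.poisson(lam, size=n_runs)
    S2 = np.array([rng.exponential(1.0 / beta, max(n - 1, 0)).sum() for n in N])
    alive = (N >= 1).astype(float)
    return np.mean(B_bar(x - S2) * alive), np.mean(-b(x - S2) * alive)

ell_hat, ellx_hat = cond_mc(x=5.0, lam=1.0, beta=1.0)
```

Replacing the indicator by the smooth conditional tail both reduces variance and makes the $x$-derivative available without any boundary term.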
(ii) The sensitivity $\ell_a(x)$. This model extends the previous one, $\ell(x) = \mathbb{P}(C_N > x)$, in the sense that the part of the claim carried by the cedent (the insurance company) is $U \wedge a$ ($\wedge$ denotes minimum) rather than just $U$. Everything beyond $a$ is covered by the company insuring at a different insurance company, the reinsurer. Assume in addition that the cedent and the reinsurer apply safety loadings $\eta_1, \eta_2$ where $\eta_1 < \eta_2$. Then the premium received by the cedent is
$$p(a) = \lambda\bigl[(1 + \eta_1)\,\mathbb{E}[U \wedge a] - (\eta_2 - \eta_1)\,\mathbb{E}[U - a]_+\bigr] \qquad (3.15)$$
(note that this is not the same $p(\cdot)$ as in (1.7)!), and we are interested in assessing the probability
$$\ell(x) = \mathbb{P}(C_N > x + p(a)) = \mathbb{E}\{I(C_N > x + p(a))\} \qquad (3.16)$$
of an excessive loss, and the associated sensitivity $\ell_a(x)$, where
$$C_N = \sum_{i=1}^N U_i \wedge a.$$
Here, estimating $\ell_a(x)$, we face a similar problem as when estimating $\ell_x(x)$, since the estimator $I(C_N > x + p(a))$ contains the parameter $a$ in the indicator function and is therefore (similarly to the problems we met for $\ell_x(x)$) not differentiable w.r.t. $a$. We suggest a solution based upon the push-out method proposed by Rubinstein (1992).
Denoting $\tilde U(a) = 1 \wedge (U/a)$, we can represent $\ell(x)$ as
$$\ell(x) = \mathbb{P}\bigl(\tilde C_N(a) > q(a, x)\bigr) = \Psi(q(a, x)), \qquad (3.17)$$
where
$$\tilde C_N(a) = \sum_{i=1}^N \tilde U_i(a), \qquad q(a, x) = \frac{x + p(a)}{a}, \qquad \Psi(y) = \mathbb{P}\bigl(\tilde C_N(a) > y\bigr).$$
By straightforward differentiation w.r.t. $a$ we obtain
$$\ell_a(x) = \Psi_a(q(a, x)) + \Psi'(q(a, x))\,q_a(a, x). \qquad (3.18)$$
Let $b(x)$ be the density of $B(x)$. Then the random variable $\tilde U(a)$ has density
$$f(x; a) = \begin{cases} a\,b(ax), & 0 < x < 1 \\ \bar B(a), & x = 1 \end{cases}$$
w.r.t. the measure $\nu$ which is the sum of Lebesgue measure $dx$ on $(0, 1)$ and an atom of unit mass at $x = 1$. To obtain the desired estimators of $\ell_a(x)$, note that
$$\Psi(y) = \sum_{n=0}^\infty \mathbb{P}(N = n)\int_0^1\!\cdots\!\int_0^1 I\Bigl(\sum_{i=1}^n x_i > y\Bigr)\prod_{j=1}^n f(x_j; a)\,\nu(dx_1)\cdots\nu(dx_n),$$
$$\Psi_a(y) = \sum_{n=0}^\infty \mathbb{P}(N = n)\int_0^1\!\cdots\!\int_0^1 I\Bigl(\sum_{i=1}^n x_i > y\Bigr)\sum_{i=1}^n \frac{f_a(x_i; a)}{f(x_i; a)}\prod_{j=1}^n f(x_j; a)\,\nu(dx_1)\cdots\nu(dx_n).$$
Taking into account that $q_a(a, x)$ can be calculated analytically, the resulting estimator of $\ell_a(x)$ is
$$X_a(x) = I\bigl(\tilde C_N(a) > q(a, x)\bigr)\sum_{i=1}^N \frac{f_a(\tilde U_i; a)}{f(\tilde U_i; a)} + \Psi'(q(a, x))\,q_a(a, x), \qquad (3.19)$$
where the first term is the SF estimator of $\Psi_a(q(a, x))$ and the second follows from (3.18).
Example 3.1. Let $B$ be exponential with rate $\beta$. The ECM replaces $f(\cdot\,; a)$ by the tilted density
$$f(x; a, \theta) = \begin{cases} \dfrac{e^{\theta x}\,a\beta e^{-a\beta x}}{\hat F(\theta; a)}, & 0 < x < 1 \\[6pt] \dfrac{e^{\theta}\,e^{-\beta a}}{\hat F(\theta; a)}, & x = 1, \end{cases}$$
where $\theta$ is the solution of $\lambda\hat F'(\theta; a) = y$ and the ECM replaces $\lambda$ by $\lambda\hat F(\theta; a)$. The LR is
$$W = \exp\bigl\{-\theta\tilde C_N(a) + \tilde\varphi(a, \theta)\bigr\}, \quad \text{where } \tilde\varphi(a, \theta) = \lambda(\hat F(\theta; a) - 1). \qquad (3.22)$$
The LR estimators of $\Psi(y)$ and $\Psi_a(y)$ are the corresponding CMC estimators multiplied by $W$.
Here $M(u)$ denotes the number of claims before ruin. Differentiating w.r.t. $\lambda$ and letting $\theta = \gamma$ as in (4.3), we obtain the following bound:
$$\mathbb{E}X_\lambda(u)^2 \le e^{-2\gamma u}\,\mathbb{E}\bigl[M(u)^2\bigr] = e^{-2\gamma u}\,O(u^2)$$
(the last equality follows by general results on moments of first passage times of random walks with positive drift, cf. Gut, 1988). From this the result follows. $\Box$
Note that the sensitivity $\psi_p(u)$ w.r.t. the premium rate $p$ can easily be expressed in terms of $\psi_\lambda(u)$: if the premium rate is $p + h$, the time-changed process $\{R(pt/(p + h))\}$ has premium rate $p$ and intensity $\lambda p/(p + h) = \lambda(1 - h/p + O(h^2))$, so that $\psi_p(u) = -\lambda\psi_\lambda(u)/p$.
4.3 The sensitivity $\psi_u(u)$

To apply the push-out method to the parameter $u$, we proceed similarly as for $\ell_x(x)$ where $\ell(x) = \mathbb{P}(C_N > x)$, by letting $\tilde U_1 = U_1 - u$, $\tilde b(y; u) = b(y + u)$, $y \ge -u$. As in (3.12), the resulting estimator $X_u(u)$ in (4.7) contains a boundary term $-b(0)$ and a score term, and is combined with the change of measure whose likelihood ratio is
$$\exp\left\{-\int_0^{\tau(u)} \gamma(R(t-))\,dR(t)\right\}, \qquad (4.8)$$
where $\gamma(\cdot)$ is the local adjustment coefficient (cf. Asmussen & Nielsen, 1995).
When computing sensitivities, we are faced with the problem that two risk processes with different premium rules are not absolutely continuous w.r.t. each other, and the LR in continuous time does not exist. This can be remedied by considering the discrete-time Markov chain $\{Y_n\}$ formed by the values of the risk process just after claims, $Y_n = R(T_1 + \cdots + T_n)$,
$Y_0 = R(0) = u$. The dynamics of this Markov chain are as follows: let $x_u(t)$ be the solution of $\dot x = p(x)$, $x(0) = u$; then $Y_1 = x_u(T_1) - U_1$, where $T_1$ is the first claim arrival time (exponential with rate $\lambda$) and $U_1$ is the first claim. The relevant transition density is
$$h(y; u) = \lambda e^{-\lambda g_u(y)}\,g_u'(y), \quad \text{where } g_u(y) = \int_0^y \frac{1}{p(z + u)}\,dz$$
($g_u$ is the inverse function of $x_u$, i.e. the time needed to go from 0 to $y$), and it follows that the CMC estimator of the sensitivity w.r.t. a parameter $\theta$ is
$$\sum_{i=1}^{M(u)} \frac{h_\theta(x_{Y_{i-1}}(T_i); Y_{i-1})}{h(x_{Y_{i-1}}(T_i); Y_{i-1})}\;I(\tau(u) < \infty).$$
For the simulation we combine the above equation with (4.8) to get the following LR sensitivity estimator:
$$\tilde\psi_\theta(u) = \sum_{i=1}^{M(u)} \frac{h_\theta(x_{Y_{i-1}}(T_i); Y_{i-1})}{h(x_{Y_{i-1}}(T_i); Y_{i-1})}\,\exp\left\{-\int_0^{\tau(u)} \gamma(R(t-))\,dR(t)\right\}. \qquad (4.9)$$
(i) The premium rule $p(x) = p + \delta x$, where the goal is to compute the sensitivity $\psi_\delta(u)$ w.r.t. the interest rate $\delta$. Here $x_u(T_1)$ is Pareto,
$$h(y; u) = \frac{\lambda\,(p + \delta u)^{\lambda/\delta}}{(p + \delta(u + y))^{\lambda/\delta + 1}}, \qquad (4.10)$$
and it is straightforward (though tedious!) to compute $h_\delta(y; u)$.
(ii) The two-step premium rule (1.7), where the goal is to compute the sensitivity $\psi_v(u)$. Here $g_u(y)$ only depends on $v$ if $u < v < u + y$. If $u < v$, then
$$g_u(y) = \begin{cases} \dfrac{y}{p_1}, & y < v - u \\[6pt] \dfrac{v - u}{p_1} + \dfrac{y - (v - u)}{p_2}, & y > v - u, \end{cases} \qquad h(y; u) = \begin{cases} \dfrac{\lambda}{p_1}\exp\Bigl\{-\dfrac{\lambda y}{p_1}\Bigr\}, & y < v - u \\[6pt] \dfrac{\lambda}{p_2}\exp\Bigl\{-\lambda\Bigl(\dfrac{v - u}{p_1} + \dfrac{y - (v - u)}{p_2}\Bigr)\Bigr\}, & y > v - u \end{cases}$$
(note for the following that $h(y; u)$ is discontinuous at $y = v - u$, with a jump of size $\Delta = \lambda(1/p_2 - 1/p_1)\,e^{-\lambda(v - u)/p_1}$). Hence
From Asmussen & Signahl (1999), one gets
$$\psi_v(u) = \Delta\,\mathbb{E}\left[\sum_{i=1}^K e^{-\lambda(v - Z_i)/p_1}\,\chi_i;\ \tau(u) < \infty\right],$$
where $K$ is the number of upcrossings of level $v$ by the risk process, $Z_i$ the value of the risk process just after the last claim before the $i$th upcrossing, and $\chi_i$ the indicator that ruin occurs for the risk process modified by letting the next claim occur instantaneously at the $i$th upcrossing. However, this modification has smaller sample paths, so that ruin is automatic on $\{\tau(u) < \infty\}$. I.e., $\chi_i = 1$ on $\{\tau(u) < \infty\}$, so that
$$\psi_v(u) = \Delta\,\mathbb{E}\left[\sum_{i=1}^K e^{-\lambda(v - Z_i)/p_1};\ \tau(u) < \infty\right].$$
i=1
Instead of using the local ECM, the simplicity of the two{step premium rule suggest to
use the ECM corresponding to p . I.e., we compute
as solution to (B^ [s] 1) p s = 0,
2 2
and as reference process, simulate a risk process which has arrival intensity B^ [
] and claim
size distribution B
independent of the level x. The resulting LR SF sensitivity estimator is
" X K # 8 < MXu
9
= ( )
Consider first the IS for the performance alone. An obvious procedure is to estimate $\mathbb{P}(C_N > x \mid M)$ just as in the Poisson case, i.e., letting $\lambda = \lambda_J$, let $\theta = \theta(M)$ be the solution of $\lambda_J\hat B'[\theta] = x$, and simulate with reference parameters $\lambda_\theta(M) = \lambda_J\hat B[\theta]$ and claim size distribution $B_{\theta(M)}$. Unconditioning, we get the LR estimator for $\ell(x)$ as
$$I(C_N > x)\exp\bigl\{-\theta(M)C_N + \lambda_J(\hat B[\theta(M)] - 1)\bigr\}. \qquad (5.1)$$
Next consider the sensitivities; we shall look at $\ell_\lambda(x)$ and $\ell_\mu(x)$ only. By standard statistical theory for Markov processes, the LR $W = W(\lambda, \mu)$ (w.r.t. a suitable measure) is
$$W = \prod_{k=1}^K (p_k(\lambda, \mu))^{I_k}(q_k(\lambda, \mu))^{1 - I_k}\,\kappa_k(\lambda, \mu)e^{-\kappa_k(\lambda,\mu)T_k}\cdot e^{-\kappa_{K+1}(\lambda,\mu)(1 - \sigma(K))}, \qquad (5.2)$$
where $K$ is the number of jumps in $[0, 1]$, $\sigma(1), \ldots, \sigma(K)$ the jump times ($\sigma(0) = 0$, $T_k = \sigma(k) - \sigma(k - 1)$), $I_k = 1$ if the $k$th jump is up and $I_k = 0$ otherwise,
$$\kappa_k(\lambda, \mu) = \lambda + \mu M(\sigma(k - 1)), \qquad p_k(\lambda, \mu) = \frac{\lambda}{\kappa_k(\lambda, \mu)}, \qquad q_k(\lambda, \mu) = 1 - p_k(\lambda, \mu).$$
The sensitivities are $\ell_\lambda(x) = \mathbb{E}[I(C_N > x)W_\lambda]$ and $\ell_\mu(x) = \mathbb{E}[I(C_N > x)W_\mu]$, where $W_\lambda$, $W_\mu$ are obtained by straightforward but tedious differentiation, and the associated LR estimators are
$$I(C_N > x)\exp\bigl\{-\theta(M)C_N + \lambda_J(\hat B[\theta(M)] - 1)\bigr\}W_\lambda,$$
$$I(C_N > x)\exp\bigl\{-\theta(M)C_N + \lambda_J(\hat B[\theta(M)] - 1)\bigr\}W_\mu.$$
Note that in this case the variance of the IS estimators contains an extra term arising from the variability of $M$. One might therefore potentially obtain a further variance reduction by changing also the birth and death rates. The simplest case is just changing $\lambda, \mu$ to $\lambda', \mu'$ and multiplying by the corresponding LR, where $W(\lambda, \mu)$ is given in (5.2); similarly for the LR estimators of the sensitivities. There is no obvious choice of $\lambda', \mu'$, but in the following section we shall show how to estimate the optimal one.
6 Choice of the optimal reference parameter

In this section we show how to choose the reference parameter vector $v_0 = (\lambda_0, \mu_0)$ in the IS density. We show now how to estimate the optimal solution $v_0^*$ of the program (6.1) from simulation. Denote by $f(y; v)$, $v = (\lambda, \mu)$, and $f(y; v_0)$, $v_0 = (\lambda_0, \mu_0)$, the original and the IS pdfs, respectively. Given a sample $Z_1, \ldots, Z_T$ from $f(z; v_0)$, we can estimate the optimal solution by steepest descent or Newton-type recursive algorithms starting, say, from $v_0 = v$. For more details on solving programs of the type (6.2), see Glynn & Iglehart (1989) and Rubinstein & Shapiro (1993).

To this end, note that Sections 2.4 and 2.6 of Rubinstein & Shapiro (1993) present simple conditions on $L$ and $W$ under which the (properly normalized) optimal solution of the program (6.2), say $\hat v_0^N$, converges in distribution to a normal random vector with mean zero and some covariance matrix given in equation (2.4.11) of Rubinstein & Shapiro (1993).
7 Optimization

In this section we show how to estimate from simulation the optimal parameter $a$ in the excess-of-loss reinsurance model (3.15), (3.16), that is, how to find the optimal solution $a^*$ of the following program:
$$\min_{a \in A}\,\ell(a; x) = \min_{a \in A}\,\mathbb{E}\{I(C_N > x + p(a))\}, \qquad a > 0. \qquad (7.1)$$
To demonstrate CSA, consider the program (7.2). The CSA algorithm is defined as follows:
$$a_{t+1} = \bigl\{a_t - \gamma_t\,\hat\ell_a(a_t; x)\bigr\}, \qquad a_1 \in A, \qquad (7.3)$$
where $a_1$ is an arbitrary fixed point from $A$ and $\{a\}$ denotes the closest point to $a$ in $A$ (projection). For a quadratic objective, one should adjust the constant $C$ in (7.4) to the "curvature" of the objective function. Finally, a bad choice of $C$ may decrease the order of convergence, say to $O(t^{-\kappa})$ with $\kappa < 1$ instead of $O(t^{-1})$.
Consider the following modification of the CSA algorithm (see (7.3)):
$$a_{t+1} = \bigl\{a_t - \gamma_t\,\hat\ell_a(a_t; x)\bigr\}, \qquad a_1 \in A, \qquad (7.5)$$
where the step size is chosen as
$$\gamma_t = \frac{D}{M\sqrt{t}}. \qquad (7.6)$$
Here $D$ is the diameter of $A$, and $M$ satisfies
$$\mathbb{E}\bigl\{\|\hat\ell_a(a_t; x)\|^2\bigr\} \le M^2. \qquad (7.7)$$
As approximate solutions of the program (7.2) we take the following moving average:
$$\bar a_t = \frac{1}{\lceil t/2\rceil}\sum_{r = \lfloor t/2\rfloor + 1}^{t} a_r. \qquad (7.8)$$
We call the algorithm (7.5)–(7.8) the robust stochastic approximation (RSA) procedure. It is readily seen that RSA differs from the CSA in that it uses (i) the larger step size (7.6) and (ii) the moving average (7.8) rather than the last iterate.
Note that Polyak (1990) was the first to introduce averaging of the type (7.8) in a stochastic approximation context. The main difference between the stochastic approximation of Polyak (1990) and that of RSA is in item (i); namely, the step sizes in the former and in the latter are $\gamma_t = O(t^{-1})$ and $\gamma_t = O(t^{-1/2})$, respectively. The convergence proof for the RSA algorithm of type (7.5)–(7.8) is given in Nemirovskii & Rubinstein (1999). The convergence conditions for RSA are automatic in the setting of the program (7.2).
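The recursion (7.5)–(7.8) can be sketched as follows on a toy problem; the quadratic objective and the Gaussian gradient noise are illustrative stand-ins for the reinsurance program (7.2):

```python
import numpy as np

rng = np.random.default_rng(5)

def rsa(grad_est, a1, A, D, M, T):
    """Robust stochastic approximation: projected steps a_{t+1} = {a_t - gamma_t g_t}
    with gamma_t = D/(M*sqrt(t)), returning the moving average of the last half iterates."""
    lo, hi = A
    a = a1
    iterates = []
    for t in range(1, T + 1):
        gamma = D / (M * np.sqrt(t))
        a = min(max(a - gamma * grad_est(a), lo), hi)   # projection onto A = [lo, hi]
        iterates.append(a)
    tail = iterates[len(iterates) // 2:]                # moving average (7.8)
    return sum(tail) / len(tail)

# toy objective l(a) = (a - 2)^2, minimized at a* = 2, with noisy gradient estimates
noisy_grad = lambda a: 2.0 * (a - 2.0) + rng.normal(0.0, 1.0)
a_bar = rsa(noisy_grad, a1=0.0, A=(0.0, 5.0), D=5.0, M=3.0, T=20_000)
```

In the reinsurance setting, `noisy_grad` would be replaced by the LR estimator of $\ell_a(a_t; x)$ discussed below.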
Using again the LR, we can obtain more accurate estimates of the derivatives $\ell_a(a_t; x)$ in the RSA algorithm. Note that the LR estimator of $\ell_a(a_t; x)$ can be written as $\hat\ell_a(a_t; x)W_t$, where $\hat\ell_a(a_t; x)$ is as in (7.5) and $W_t$ is the associated LR. Consider again Example 3.1, where $B$ is exp($\beta$). We have (see (3.22))
$$W = \exp\bigl\{-\theta\tilde C_N(a) + \tilde\varphi(a, \theta)\bigr\}, \quad \text{where } \tilde\varphi(a, \theta) = \lambda(\hat F(\theta; a) - 1).$$
The program (7.2) can be naturally extended to the multidimensional case of $a$ as well as to constrained optimization. In the first case one can think of a multidimensional (multi-item) version of the process (1.1), each component having a different claim size distribution and a different Poisson intensity $\lambda$, but, say, the same initial reserve $u$. In the second case, one can maximize, say, the cedent's expected revenue with respect to the vector $a$, subject to the constraint that the ruin probabilities (each depending on its own parameter $a$) be less than a fixed (small) quantity.
References
[1] S. Asmussen (1985) Conjugate processes and the simulation of ruin problems. Stoch. Proc. Appl. 20, 213–229.
[2] S. Asmussen (1987) Applied Probability and Queues. Wiley.
[3] S. Asmussen (1989) Risk theory in a Markovian environment. Scand. Act. J. 1989, 69–100.
[4] S. Asmussen (1999) Ruin Probabilities. World Scientific Publishing Co., Singapore (to appear).
[5] S. Asmussen & K. Binswanger (1997) Simulation of ruin probabilities for subexponential claims. Astin Bulletin 27, 297–318.
[6] S. Asmussen & H.M. Nielsen (1995) Ruin probabilities via local adjustment coefficients. J. Appl. Prob. 33, 736–755.
[7] S. Asmussen & R.Y. Rubinstein (1995) Complexity properties of steady-state rare events simulation in queueing models. Advances in Queueing: Theory, Methods and Open Problems (J. Dshalalow, editor), CRC Press, 429–462.
[8] S. Asmussen & M. Signahl (1999) The score function method in the presence of moving discontinuities. Working paper, Lund University.
[9] J.A. Bucklew, P. Ney & J.S. Sadowsky (1990) Monte Carlo simulation and large deviations theory for uniformly recurrent Markov chains. J. Appl. Prob. 27, 44–59.
[10] M. Cottrell, J.-C. Fort & G. Malgouyres (1983) Large deviations and rare events in the study of stochastic algorithms. IEEE Trans. Aut. Control AC-28, 907–920.
[11] M. Devetsikiotis & K.R. Townsend (1993) Statistical optimization of dynamic importance sampling parameters for efficient simulation of communication networks. IEEE/ACM Transactions on Networking 1, 293–305.
[12] P. Embrechts, J.L. Jensen, M. Maejima & J.L. Teugels (1985) Approximations for compound Poisson and Polya processes. Adv. Appl. Probab. 17, 623–637.
[13] F. Esscher (1932) On the probability function in the collective theory of risk. Skand. Akt. Tidsskr., 175–195.
[14] M.R. Frater, T.M. Lennon & B.D.O. Anderson (1991) Optimally efficient estimation of the statistics of rare events in queueing networks. IEEE Trans. on Automat. Contr. AC-36, 1395–1405.
[15] P. Glasserman (1991) Gradient Estimation via Perturbation Analysis. Kluwer, Boston/Dordrecht/London.
[16] P. Glasserman & S.-G. Kou (1995) Analysis of an importance sampling estimator for tandem queues. ACM TOMACS 5, 22–42.
[17] P. Glynn & D.L. Iglehart (1989) Importance sampling for stochastic simulation. Management Science 35, 1367–1392.
[18] A. Gut (1988) Stopped Random Walks. Springer-Verlag, New York.
[19] P. Heidelberger (1995) Fast simulation of rare events in queueing and reliability models. ACM TOMACS 6, 43–85.
[20] P. Heidelberger & D. Towsley (1989) Sensitivity analysis from sample paths using likelihoods. Management Science 35, 1475–1488.
[21] J.L. Jensen (1988) Uniform saddlepoint approximations. Adv. Appl. Probab. 20, 622–634.
[22] J.L. Jensen (1991) Uniform saddlepoint approximations and log-concave densities. J. R. Statist. Soc. B 53, 157–172.
[23] J.L. Jensen (1995) Saddle Point Approximations. Clarendon Press, Oxford.
[24] I. Kovalenko (1995) Approximations of queues via small parameter method. Advances in Queueing: Theory, Methods and Open Problems (J. Dshalalow, editor), CRC Press, 481–509.
[25] U. Kriman & R.Y. Rubinstein (1997) Polynomial and exponential time algorithms for estimation of rare events in queueing models. Frontiers in Queueing: Models and Applications in Science and Engineering (J. Dshalalow, editor), CRC Press, 421–448.
[26] T. Lehtonen & H. Nyrhinen (1992a) Simulating level crossing probabilities by importance sampling. Adv. Appl. Probab. 24, 858–874.
[27] T. Lehtonen & H. Nyrhinen (1992b) On asymptotically efficient simulation of ruin probabilities in a Markovian environment. Scand. Actuarial J., 60–75.
[28] M. Nakayama (1996) General conditions for bounded relative error in simulations of highly reliable Markovian systems. Adv. Appl. Probab. 28, 687–727.
[29] A. Nemirovskii & R.Y. Rubinstein (1999) An efficient stochastic approximation algorithm for stochastic saddle point problems. SIAM Journal on Optimization, to appear.
[30] G. Pflug & R. Rubinstein (1996) Finding the optimal (s, S)-policy for inventory models from a single run by the push-out method. Manuscript.
[31] B.T. Polyak (1990) New method of stochastic approximation type. Automat. Remote Control 51, 937–946.
[32] R.Y. Rubinstein (1992) Sensitivity analysis of discrete event systems by the push-out method. Ann. Oper. Res. 39, 173–195.
[33] R.Y. Rubinstein & A. Shapiro (1993) Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization via the Score Function Method. John Wiley & Sons, New York.
[34] J.S. Sadowsky (1991) Large deviations theory and efficient simulation of excessive backlogs in a GI/GI/m queue. IEEE Trans. Automat. Contr. AC-36, 1383–1394.
[35] J.S. Sadowsky (1993) On the optimality and stability of exponential twisting in Monte Carlo simulation. IEEE Trans. Inf. Th. IT-39, 119–128.
[36] A. Shapiro (1996) Simulation based optimization: convergence analysis and statistical inference. Stochastic Models 12, 425–454.
[37] M. Van Wouve, F. De Vylder & M. Goovaerts (1983) The influence of reinsurance limits on infinite time ruin probabilities. In: Premium Calculation in Insurance (F. De Vylder, M. Goovaerts, J. Haezendonck, eds.). Reidel, Dordrecht/Boston/Lancaster.
Comments on the second revision of MSS9702 by Asmussen & Rubinstein
We have taken care of most of the suggestions and comments as follows:
Editor-in-chief The format is now as requested. To conform with the page length, we have taken out both Sections A and B of the Appendix. A was not really crucial, but B was referred to. Asmussen is writing a more detailed version (which was needed anyway) of this material with a student, and we have referred to a 'working paper'. If in the meantime anyone asks for details, we will just send the original version of the paper.
Pierre L'Ecuyer Everything implemented
Associate editor In view of the page constraints, we have not always been able to supply the added discussion asked for (but honestly, we do not agree with all points either). Of the specific points, we have implemented 5–7, 9 and 11–13. We think that the notation in 10 is clear in view of the reference to Hölder's inequality.
Referee 1 In view of page and time constraints, the suggestion of extensive practical experiments with the algorithms has obviously not been possible to implement. We have, however, added a paragraph at the end of Section 6, stating that in Sections 2.4, 2.6 of Rubinstein & Shapiro (1993) one can find simple conditions for L and W under which the optimal solution of the program (6.2) converges with probability one (asymptotically in N) to v_0. Validation of the conditions of Sections 2.4, 2.6 for our L and W