JEAN-PHILIPPE BOUCHAUD and MARC POTTERS
CAMBRIDGE
UNIVERSITY PRESS
THEORY OF FINANCIAL RISKS
FROM STATISTICAL PHYSICS TO RISK MANAGEMENT
¹ There are notable exceptions, such as the remarkable book by J. C. Hull, Futures, Options and Other
Derivatives, Prentice Hall, 1997.
² See however: I. Kondor, J. Kertesz (Eds): Econophysics, an Emerging Science, Kluwer, Dordrecht (1999); R.
Mantegna and H. E. Stanley, An Introduction to Econophysics, Cambridge University Press (1999).
...
are devised to account for real markets' statistics where the construction of riskless
hedges is in general impossible. The mathematical framework required to deal with
these cases is however not more complicated, and has the advantage of making the
issues at stake, in particular the problem of risk, more transparent.
Finally, commercial software packages are being developed to measure and
control financial risks (some following the ideas developed in this book).³ We hope
that this book can be useful to all people concerned with financial risk control, by
discussing at length the advantages and limitations of various statistical models.
Despite our efforts to remain simple, certain sections are still quite technical.
We have used a smaller font to develop more advanced ideas, which are not crucial
to understanding of the main ideas. Whole sections, marked by a star (*), contain
rather specialized material and can be skipped at first reading. We have tried to be as
precise as possible, but have sometimes been somewhat sloppy and non-rigorous.
For example, the idea of probability is not axiomatized: its intuitive meaning is
more than enough for the purpose of this book. The notation P(.) means the
probability distribution for the variable which appears between the parentheses, and
not a well-determined function of a dummy variable. The notation x → ∞ does
not necessarily mean that x tends to infinity in a mathematical sense, but rather that
x is large. Instead of trying to derive results which hold true in any circumstances,
we often compare orders of magnitude of the different effects: small effects are
neglected, or included perturbatively.⁴
Finally, we have not tried to be comprehensive, and have left out a number of
important aspects of theoretical finance. For example, the problem of interest rate
derivatives (swaps, caps, swaptions ...) is not addressed: we feel that the present
models of interest rate dynamics are not satisfactory (see the discussion in Section
2.6). Correspondingly, we have not tried to give an exhaustive list of references, but
rather to present our own way of understanding the subject. A certain number of
important references are given at the end of each chapter, while more specialized
papers are given as footnotes where we have found it necessary.
This book is divided into five chapters. Chapter 1 deals with important results
in probability theory (the Central Limit Theorem and its limitations, the theory of
extreme value statistics, etc.). The statistical analysis of real data, and the empirical
determination of the statistical laws, are discussed in Chapter 2. Chapter 3 is
concerned with the definition of risk, value-at-risk, and the theory of optimal
portfolio, in particular in the case where the probability of extreme risks has to be
minimized. The problem of forward contracts and options, their optimal hedge and
the residual risk is discussed in detail in Chapter 4. Finally, some more advanced
topics on options are introduced in Chapter 5 (such as exotic options, or the role of
transaction costs). Finally, a short glossary of financial terms, an index and a list of
symbols are given at the end of the book, allowing one to find easily where each
symbol or word was used and defined for the first time.
This book appeared in its first edition in French, under the title: Théorie des
Risques Financiers, Aléa-Saclay-Eyrolles, Paris (1997). Compared to this first
edition, the present version has been substantially improved and augmented. For
example, we discuss the theory of random matrices and the problem of the interest
rate curve, which were absent from the first edition. Furthermore, several points
have been corrected or clarified.

Acknowledgements
This book owes a lot to discussions that we had with Rama Cont, Didier Sornette
(who participated to the initial version of Chapter 3), and to the entire team of
Science and Finance: Pierre Cizeau, Laurent Laloux, Andrew Matacz and Martin
Meyer. We want to thank in particular Jean-Pierre Aguilar, who introduced us
to the reality of financial markets, suggested many improvements, and supported
us during the many years that this project took to complete. We also thank the
companies ATSM and CFM, for providing financial data and for keeping us close
to the real world. We also had many fruitful exchanges with Jeff Miller, and also
with Alain Arnéodo, Aubry Miens,⁵ Erik Aurell, Martin Baxter, Jean-François
Chauwin, Nicole El Karoui, Stefano Galluccio, Gaelle Gego, Giulia Iori, David
Jeammet, Imre Kondor, Jean-Michel Lasry, Rosario Mantegna, Marc Mézard,
Jean-François Muzy, Nicolas Sagna, Farhat Selmi, Gene Stanley, Ray Streater,
Christian Walter, Mark Wexler and Karol Zyczkowski. We thank Claude Godrèche,
who edited the French version of this book, for his friendly advice and support.
Finally, J.P.B. wants to thank Elisabeth Bouchaud for sharing so many far more
important things.

This book is dedicated to our families, and more particularly to the memory of
Paul Potters.

Paris, 1999
Jean-Philippe Bouchaud
Marc Potters

³ For example, the software Profiler, commercialized by the company ATSM, heavily relies on the concepts
introduced in Chapter 3.
⁴ a ≃ b means that a is of order b, a ≪ b means that a is smaller than, say, b/10. A computation neglecting
terms of order (a/b)² is therefore accurate to 1%. Such a precision is usually enough in the financial context,
where the uncertainty on the value of the parameters (such as the average return, the volatility, etc.), is often
larger than 1%.
⁵ With whom we discussed Eq. (1.24), which appears in his Diplomarbeit.

1
Probability theory: basic notions
All epistemologic value of the theory of probability is based on this: that large scale
random phenomena in their collective action create strict, non random regularity.
(Gnedenko and Kolmogorov, Limit Distributions for Sums of Independent Random
Variables.)
Randomness stems from our incomplete knowledge of reality, from the lack of
information which forbids a perfect prediction of the future. Randomness arises
from complexity, from the fact that causes are diverse, that tiny perturbations
may result in large effects. For over a century now, Science has abandoned
Laplace's deterministic vision, and has fully accepted the task of deciphering
randomness and inventing adequate tools for its description. The surprise is that,
after all, randomness has many facets and that there are many levels to uncertainty,
but, above all, that a new form of predictability appears, which is no longer
deterministic but statistical.
Financial markets offer an ideal testing ground for these statistical ideas.
The fact that a large number of participants, with divergent anticipations and
conflicting interests, are simultaneously present in these markets, leads to an
unpredictable behaviour. Moreover, financial markets are (sometimes strongly)
affected by external news, which are, both in date and in nature, to a large degree
unexpected. The statistical approach consists in drawing from past observations
some information on the frequency of possible price changes. If one then assumes
that these frequencies reflect some intimate mechanism of the markets themselves,
then one may hope that these frequencies will remain stable in the course of
time. For example, the mechanism underlying the roulette or the game of dice
is obviously always the same, and one expects that the frequency of all possible
outcomes will be invariant in time although of course each individual outcome is
random.
This 'bet' that probabilities are stable (or better, stationary) is very reasonable
in the case of roulette or dice;¹ it is nevertheless much less justified in the case
of financial markets despite the large number of participants which confer to the
system a certain regularity, at least in the sense of Gnedenko and Kolmogorov.
It is clear, for example, that financial markets do not behave now as they did 30
years ago: many factors contribute to the evolution of the way markets behave
(development of derivative markets, worldwide and computeraided trading, etc.).
As will be mentioned in the following, 'young' markets (such as emergent
countries markets) and more mature markets (exchange rate markets, interest rate
markets, etc.) behave quite differently. The statistical approach to financial markets
is based on the idea that whatever evolution takes place, this happens sufficiently
slowly (on the scale of several years) so that the observation of the recent past
is useful to describe a not too distant future. However, even this 'weak stability'
hypothesis is sometimes badly in error, in particular in the case of a crisis, which
marks a sudden change of market behaviour. The recent example of some Asian
currencies indexed to the dollar (such as the Korean won or the Thai baht) is
interesting, since the observation of past fluctuations is clearly of no help to predict
the amplitude of the sudden turmoil of 1997, see Figure 1.1.
Hence, the statistical description of financial fluctuations is certainly imperfect.
It is nevertheless extremely helpful: in practice, the 'weak stability' hypothesis is
in most cases reasonable, at least to describe risks.²
In other words, the amplitude of the possible price changes (but not their sign!)
is, to a certain extent, predictable. It is thus rather important to devise adequate
tools, in order to control (if at all possible) financial risks. The goal of this first
chapter is to present a certain number of basic notions in probability theory, which
we shall find useful in the following. Our presentation does not aim at mathematical
rigour, but rather tries to present the key concepts in an intuitive way, in order to
ease their empirical use in practical applications.

Fig. 1.1. Three examples of statistically unforeseen crashes: the Korean won against the
dollar in 1997 (top), the British 3-month short-term interest rates futures in 1992 (middle),
and the S&P 500 in 1987 (bottom). In the example of the Korean won, it is particularly
clear that the distribution of price changes before the crisis was extremely narrow, and
could not be extrapolated to anticipate what happened in the crisis period.

1.2 Probabilities
1.2.1 Probability distributions
Contrarily to the throw of a dice, which can only return an integer between 1
and 6, the variation of price of a financial asset³ can be arbitrary (we disregard
the fact that price changes cannot actually be smaller than a certain quantity, a
'tick'). In order to describe a random process X for which the result is a real
number, one uses a probability density P(x), such that the probability that X is
within a small interval of width dx around X = x is equal to P(x) dx. In the
following, we shall denote as P(.) the probability density for the variable appearing
as the argument of the function. This is a potentially ambiguous, but very useful
notation.

¹ The idea that science ultimately amounts to making the best possible guess of reality is due to R. P. Feynman
(Seeking New Laws, in The Character of Physical Laws, MIT Press, Cambridge, MA, 1965).
² The prediction of future returns on the basis of past returns is however much less justified.
³ Asset is the generic name for a financial instrument which can be bought or sold, like stocks, currencies, gold,
bonds, etc.

The probability that X is between a and b is given by the integral of P(x)
between a and b,
\[ \mathcal{P}(a < X < b) = \int_a^b P(x)\,dx. \]
...
specific values l1 and l2 is of course independent of the chosen unit. P(x) dx is thus
invariant upon a change of unit, i.e. under the change of variable x → γx. More
generally, P(x) dx is invariant upon any (monotonic) change of variable x → y(x):
in this case, one has P(x) dx = P(y) dy.
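
The invariance of P(x) dx under a change of unit can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian stand-in density, the factor gamma = 100 and the helper names `rescaled_density` and `integrate` are hypothetical choices, not taken from the text.

```python
import math

def gaussian_density(x, sigma=1.0):
    """Zero-mean Gaussian density, used here as a stand-in for any P(x)."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def rescaled_density(pdf_x, gamma):
    """Density of y = gamma * x (gamma > 0), obtained from P(x) dx = P(y) dy."""
    return lambda y: pdf_x(y / gamma) / gamma

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule integral of f between a and b."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

gamma = 100.0  # assumed change of unit, e.g. metres to centimetres
pdf_y = rescaled_density(gaussian_density, gamma)

# Probability of the same physical interval, expressed in the two units.
prob_x = integrate(gaussian_density, 0.5, 1.5)
prob_y = integrate(pdf_y, 0.5 * gamma, 1.5 * gamma)
print(prob_x, prob_y)
```

The two probabilities agree up to rounding, although the density values themselves differ by the factor gamma.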
In order to be a probability density in the usual sense, P(x) must be non-negative
(P(x) ≥ 0 for all x) and must be normalized, that is that the integral of P(x) over
the whole range of possible values for X must be equal to one:
\[ \int_{x_m}^{x_M} P(x)\,dx = 1, \qquad (1.2) \]
Fig. 1.2. The 'typical value' of a random variable X drawn according to a distribution
density P(x) can be defined in at least three different ways: through its mean value ⟨x⟩,
its most probable value x* or its median x_med. In the general case these three values are
distinct.
where x_m (resp. x_M) is the smallest value (resp. largest) which X can take. In the
case where the possible values of X are not bounded from below, one takes x_m =
−∞, and similarly for x_M. One can actually always assume the bounds to be ±∞
by setting to zero P(x) in the intervals ]−∞, x_m] and [x_M, ∞[. Later in the text,
we shall often use the symbol ∫ as a shorthand for the integral from −∞ to +∞.
An equivalent way of describing the distribution of X is to consider its cumulative
distribution P_<(x), defined as
\[ \mathcal{P}_<(x) \equiv \mathcal{P}(X < x) = \int_{-\infty}^{x} P(x')\,dx'. \]
...
The median x_med is such that the probabilities that X be greater or less than this
particular value are equal. In other words, P_<(x_med) = P_>(x_med) = 1/2. The mean,
or expected value of X, which we shall note as m or ⟨x⟩ in the following, is the
average of all possible values of X, weighted by their corresponding probability:
\[ m \equiv \langle x \rangle = \int x\,P(x)\,dx. \]
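
To see that the mean, the median and the most probable value generally differ, they can be computed numerically for an asymmetric density. This is a minimal sketch under assumptions: the density P(x) = x exp(−x) on x ≥ 0, the integration range and the function names are illustrative choices, not taken from the text.

```python
import math

def density(x):
    """Illustrative asymmetric density P(x) = x * exp(-x) for x >= 0 (an assumption)."""
    return x * math.exp(-x) if x >= 0 else 0.0

def typical_values(pdf, x_min, x_max, steps=200_000):
    """Return (mean, median, most probable value) using a midpoint grid."""
    dx = (x_max - x_min) / steps
    xs = [x_min + (i + 0.5) * dx for i in range(steps)]
    ps = [pdf(x) for x in xs]
    norm = sum(ps) * dx                       # should be close to 1
    mean = sum(x * p for x, p in zip(xs, ps)) * dx / norm
    # Median: first grid point where the cumulative distribution reaches 1/2.
    cumulative, median = 0.0, xs[-1]
    for x, p in zip(xs, ps):
        cumulative += p * dx / norm
        if cumulative >= 0.5:
            median = x
            break
    # Most probable value: grid point where the density is largest.
    mode = max(zip(ps, xs))[1]
    return mean, median, mode

mean, median, mode = typical_values(density, 0.0, 50.0)
print(mean, median, mode)
```

For this density the three values are clearly distinct, with the most probable value below the median and the median below the mean, as expected for a distribution with a long right tail.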
Since the variance has the dimension of x squared, its square root (the RMS, σ)
gives the order of magnitude of the fluctuations around m.
...
Accordingly, the mean m is the first moment (n = 1), and the variance is related
to the second moment (σ² = m_2 − m²). The above definition, Eq. (1.7), is
only meaningful if the integral converges, which requires that P(x) decreases
sufficiently rapidly for large |x| (see below).
From a theoretical point of view, the moments are interesting: if they exist, their
knowledge is often equivalent to the knowledge of the distribution P(x) itself.⁵ In
...
\[ \hat{P}(z) \equiv \int e^{izx} P(x)\,dx. \]
...
One often uses the third and fourth normalized cumulants, called the skewness and
kurtosis (κ),⁶
\[ \lambda_3 = \frac{c_3}{\sigma^3}, \qquad \kappa \equiv \lambda_4 = \frac{c_4}{\sigma^4}. \]
The above definition of cumulants may look arbitrary, but these quantities have
remarkable properties. For example, as we shall show in Section 1.5, the cumulants
simply add when one sums independent random variables. Moreover a Gaussian
distribution (or the normal law of Laplace and Gauss) is characterized by the
fact that all cumulants of order larger than two are identically zero. Hence the
cumulants, in particular κ, can be interpreted as a measure of the distance between
a given distribution P(x) and a Gaussian.
...
\[ P(x) \sim \frac{\mu A_\pm^\mu}{|x|^{1+\mu}} \quad \text{for } x \to \pm\infty, \qquad (1.14) \]
then all the moments such that n ≥ μ are infinite. For example, such a distribution
has no finite variance whenever μ ≤ 2. [Note that, for P(x) to be a normalizable
probability distribution, the integral, Eq. (1.2), must converge, which requires
μ > 0.]
The characteristic function of a distribution having an asymptotic power-law behaviour
given by Eq. (1.14) is non-analytic around z = 0. The small z expansion contains regular
terms of the form z^n for n < μ followed by a non-analytic term |z|^μ (possibly with
logarithmic corrections such as |z|^μ log z for integer μ). The derivatives of order larger
or equal to μ of the characteristic function thus do not exist at the origin (z = 0).

1.3 Some useful distributions
1.3.1 Gaussian distribution
The most commonly encountered distributions are the 'normal' laws of Laplace
and Gauss, which we shall simply call Gaussian in the following. Gaussians are
ubiquitous: for example, the number of heads in a sequence of a thousand coin
tosses, the exact number of oxygen molecules in the room, the height (in inches)
of a randomly selected individual, are all approximately described by a Gaussian
distribution.⁷ The ubiquity of the Gaussian can be in part traced to the Central
Limit Theorem (CLT) discussed at length below, which states that a phenomenon
resulting from a large number of small independent causes is Gaussian. There
exists however a large number of cases where the distribution describing a complex
phenomenon is not Gaussian: for example, the amplitude of earthquakes, the
velocity differences in a turbulent fluid, the stresses in granular materials, etc.,
and, as we shall discuss in the next chapter, the price fluctuations of most financial
assets.
A Gaussian of mean m and root mean square σ is defined as:
\[ P_G(x) \equiv \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-m)^2}{2\sigma^2}\right). \]
...
realized by examining its characteristic function:
\[ \hat{P}_G(z) = \exp\left(-\frac{\sigma^2 z^2}{2} + imz\right). \]
Its logarithm is a second-order polynomial, for which all derivatives of order larger
than two are zero. In particular, the kurtosis of a Gaussian variable is zero. As
mentioned above, the kurtosis is often taken as a measure of the distance from a
Gaussian distribution. When κ > 0 (leptokurtic distributions), the corresponding
distribution density has a marked peak around the mean, and rather 'thick' tails.
Conversely, when κ < 0, the distribution density has a flat top and very thin tails.
For example, the uniform distribution over a certain interval (for which tails are
absent) has a kurtosis κ = −6/5.
A Gaussian variable is peculiar because 'large deviations' are extremely rare.
The quantity exp(−x²/2σ²) decays so fast for large x that deviations of a few times
σ are nearly impossible. For example, a Gaussian variable departs from its most
probable value by more than 2σ only 5% of the times, of more than 3σ in 0.2% of
the times, whereas a fluctuation of 10σ has a probability of less than 2 × 10⁻²³; in
other words, it never happens.

1.3.2 Log-normal distribution
Another very popular distribution in mathematical finance is the so-called
'log-normal' law. That X is a log-normal random variable simply means that log X
is normal, or Gaussian. Its use in finance comes from the assumption that the
rates of return, rather than the absolute changes of prices, are independent random
variables. The increments of the logarithm of the price thus asymptotically sum
to a Gaussian, according to the CLT detailed below. The log-normal distribution
density is thus defined as:⁸
...

⁴ One chooses as a reference value the median for the MAD and the mean for the RMS, because for a fixed
distribution P(x), these two quantities minimize, respectively, the MAD and the RMS.
⁵ This is not rigorously correct, since one can exhibit examples of different distribution densities which possess
exactly the same moments, see Section 1.3.2 below.
⁶ Note that it is sometimes κ + 3, rather than κ itself, which is called the kurtosis.
⁷ Although, in the above three examples, the random variable cannot be negative. As we shall discuss below, the
Gaussian description is generally only valid in a certain neighbourhood of the maximum of the distribution.
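
Two of the numerical statements made above about cumulants and Gaussian tails can be checked with a few lines of code. The sketch below is an illustration only: the function names are hypothetical, the cumulants are obtained from the standard relations between raw moments and cumulants, and the kurtosis follows the convention used here (fourth normalized cumulant, zero for a Gaussian).

```python
import math
from fractions import Fraction

def cumulants_from_moments(m1, m2, m3, m4):
    """First four cumulants (c1, c2, c3, c4) from the raw moments m_n = <x^n>."""
    c2 = m2 - m1**2
    c3 = m3 - 3 * m2 * m1 + 2 * m1**3
    c4 = m4 - 4 * m3 * m1 - 3 * m2**2 + 12 * m2 * m1**2 - 6 * m1**4
    return m1, c2, c3, c4

def skewness_and_kurtosis(moments):
    """Normalized cumulants lambda_3 = c3 / sigma^3 and kappa = c4 / sigma^4."""
    _, c2, c3, c4 = cumulants_from_moments(*moments)
    return float(c3) / float(c2) ** 1.5, float(c4) / float(c2) ** 2

def gaussian_two_sided_tail(k):
    """Probability that a Gaussian variable departs from its mean by more than k sigma."""
    return math.erfc(k / math.sqrt(2.0))

# Raw moments of three reference laws (exact rational values).
uniform_01 = [Fraction(1, n + 1) for n in range(1, 5)]                 # uniform on [0, 1]
gaussian_01 = [Fraction(0), Fraction(1), Fraction(0), Fraction(3)]     # mean 0, sigma 1
exponential = [Fraction(1), Fraction(2), Fraction(6), Fraction(24)]    # exp(-x), x >= 0

for name, moments in [("uniform", uniform_01), ("Gaussian", gaussian_01),
                      ("exponential", exponential)]:
    print(name, skewness_and_kurtosis(moments))

for k in (2, 3, 10):
    print(k, gaussian_two_sided_tail(k))
```

The uniform law gives a negative kurtosis (flat top, no tails), the Gaussian gives zero, and the one-sided exponential gives a positive skewness and kurtosis; the tail probabilities fall off extremely fast with the number of standard deviations.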
magnitude of the large (positive or negative) fluctuations of x. For instance, the
probability to draw a number larger than x decreases as P_>(x) = (A_+/x)^μ for
large positive x.
One can of course in principle observe Pareto tails with μ ≥ 2; but those tails
do not correspond to the asymptotic behaviour of a Lévy distribution.
In full generality, Lévy distributions are characterized by an asymmetry parameter
defined as β ≡ (A_+^μ − A_−^μ)/(A_+^μ + A_−^μ), which measures the relative weight
of the positive and negative tails. We shall mostly focus in the following on the
symmetric case β = 0. The fully asymmetric case (β = 1) is also useful to describe
strictly positive random variables, such as, for example, the time during which the
price of an asset remains below a certain value, etc.
An important consequence of Eq. (1.14) with μ ≤ 2 is that the variance of a
Lévy distribution is formally infinite: the probability density does not decay fast
enough for the integral, Eq. (1.6), to converge. In the case μ ≤ 1, the distribution
density decays so slowly that even the mean, or the MAD, fail to exist.¹¹ The
scale of the fluctuations, defined by the width of the distribution, is always set by
A = A_+ = A_−.
There is unfortunately no simple analytical expression for symmetric Lévy
distributions L_μ(x), except for μ = 1, which corresponds to a Cauchy distribution
(or 'Lorentzian'):
...
where a_μ is a certain constant, proportional to the tail parameter A^μ.¹² It is thus
clear that in the limit μ = 2, one recovers the definition of a Gaussian. When
μ decreases from 2, the distribution becomes more and more sharply peaked
around the origin and fatter in its tails, while 'intermediate' events lose weight
(Fig. 1.4). These distributions thus describe 'intermittent' phenomena, very often
small, sometimes gigantic.
Note finally that Eq. (1.20) does not define a probability distribution when μ >
2, because its inverse Fourier transform is not everywhere positive.
In the case β ≠ 0, one would have:
...

Fig. 1.4. Shape of the symmetric Lévy distributions with μ = 0.8, 1.2, 1.6 and 2 (this
last value actually corresponds to a Gaussian). The smaller μ, the sharper the 'body' of the
distribution, and the fatter the tails, as illustrated in the inset.

It is important to notice that while the leading asymptotic term for large x is
given by Eq. (1.18), there are subleading terms which can be important for finite x.
The full asymptotic series actually reads:
...
The presence of the subleading terms may lead to a bad empirical estimate of
the exponent μ based on a fit of the tail of the distribution. In particular, the
'apparent' exponent which describes the function L_μ for finite x is larger than
μ, and decreases towards μ for x → ∞, but more and more slowly as μ gets
nearer to the Gaussian value μ = 2, for which the power-law tails no longer exist.
Note however that one also often observes empirically the opposite behaviour, i.e.
an apparent Pareto exponent which grows with x. This arises when the Pareto
distribution, Eq. (1.18), is only valid in an intermediate regime x ≪ 1/α, beyond
which the distribution decays exponentially, say as exp(−αx). The Pareto tail is
then 'truncated' for large values of x, and this leads to an effective μ which grows
with x.
An interesting generalization of the Lévy distributions which accounts for this
exponential cut-off is given by the 'truncated Lévy distributions' (TLD), which will
be of much use in the following. A simple way to alter the characteristic function
Eq. (1.20) to account for an exponential cut-off for large arguments is to set:¹³
...
for 1 ≤ μ ≤ 2. The above form reduces to Eq. (1.20) for α = 0. Note that the
argument in the exponential can also be written as:
...
kind. For x small compared to x0, P_H(x) behaves as a Gaussian although its
asymptotic behaviour for x ≫ x0 is fatter and reads exp(−α|x|).
From the characteristic function
...
we can compute the variance
...

¹¹ The median and the most probable value however still exist. For a symmetric Lévy distribution, the most
probable value defines the so-called 'localization' parameter m.
¹² For example, when 1 < μ < 2, A^μ = μΓ(μ − 1) sin(πμ/2) a_μ/π.

Fig. 1.5. Probability density for the truncated Lévy (μ = 3/2), Student and hyperbolic
distributions. All three have two free parameters which were fixed to have unit variance
and kurtosis. The inset shows a blow-up of the tails where one can see that the Student
distribution has tails similar to (but slightly thicker than) those of the truncated Lévy.

The law of large numbers tells us that an event which has a probability p of
occurrence appears on average Np times on a series of N observations. One thus
expects to observe events which have a probability of at least 1/N. It would be
surprising to encounter an event which has a probability much smaller than 1/N.
The order of magnitude of the largest event, Λ_max, observed in a series of N
independent identically distributed (iid) random variables is thus given by:
\[ \mathcal{P}_>(\Lambda_{\max}) = 1/N. \qquad (1.34) \]
More precisely, the full probability distribution of the maximum value x_max =
max_{i=1,N}{x_i}, is relatively easy to characterize; this will justify the above simple
criterion Eq. (1.34). The cumulative distribution P(x_max < Λ) is obtained by
noticing that if the maximum of all x_i's is smaller than Λ, all of the x_i's must
be smaller than Λ. If the random variables are iid, one finds:
\[ \mathcal{P}(x_{\max} < \Lambda) = [\mathcal{P}_<(\Lambda)]^N. \qquad (1.35) \]
Note that this result is general, and does not rely on a specific choice for P(x).
When Λ is large, it is useful to use the following approximation:
\[ \mathcal{P}(x_{\max} < \Lambda) = [1 - \mathcal{P}_>(\Lambda)]^N \simeq e^{-N \mathcal{P}_>(\Lambda)}. \qquad (1.36) \]
Since we now have a simple formula for the distribution of x_max, one can invert
...
More generally, the value Λ_p which is greater than x_max with probability p is given
by
\[ \mathcal{P}_>(\Lambda_p) \simeq -\frac{\log p}{N}. \qquad (1.38) \]
The quantity Λ_max defined by Eq. (1.34) above is thus such that p = 1/e ≃ 0.37.
The probability that x_max is even larger than Λ_max is thus 63%. As we shall now
show, Λ_max also corresponds, in many cases, to the most probable value of x_max.
Equation (1.38) will be very useful in Chapter 3 to estimate a maximal potential
loss within a certain confidence level. For example, the largest daily loss Λ
expected next year, with 95% confidence, is defined such that P_<(−Λ) =
−log(0.95)/250, where P_< is the cumulative distribution of daily price changes,
and 250 is the number of market days per year.
Interestingly, the distribution of x_max only depends, when N is large, on the
asymptotic behaviour of the distribution of x, P(x), when x → ∞. For example,
if P(x) behaves as an exponential when x → ∞, or more precisely if P_>(x) ∼
exp(−αx), one finds:
\[ \Lambda_{\max} = \frac{\log N}{\alpha}, \]
which grows very slowly with N.¹⁴ Setting x_max = Λ_max + (u/α), one finds that
the deviation u around Λ_max is distributed according to the Gumbel distribution:
\[ P(u) = e^{-e^{-u}}\, e^{-u}. \qquad (1.40) \]
The most probable value of this distribution is u = 0.¹⁵ This shows that Λ_max
is the most probable value of x_max. The result, Eq. (1.40), is actually much more
general, and is valid as soon as P(x) decreases more rapidly than any power-law
for x → ∞: the deviation between Λ_max (defined as Eq. (1.34)) and x_max is always
distributed according to the Gumbel law, Eq. (1.40), up to a scaling factor in the
definition of u.
The situation is radically different if P(x) decreases as a power-law, cf.
Eq. (1.14). In this case,
...

¹⁴ For example, for a symmetric exponential distribution P(x) = exp(−|x|)/2, the median value of the
maximum of N = 10 000 variables is only 6.3.
¹⁵ This distribution is discussed further in the context of financial risk control in Section 3.1.2, and drawn in
Figure 3.1.
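
The relations between Λ_max, Λ_p and the distribution of the maximum can be illustrated for a purely exponential tail. The sketch below is an assumption-laden example, not the authors' procedure: it takes P_>(x) = exp(−αx) exactly for all x ≥ 0 with α = 1, and the function names are hypothetical.

```python
import math

def lambda_max(n, alpha=1.0):
    """Solve P_>(Lambda) = exp(-alpha * Lambda) = 1/n, i.e. Eq. (1.34)."""
    return math.log(n) / alpha

def prob_max_below(lam, n, alpha=1.0):
    """Exact P(x_max < lam) = [1 - P_>(lam)]^n for n iid exponential variables."""
    return (1.0 - math.exp(-alpha * lam)) ** n

def lambda_p(p, n, alpha=1.0):
    """Level not exceeded by x_max with probability p, from P_>(Lambda_p) = -log(p)/n."""
    return math.log(n / (-math.log(p))) / alpha

def gumbel_density(u):
    """Gumbel density exp(-exp(-u)) * exp(-u) for the rescaled deviation u."""
    return math.exp(-math.exp(-u) - u)

n = 10_000
print(lambda_max(n))                          # grows only as log(n)
print(prob_max_below(lambda_max(n), n))       # close to 1/e
print(prob_max_below(lambda_p(0.95, 250), 250))  # close to 0.95

# The Gumbel density is largest at u = 0 on a coarse grid of test points.
grid = [i / 100.0 for i in range(-300, 301)]
print(max(grid, key=gumbel_density))
```

With N = 250 and p = 0.95, `lambda_p` corresponds to the 'largest daily loss expected next year with 95% confidence' example, under the stated assumption of an exponential tail.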
...
and the typical value of the maximum is given by:
...
The width w_Λ of the distribution is found to be given by:
...
study the position Λ*[n] of the maximum of P_n, and also the width of P_n, defined
from the second derivative of log P_n calculated at Λ*[n]. The calculation simplifies
in the limit where N → ∞, n → ∞, with the ratio n/N fixed. In this limit, one
finds a relation which generalizes Eq. (1.34):
...
then, if P(x) decreases as in Eq. (1.14), one finds, for Λ → ∞,
...

Fig. 1.6. Amplitude versus rank plots. One plots the value of the nth variable Λ[n] as a
function of its rank n. If P(x) behaves asymptotically as a power-law, one obtains a straight
line in log-log coordinates, with a slope equal to −1/μ. For an exponential distribution,
one observes an effective slope which is smaller and smaller as N/n tends to infinity. The
points correspond to synthetic time series of length 5000, drawn according to a power-law
with μ = 3, or according to an exponential. Note that if the axes x and y are interchanged,
then according to Eq. (1.45), one obtains an estimate of the cumulative distribution, P_>.

In order to describe the statistics of future prices of a financial asset, one a
priori needs a distribution density for all possible time intervals, corresponding
to different trading time horizons. For example, the distribution of 5-min price
fluctuations is different from the one describing daily fluctuations, itself different
for the weekly, monthly, etc. variations. But in the case where the fluctuations are
independent and identically distributed (iid), an assumption which is, however,
usually not justified, see Sections 1.7 and 2.4, it is possible to reconstruct the
distributions corresponding to different time scales from the knowledge of that
describing short time scales only. In this context, Gaussians and Lévy distributions
play a special role, because they are stable: if the short time scale distribution is
a stable law, then the fluctuations on all time scales are described by the same
stable law; only the parameters of the stable law must be changed (in particular its
width). More generally, if one sums iid variables, then, independently of the short
time distribution, the law describing long times converges towards one of the stable
laws: this is the content of the 'central limit theorem' (CLT). In practice, however,
this convergence can be very slow and thus of limited interest, in particular if one
is concerned about short time scales.
...
P_1(x_1) P_2(x − x_1), from which one obtains:
\[ P(x, N = 2) = \int P_1(x')\, P_2(x - x')\,dx'. \qquad (1.50) \]
This equation defines the convolution between P_1(x) and P_2(x), which we shall
write P = P_1 ⋆ P_2. The generalization to the sum of N independent random
variables is immediate. If X = X_1 + X_2 + ... + X_N with X_i distributed according
to P_i(x_i), the distribution of X is obtained as:
\[ P(x, N) = \int P_1(x'_1) \cdots P_{N-1}(x'_{N-1})\, P_N(x - x'_1 - \cdots - x'_{N-1}) \prod_{i=1}^{N-1} dx'_i. \qquad (1.51) \]
One thus understands how powerful is the hypothesis that the increments are iid,
i.e. that P_1 = P_2 = ... = P_N. Indeed, according to this hypothesis, one only needs
to know the distribution of increments over a unit time interval to reconstruct that
of increments over an interval of length N: it is simply obtained by convoluting the
elementary distribution N times with itself.
The analytical or numerical manipulations of Eqs (1.50) and (1.51) are much eased by the
use of Fourier transforms, for which convolutions become simple products. The equation
P(x, N = 2) = [P_1 ⋆ P_2](x), reads in Fourier space:
\[ \hat{P}(z, N = 2) = \int e^{iz(x - x' + x')} \int P_1(x')\, P_2(x - x')\,dx'\,dx \equiv \hat{P}_1(z)\, \hat{P}_2(z). \qquad (1.52) \]
In order to obtain the Nth convolution of a function with itself, one should raise its
characteristic function to the power N, and then take its inverse Fourier transform.

1.5.2 Additivity of cumulants and of tail amplitudes
It is clear that the mean of the sum of two random variables (independent or not)
is equal to the sum of the individual means. The mean is thus additive under
convolution. Similarly, if the random variables are independent, one can show that
their variances (when they both exist) are also additive. More generally, all the
cumulants (c_n) of two independent distributions simply add. This follows from
the fact that since the characteristic functions multiply, their logarithms add. The
additivity of cumulants is then a simple consequence of the linearity of derivation.
The cumulants of a given law convoluted N times with itself thus follow the
simple rule c_{n,N} = N c_{n,1}, where the {c_{n,1}} are the cumulants of the elementary
distribution P_1. Since the cumulant c_n has the dimension of X to the power n, its
relative importance is best measured in terms of the normalized cumulants:
\[ \lambda_n^N \equiv \frac{c_{n,N}}{(c_{2,N})^{n/2}} = \frac{c_{n,1}}{(c_{2,1})^{n/2}}\, N^{1-n/2}. \]
The normalized cumulants thus decay with N for n > 2; the higher the cumulant,
the faster the decay: λ_n^N ∝ N^{1−n/2}. The kurtosis κ, defined above as the fourth
normalized cumulant, thus decreases as 1/N. This is basically the content of
the CLT: when N is very large, the cumulants of order > 2 become negligible.
Therefore, the distribution of the sum is only characterized by its first two
cumulants (mean and variance): it is a Gaussian.
Let us now turn to the case where the elementary distribution P_1(x_1) decreases
as a power-law for large arguments x_1 (cf. Eq. (1.14)), with a certain exponent
μ. The cumulants of order higher than μ are thus divergent. By studying the
small z singular expansion of the Fourier transform of P(x, N), one finds that
the above additivity property of cumulants is bequeathed to the tail amplitudes A_±^μ:
the asymptotic behaviour of the distribution of the sum P(x, N) still behaves as a
power-law (which is thus conserved by addition for all values of μ, provided one
takes the limit x → ∞ before N → ∞, see the discussion in Section 1.6.3), with
a tail amplitude given by:
\[ A_{\pm,N}^{\mu} = N A_{\pm}^{\mu}. \]
...
laws coincide:
\[ P(x, N)\,dx = P_1(x_1)\,dx_1 \quad \text{where} \quad x = a_N x_1 + b_N. \qquad (1.55) \]
The distribution of increments on a certain time scale (week, month, year) is thus
scale invariant, provided the variable X is properly rescaled. In this case, the
chart giving the evolution of the price of a financial asset as a function of time
has the same statistical structure, independently of the chosen elementary time
scale; only the average slope and the amplitude of the fluctuations are different.
These charts are then called self-similar, or, using a better terminology introduced
by Mandelbrot, self-affine (Figs 1.7 and 1.8).
The family of all possible stable laws coincide (for continuous variables) with
the Lévy distributions defined above,¹⁸ which include Gaussians as the special
case μ = 2. This is easily seen in Fourier space, using the explicit shape of
the characteristic function of the Lévy distributions. We shall specialize here for
simplicity to the case of symmetric distributions P_1(x_1) = P_1(−x_1), for which
the translation factor is zero (b_N ≡ 0). The scale parameter is then given by
a_N = N^{1/μ},¹⁹ and one finds, for μ < 2:
...
where A = A_+ = A_−. In words, the above equation means that the order of
magnitude of the fluctuations on 'time' scale N is a factor N^{1/μ} larger than the
fluctuations on the elementary time scale. However, once this factor is taken into
account, the probability distributions are identical. One should notice that the smaller
the value of μ, the faster the growth of fluctuations with time.
The tail parameter thus plays the role, for powerlaw variables, of a generalized 1.6 Central limit theorem
cumulant.
U'e have thus seen that the stable laws (Gaussian and Levy distributions) are 'fixed
points' of the convolution operation. These fixed points are actually also attracrors,
. . in the sense that any distribution convoluted with itself a large number of times
1.5.3 Stable dishihutinns and selfsiimilariQ
finally converges towards a stable law (apart from some very pathological cases).
If one adds random variables distributed according to an arbitrary law P1(xl), Said differently, the limit distribution of the sum of a large number of random
one constructs a random variable which has, in general, a different probability variables is a stable law. The precise formulation of this result is known as the
distribution ( P ( x , N) = [ ~ ~ ( x ~ However,
f r ~ ) , for certain special distributions, central limit rheorem (CLT).
the law of the sum has exactly the same shape as the elementary distribution  these
are called stable laws. The fact that two distributions have the 'same shape' means IS. For discrete variables, one should also add the Poisson distribution EP.(1.27).
that one can find a (Ndependent) translation and dilation of x such that the two l 9 The case fi = 1 is special and iwolves exna logarrthmlc factors.
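The rule \(c_{n,N} = N c_{n,1}\) of Section 1.5.2 implies that the kurtosis of a sum of N iid variables decays as 1/N. The following is a minimal numerical sketch of that statement, assuming exponential increments (excess kurtosis 6) chosen purely for illustration; the function names, sample size and seed are ours and not part of the text.

```python
import numpy as np


def excess_kurtosis(samples):
    """Fourth normalized cumulant: <(x - m)^4> / sigma^4 - 3."""
    centred = samples - samples.mean()
    variance = np.mean(centred**2)
    return np.mean(centred**4) / variance**2 - 3.0


def kurtosis_of_sum(n_terms, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the kurtosis of a sum of n_terms iid
    exponential variables (illustrative choice of elementary law)."""
    rng = np.random.default_rng(seed)
    increments = rng.exponential(scale=1.0, size=(n_samples, n_terms))
    return excess_kurtosis(increments.sum(axis=1))


def predicted_kurtosis(n_terms, kappa_1=6.0):
    """Additivity of cumulants: kappa_N = kappa_1 / N."""
    return kappa_1 / n_terms
```

Calling `kurtosis_of_sum(n)` for n = 1, 4, 16 and comparing with `predicted_kurtosis(n)` should show the estimates tracking 6/N within sampling noise.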
Fig. 1.7. Example of a self-affine function, obtained by summing random variables. One plots the sum x as a function of the number of terms N in the sum, for a Gaussian elementary distribution \(P_1(x_1)\). Several successive 'zooms' reveal the self-similar nature of the function, here with \(a_N = N^{1/2}\).

Fig. 1.8. In this case, the elementary distribution \(P_1(x_1)\) decreases as a power-law with an exponent \(\mu = 1.5\). The scale factor is now given by \(a_N = N^{2/3}\). Note that, contrarily to the previous graph, one clearly observes the presence of sudden 'jumps', which reflect the existence of very large values of the elementary increment \(x_1\).
1.6.1 Convergence to a Gaussian

The classical formulation of the CLT deals with sums of iid random variables of finite variance \(\sigma^2\) towards a Gaussian. In a more precise way, the result is then the following:

\[ \lim_{N\to\infty} \mathcal{P}\left(u_1 \le \frac{x - mN}{\sigma\sqrt{N}} \le u_2\right) = \int_{u_1}^{u_2} \frac{du}{\sqrt{2\pi}}\, e^{-u^2/2}, \]

for all finite \(u_1\), \(u_2\).

from the Gaussian prediction; but the weight of these non-Gaussian regions tends to zero when N goes to infinity. The CLT only concerns the central region, which keeps a finite weight for N large: we shall come back in detail to this point below.

The main hypotheses ensuring the validity of the Gaussian CLT are the following:

of \(\sigma\sqrt{N}\) around the mean value of X. The actual width of the region where the Gaussian turns out to be a good approximation for large finite N crucially depends on the elementary distribution \(P_1(x_1)\). This problem will be explored in Section 1.6.3. Roughly speaking, this region is of width \(\sim N^{3/4}\sigma\) for 'narrow' symmetric elementary distributions, such that all even moments are finite. This region is however sometimes of much smaller extension: for example, if \(P_1(x_1)\) has power-law tails with \(\mu > 2\) (such that \(\sigma\) is finite), the Gaussian 'realm' grows barely faster than \(\sqrt{N}\) (as \(\sim\sqrt{N \log N}\)).

The above formulation of the CLT requires the existence of a finite variance. This condition can be somewhat weakened to include some 'marginal' distributions such as a power-law with \(\mu = 2\). In this case the scale factor is not \(a_N = \sqrt{N}\) but rather \(a_N = \sqrt{N \ln N}\). However, as we shall discuss in the next section, elementary distributions which decay more slowly than \(|x|^{-3}\) do not belong to the Gaussian basin of attraction. More precisely, the necessary and sufficient condition for \(P_1(x_1)\) to belong to this basin is that:

\[ \lim_{u\to\infty} u^2\, \frac{P_{1<}(-u) + P_{1>}(u)}{\int_{|u'| < u} u'^2 P_1(u')\,du'} = 0. \]

This condition is always satisfied if the variance is finite, but allows one to include the marginal cases such as a power-law with \(\mu = 2\).

It is important to realize that the convolution operation is 'information burning', since all the details of the elementary distribution \(P_1(x_1)\) progressively disappear while the Gaussian distribution emerges.

Note that entropy is defined up to an additive constant. It is common to add 1 to the above definition.

1.6.2 Convergence to a Lévy distribution

Let us now turn to the case of the sum of a large number N of iid random variables, asymptotically distributed as a power-law with \(\mu < 2\), and with a tail amplitude \(A^\mu = A_+^\mu = A_-^\mu\) (cf. Eq. (1.14)). The variance of the distribution is thus infinite. The limit distribution for large N is then a stable Lévy distribution of exponent \(\mu\) and with a tail amplitude \(N A^\mu\). If the positive and negative tails of the elementary distribution \(P_1(x_1)\) are characterized by different amplitudes (\(A_-^\mu\) and \(A_+^\mu\)) one then obtains an asymmetric Lévy distribution with parameter \(\beta = (A_+^\mu - A_-^\mu)/(A_+^\mu + A_-^\mu)\). If the 'left' exponent is different from the 'right' exponent (\(\mu_- \ne \mu_+\)), then the smallest of the two wins and one finally obtains a totally asymmetric Lévy distribution (\(\beta = -1\) or \(\beta = 1\)) with exponent \(\mu = \min(\mu_-, \mu_+)\). The CLT generalized to Lévy distributions applies with the same precautions as in the Gaussian case above.
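The statement that the sum of N power-law variables keeps a power-law tail of amplitude \(N A^\mu\) can be probed by simulation. The sketch below is only illustrative: it assumes a one-sided Pareto law with \(P_>(x) = x^{-\mu}\) for \(x \ge 1\) (so that \(A^\mu = 1\)), and the exponent, threshold, sample size and seed are arbitrary choices of ours.

```python
import numpy as np


def pareto_samples(rng, mu, size):
    """Inverse-transform sampling of P_>(x) = x**(-mu), x >= 1.
    1 - rng.random() lies in (0, 1], which avoids a division by zero."""
    return (1.0 - rng.random(size)) ** (-1.0 / mu)


def tail_ratio(mu=1.5, n_terms=4, threshold=1000.0,
               n_samples=1_000_000, seed=1):
    """Empirical P(sum > threshold) divided by the prediction
    n_terms * threshold**(-mu) obtained by adding tail amplitudes."""
    rng = np.random.default_rng(seed)
    sums = pareto_samples(rng, mu, (n_samples, n_terms)).sum(axis=1)
    empirical = np.mean(sums > threshold)
    predicted = n_terms * threshold ** (-mu)
    return empirical / predicted
```

A ratio close to 1 for a threshold far beyond the typical value of the sum is what the additivity of tail amplitudes leads one to expect; the agreement degrades if the threshold is brought closer to the bulk.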
which according to the CLT tends towards a Gaussian variable of zero mean and unit variance. Hence, for any fixed u, one has:

\[ \lim_{N\to\infty} P_>(u) = P_{G>}(u), \qquad (1.67) \]

where \(P_{G>}(u)\) is related to the error function, and describes the weight contained in the tails of the Gaussian:

\[ P_{G>}(u) = \int_u^\infty \frac{du'}{\sqrt{2\pi}} \exp(-u'^2/2) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{u}{\sqrt{2}}\right). \qquad (1.68) \]

However, the above convergence is not uniform. The value of N such that the approximation \(P_>(u) \simeq P_{G>}(u)\) becomes valid depends on u. Conversely, for fixed N, this approximation is only valid for u not too large: \(|u| \ll u_0(N)\).

One can estimate \(u_0(N)\) in the case where the elementary distribution \(P_1(x_1)\) is 'narrow', that is, decreasing faster than any power-law when \(|x_1| \to \infty\), such that all the moments are finite. In this case, all the cumulants of \(P_1\) are finite and one can obtain a systematic expansion in powers of \(N^{-1/2}\) of the difference \(\Delta P_>(u) \equiv P_>(u) - P_{G>}(u)\):

This shows that the central region has an extension growing as \(N^{2/3}\).

A symmetric elementary distribution is such that \(\lambda_3 \equiv 0\); it is then the kurtosis \(\kappa = \lambda_4\) that fixes the first correction to the Gaussian when N is large, and thus the extension of the central region. The conditions now read: \(N \gg N^* = \lambda_4\) and

The central region now extends over a region of width \(N^{3/4}\).

The results of the present section do not directly apply if the elementary distribution \(P_1(x_1)\) decreases as a power-law ('broad distribution'). In this case, some of the cumulants are infinite and the above cumulant expansion, Eq. (1.69), is meaningless. In the next section, we shall see that in this case the 'central' region is much more restricted than in the case of 'narrow' distributions. We shall then describe in Section 1.6.5 the case of 'truncated' power-law distributions, where the above conditions become asymptotically relevant. These laws however may have a very large kurtosis, which depends on the point where the truncation becomes noticeable, and the above condition \(N \gg \lambda_4\) can be hard to satisfy.

The above arguments can actually be made fully rigorous, see [Feller].

Cramér function

More generally, when N is large, one can write the distribution of the sum of N iid random variables as:22

\[ P(x, N) \underset{N\to\infty}{\simeq} \exp\left[-N S\left(\frac{x}{N}\right)\right], \qquad (1.74) \]

where S is the so-called Cramér function, which gives some information about the probability of X even outside the 'central' region. When the variance is finite, S grows as \(S(u) \propto u^2\) for small u's, which again leads to a Gaussian central region. For finite u, S can be computed using Laplace's saddle point method, valid for N large. By definition:

When N is large, the above integral is dominated by the neighbourhood of the point z* where the term in the exponential is stationary. The results can be written as:

22 We assume that their mean is zero, which can always be achieved through a suitable shift of \(x_1\).

computation shows that the above distribution is correctly normalized, has a mean given by \(m = \alpha^{-1}\) and a variance given by \(\sigma^2 = \alpha^{-2}\). Furthermore, the exponential distribution is asymmetrical: its skewness is given by \(c_3 = \langle (x - m)^3 \rangle = 2\alpha^{-3}\), or \(\lambda_3 = 2\).

The sum of N such variables is distributed according to the Nth convolution of the exponential distribution. According to the CLT this distribution should approach a Gaussian of mean mN and of variance \(N\sigma^2\). The Nth convolution of the exponential distribution can be computed exactly. The result is:23

\[ P(x, N) = \Theta(x)\, \alpha^N\, \frac{x^{N-1} e^{-\alpha x}}{(N-1)!}, \]

which is called a 'Gamma' distribution of index N. At first sight, this distribution does not look very much like a Gaussian! For example, its asymptotic behaviour is very far from that of a Gaussian: the 'left' side is strictly zero for negative x, while the 'right' tail is exponential, and thus much fatter than the Gaussian. It is thus very clear that the CLT does not apply for values of x too far from the mean value. However, the central region around \(Nm = N\alpha^{-1}\) is well described by a Gaussian. The most probable value (x*) is defined as:

\[ \left.\frac{d}{dx}\, x^{N-1} e^{-\alpha x}\right|_{x^*} = 0, \quad \text{that is} \quad x^* = \frac{N-1}{\alpha}. \]

23 This result can be shown by induction using the definition (Eq. (1.50)).
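The contrast between the Gamma distribution of index N and its Gaussian approximation can be made concrete with a few numbers. The sketch below evaluates both densities for N = 100 and \(\alpha = 1\) (illustrative values of ours); the function names are not from the text.

```python
import math


def gamma_pdf(x, n, alpha=1.0):
    """Nth convolution of the exponential law:
    Theta(x) * alpha**n * x**(n-1) * exp(-alpha*x) / (n-1)!"""
    if x <= 0.0:
        return 0.0
    # math.lgamma(n) = log((n-1)!), which avoids overflow for large n
    log_p = (n * math.log(alpha) + (n - 1) * math.log(x)
             - alpha * x - math.lgamma(n))
    return math.exp(log_p)


def gaussian_pdf(x, n, alpha=1.0):
    """Gaussian of mean n/alpha and variance n/alpha**2 (CLT prediction)."""
    mean, var = n / alpha, n / alpha**2
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def ratio(x, n, alpha=1.0):
    """Exact density divided by its Gaussian approximation."""
    return gamma_pdf(x, n, alpha) / gaussian_pdf(x, n, alpha)
```

With N = 100, `ratio(100.0, 100)` is within a fraction of a per cent of 1, whereas six standard deviations above the mean `ratio(160.0, 100)` is of order 100: the Gaussian is accurate in the centre and badly underestimates the exponential right tail.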
to subleading terms in \(x^{N-1}\)) with the asymptotic behaviour of the elementary distribution \(P_1(x_1)\).

Another very instructive example is provided by a distribution which behaves as a power-law for large arguments, but at the same time has a finite variance to ensure the validity of the CLT. Consider the following explicit example of a Student distribution with \(\mu = 3\):

Note that the \(|z|^3\) singularity (which signals the divergence of the moments \(m_n\) for \(n \ge 3\)) does not disappear under convolution, even if at the same time P(x, N) converges towards the Gaussian. The resolution of this apparent paradox is again that the convergence towards the Gaussian only concerns the centre of the distribution, whereas the tail in \(x^{-4}\) survives for ever (as was mentioned in Section 1.5.3).

As follows from the CLT, the centre of P(x, N) is well approximated, for N large, by a Gaussian of zero mean and variance \(N\sigma^2\):

On the other hand, since the power-law behaviour is conserved upon addition and the tail amplitudes simply add (cf. Eq. (1.14)), one also has, for large x's:

The above two expressions Eqs (1.88) and (1.89) are not incompatible, since these describe two very different regions of the distribution P(x, N). For fixed N, there is a characteristic value \(x_0(N)\) beyond which the Gaussian approximation for P(x, N) is no longer accurate, and the distribution is described by its asymptotic power-law regime. The order of magnitude of \(x_0(N)\) is fixed by looking at the point where the two regimes match to one another:

which indeed goes to zero for large N.

The above arguments are not special to the case \(\mu = 3\) and in fact apply more generally, as long as \(\mu > 2\), i.e. when the variance is finite. In the general case, one finds that the CLT is valid in the region \(|x| \ll x_0 \propto \sqrt{N \log N}\), and that the weight of the non-Gaussian tails is given by:

which tends to zero for large N. However, one should notice that as \(\mu\) approaches the 'dangerous' value \(\mu = 2\), the weight of the tails becomes more and more important. For \(\mu < 2\), the whole argument collapses since the weight of the tails would grow with N. In this case, however, the convergence is no longer towards the Gaussian, but towards the Lévy distribution of exponent \(\mu\).

1.6.5 Truncated Lévy distributions

An interesting case is when the elementary distribution \(P_1(x_1)\) is a truncated Lévy distribution (TLD) as defined in Section 1.3.3. The first cumulants of the distribution defined by Eq. (1.23) read, for \(1 < \mu < 2\):
where \(\kappa_0\) is the kurtosis of the variable \(\epsilon\), and \(g(\ell)\) the correlation function of the variance, defined as:

It is interesting to see that for N = 1, the above formula gives \(\kappa_1 = \kappa_0 + (3 + \kappa_0)\,g(0) > \kappa_0\), which means that even if \(\kappa_0 = 0\), a fluctuating volatility is enough to produce some kurtosis. More importantly, one sees that if the variance correlation function \(g(\ell)\) decays with \(\ell\), the kurtosis \(\kappa_N\) tends to zero with N, thus showing that the sum indeed converges towards a Gaussian variable. For example, if \(g(\ell)\) decays as a power-law \(\ell^{-\nu}\) for large \(\ell\), one finds that for large N:

\[ \kappa_N \propto \frac{1}{N} \ \ \text{for} \ \ \nu > 1; \qquad \kappa_N \propto \frac{1}{N^{\nu}} \ \ \text{for} \ \ \nu < 1. \qquad (1.105) \]

Note that for \(i \ne j\) this correlation function can be zero either because \(\sigma\) is identically equal to a certain value \(\sigma_0\), or because the fluctuations of \(\sigma\) are completely uncorrelated from one time to the next.

where \(\delta\) is the Dirac function. We shall also need the so-called 'resolvent' \(G(\lambda)\) of the matrix H, defined as:

\[ G_{ij}(\lambda) = \left(\frac{1}{\lambda \mathbb{1} - H}\right)_{ij}, \]

where \(\mathbb{1}\) is the identity matrix. The trace of \(G(\lambda)\) can be expressed using the eigenvalues of H as:

\[ \mathrm{Tr}\, G(\lambda) = \sum_{\alpha=1}^{M} \frac{1}{\lambda - \lambda_\alpha}. \]

The 'trick' that allows one to calculate \(\rho(\lambda)\) in the large M limit is the following representation of the \(\delta\) function:

\[ \frac{1}{x - i\epsilon} = PP\,\frac{1}{x} + i\pi\delta(x) \qquad (\epsilon \to 0), \qquad (1.109) \]

where PP means the principal part. Therefore, \(\rho(\lambda)\) can be expressed as:

\[ \rho(\lambda) = \lim_{\epsilon \to 0} \frac{1}{M\pi}\, \Im\left(\mathrm{Tr}\, G(\lambda - i\epsilon)\right). \qquad (1.110) \]
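Equation (1.110) can be read numerically: at finite \(\epsilon\), the imaginary part of the trace of the resolvent replaces each eigenvalue by a Lorentzian of width \(\epsilon\), and the sum is a smoothed density normalized to one. The following sketch illustrates this for an arbitrary list of eigenvalues; the function name and the test values are ours.

```python
import numpy as np


def density_from_resolvent(lam, eigenvalues, eps):
    """Smoothed density Im Tr G(lam - i*eps) / (M*pi),
    with Tr G(z) = sum_a 1 / (z - lam_a)."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    # Broadcast the evaluation points against the eigenvalues
    z = np.asarray(lam, dtype=float)[..., None] - 1j * eps
    trace_g = np.sum(1.0 / (z - eigenvalues), axis=-1)
    return trace_g.imag / (eigenvalues.size * np.pi)


# Illustrative spectrum with three eigenvalues
example_eigenvalues = np.array([-1.0, 0.0, 2.0])
step = 0.001
grid = np.arange(-50.0, 50.0, step)
rho = density_from_resolvent(grid, example_eigenvalues, eps=0.01)
total_weight = rho.sum() * step  # Riemann sum of the density
```

As \(\epsilon\) is reduced (with a correspondingly finer grid), the peaks sharpen towards the \(\delta\) functions located at the eigenvalues while the total weight stays equal to one.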
Our task is therefore to obtain an expression for the resolvent \(G(\lambda)\). This can be done by establishing a recursion relation, allowing one to compute \(G(\lambda)\) for a matrix H with one extra row and one extra column, the elements of which being \(H_{0i}\). One then computes \(G_{00}^{M+1}(\lambda)\) (the superscript stands for the size of the matrix H) using the standard formula for matrix inversion:

\[ G_{00}^{M+1}(\lambda) = \frac{\mathrm{minor}(\lambda\mathbb{1} - H)_{00}}{\det(\lambda\mathbb{1} - H)}. \]

Now, one expands the determinant appearing in the denominator in minors along the first row, and then each minor is itself expanded in subminors along their first column. After a little thought, this finally leads to the following expression for \(G_{00}^{M+1}(\lambda)\):

\[ \frac{1}{G_{00}^{M+1}(\lambda)} = \lambda - H_{00} - \sum_{i,j=1}^{M} H_{0i} H_{0j}\, G_{ij}^{M}(\lambda). \qquad (1.112) \]

This relation is general, without any assumption on the \(H_{ij}\). Now, we assume that the \(H_{ij}\)'s are iid random variables, of zero mean and variance equal to \(\langle H_{ij}^2 \rangle = \sigma^2/M\). This scaling with M can be understood as follows: when the matrix H acts on a certain vector, each component of the image vector is a sum of M random variables. In order to keep the image vector (and thus the corresponding eigenvalue) finite when \(M \to \infty\), one should scale the elements of the matrix with the factor \(1/\sqrt{M}\).

One could also write a recursion relation for \(G_{0i}^{M+1}\), and establish self-consistently that \(G_{ij} \sim 1/\sqrt{M}\) for \(i \ne j\). On the other hand, due to the diagonal term \(\lambda\), \(G_{ii}\) remains finite for \(M \to \infty\). This scaling allows us to discard all the terms with \(i \ne j\) in the sum appearing in the right-hand side of Eq. (1.112). Furthermore, since \(H_{00} \sim 1/\sqrt{M}\), this term can be neglected compared to \(\lambda\). This finally leads to a simplified recursion relation, valid in the limit \(M \to \infty\):

\[ \frac{1}{G_{00}^{M+1}(\lambda)} \simeq \lambda - \sum_{i=1}^{M} H_{0i}^2\, G_{ii}^{M}(\lambda). \]

Now, using the CLT, we know that the last sum converges, for large M, towards \(\frac{\sigma^2}{M} \sum_{i=1}^{M} G_{ii}^{M}(\lambda)\). This result is independent of the precise statistics of the \(H_{0i}\), provided their variance is finite. This shows that \(G_{00}\) converges for large M towards a well-defined limit \(G_\infty\), which obeys the following limit equation:

\[ \frac{1}{G_\infty(\lambda)} = \lambda - \sigma^2 G_\infty(\lambda). \]

The solution to this second-order equation reads:

\[ G_\infty(\lambda) = \frac{1}{2\sigma^2}\left[\lambda - \sqrt{\lambda^2 - 4\sigma^2}\right]. \]

(The correct solution is chosen to recover the right limit for \(\sigma = 0\).) Now, the only way for this quantity to have a non-zero imaginary part when one adds to \(\lambda\) a small imaginary term \(i\epsilon\) which tends to zero is that the square root itself is imaginary. The final result for the density of eigenvalues is therefore:

\[ \rho(\lambda) = \frac{1}{2\pi\sigma^2} \sqrt{4\sigma^2 - \lambda^2} \quad \text{for} \quad |\lambda| \le 2\sigma, \]

and zero elsewhere. This is the well-known 'semi-circle' law for the density of states, first derived by Wigner. This result can be obtained by a variety of other methods if the distribution of matrix elements is Gaussian.

In finance, one often encounters correlation matrices C, which have the special property of being positive definite. C can be written as \(C = HH^T\), where \(H^T\) is the matrix transpose of H. In general, H is a rectangular matrix of size \(M \times N\), so C is \(M \times M\). In Chapter 2, M will be the number of assets, and N the number of observations (days). In the particular case where N = M, the eigenvalues of C are simply obtained from those of H by squaring them:

\[ \lambda_C = \lambda_H^2. \qquad (1.117) \]

If one assumes that the elements of H are random variables, the density of eigenvalues of C can be obtained from:

\[ \rho(\lambda_C)\, d\lambda_C = 2\rho(\lambda_H)\, d\lambda_H \quad \text{for} \quad \lambda_H > 0, \qquad (1.118) \]

where the factor of 2 comes from the two solutions \(\lambda_H = \pm\sqrt{\lambda_C}\); this then leads to:

\[ \rho(\lambda_C) = \frac{1}{2\pi\sigma^2} \sqrt{\frac{4\sigma^2 - \lambda_C}{\lambda_C}} \quad \text{for} \quad 0 \le \lambda_C \le 4\sigma^2, \qquad (1.119) \]

and zero elsewhere. For \(N \ne M\), a similar formula exists, which we shall use in the following. In the limit \(N, M \to \infty\), with a fixed ratio \(Q = N/M \ge 1\), one has:

equivalently \(\sigma^2\) is the average eigenvalue of C. This form is actually also valid for Q < 1, except that there appears a finite fraction of strictly zero eigenvalues, of weight 1 - Q (Fig. 1.10).

The most important features predicted by Eq. (1.120) are:

• The fact that the lower 'edge' of the spectrum is positive (except for Q = 1); there is therefore no eigenvalue between 0 and \(\lambda_{\min}\). Near this edge, the density of eigenvalues exhibits a sharp maximum, except in the limit Q = 1 (\(\lambda_{\min} = 0\)) where it diverges as \(\sim 1/\sqrt{\lambda}\).
• The density of eigenvalues also vanishes above a certain upper edge \(\lambda_{\max}\).

Note that all the above results are only valid in the limit \(N \to \infty\). For finite N, the singularities present at both edges are smoothed: the edges become somewhat blurred, with a small probability of finding eigenvalues above \(\lambda_{\max}\) and below \(\lambda_{\min}\), which goes to zero when N becomes large.30

In Chapter 2, we will compare the empirical distribution of the eigenvalues of the correlation matrix of stocks corresponding to different markets with the theoretical prediction given by Eq. (1.120).

30 See e.g. M. J. Bowick, E. Brézin, Universal scaling of the tails of the density of eigenvalues in random matrix models, Physics Letters, B268, 21 (1991).

must estimate [...]. One finds:

Gathering the different terms and using the definition Eq. (1.121), one finally establishes the following general relation:

(1.124)

or:

1.10 Appendix B: density of eigenvalues for random correlation matrices

This very technical appendix aims at giving a few of the steps of the computation needed to establish Eq. (1.120). One starts from the following representation of the resolvent \(G(\lambda)\):

\[ G(\lambda) = \sum_\alpha \frac{1}{\lambda - \lambda_\alpha} = \frac{\partial}{\partial\lambda} \log \prod_\alpha (\lambda - \lambda_\alpha) = \frac{\partial}{\partial\lambda} \log\det(\lambda\mathbb{1} - C) \equiv \frac{\partial}{\partial\lambda} Z(\lambda). \]
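The representation of \(G(\lambda)\) as the derivative of \(\log\det(\lambda\mathbb{1} - C)\) is easy to check numerically on a random correlation matrix, and the same matrix can be used to look at the edges of its spectrum. The sketch below is a sanity check under stated assumptions: the edges \(\sigma^2(1 \pm 1/\sqrt{Q})^2\) are those of the standard Marchenko-Pastur law, which we assume to be the content of Eq. (1.120); the matrix sizes, seed and function names are illustrative choices of ours.

```python
import numpy as np


def correlation_matrix(m, n, sigma=1.0, seed=0):
    """C = H H^T for an M x N matrix H of iid Gaussian entries
    with zero mean and variance sigma**2 / N."""
    rng = np.random.default_rng(seed)
    h = rng.normal(scale=sigma / np.sqrt(n), size=(m, n))
    return h @ h.T


def resolvent_trace(c, lam):
    """G(lam) = sum over eigenvalues of 1 / (lam - lam_a)."""
    return np.sum(1.0 / (lam - np.linalg.eigvalsh(c)))


def dlogdet(c, lam, step=1e-5):
    """Central finite difference of Z(lam) = log det(lam * 1 - C)."""
    eye = np.eye(c.shape[0])
    z_plus = np.linalg.slogdet((lam + step) * eye - c)[1]
    z_minus = np.linalg.slogdet((lam - step) * eye - c)[1]
    return (z_plus - z_minus) / (2.0 * step)


def mp_edges(q, sigma=1.0):
    """Edges of the standard Marchenko-Pastur spectrum (assumed form):
    sigma**2 * (1 -/+ 1/sqrt(q))**2."""
    root = 1.0 / np.sqrt(q)
    return sigma**2 * (1.0 - root) ** 2, sigma**2 * (1.0 + root) ** 2
```

For instance, with `c = correlation_matrix(50, 200)` the values `dlogdet(c, 5.0)` and `resolvent_trace(c, 5.0)` should agree to high accuracy, and for `correlation_matrix(200, 800)` (Q = 4) the eigenvalues returned by `np.linalg.eigvalsh` should fall close to the interval given by `mp_edges(4.0)`, with a mean close to \(\sigma^2\).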
Using the following representation for the determinant of a symmetrical matrix A:

The claim is that the quantity \(G(\lambda)\) is self-averaging in the large M limit. So in order to perform the computation we can replace \(G(\lambda)\) for a specific realization of H by the average over an ensemble of H. In fact, one can in the present case show that the average of the logarithm is equal, in the large N limit, to the logarithm of the average. This is not true in general, and one has to use the so-called 'replica trick' to deal with this problem: this amounts to taking the nth power of the quantity to be averaged (corresponding to n copies (replicas) of the system) and to let n go to zero at the end of the computation.

We assume that the \(M \times N\) variables \(H_{ik}\) are iid Gaussian variables with zero mean and variance \(\sigma^2/N\). To perform the average of the \(H_{ik}\), we notice that the integration measure generates a term \(\exp\left(-M \sum H_{ik}^2/(2\sigma^2)\right)\) that combines with the \(H_{ik}H_{jk}\) term above. The summation over the index k doesn't play any role and we get N copies of the result with the index k dropped. The Gaussian integral over \(H_i\) gives the square-root of the ratio of the determinant of \([M\delta_{ij}/\sigma^2]\) and \([M\delta_{ij}/\sigma^2 - \varphi_i\varphi_j]\):

We then introduce a variable \(q \equiv \sigma^2 \sum \varphi_i^2/N\) which we fix using an integral representation of the delta function:

where Q = N/M. The integrals over z and q are performed by the saddle point method, leading to the following equations:

We find \(G(\lambda)\) by differentiating Eq. (1.131) with respect to \(\lambda\). The computation is greatly simplified if we notice that at the saddle point the partial derivatives with respect to the functions \(q(\lambda)\) and \(z(\lambda)\) are zero by construction. One finally finds:

We can now use Eq. (1.110) and take the imaginary part of \(G(\lambda)\) to find the density of eigenvalues:

which is identical to Eq. (1.120).

1.11 References

Introduction to probabilities
W. Feller, An Introduction to Probability Theory and its Applications, Wiley, New York, 1971.

Extreme value statistics
E. J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.
view, the influence of the average return is negligible for short time scales, but becomes crucial on long time scales. Typically, a stock varies by several per cent within a day, but its average return is, say, 10% per year, or 0.04% per day. Now, the 'average return' of a financial asset appears to be unstable in time: the past return of a stock is seldom a good indicator of future returns. Financial time series are intrinsically non-stationary: new financial products appear and influence the markets, trading techniques evolve with time, as does the number of participants and their access to the markets, etc. This means that taking very long historical

where the returns \(\eta_k\) have a fixed variance. This model is actually more commonly used in the financial literature. We will show that reality must be described by an intermediate model, which interpolates between a purely additive model, Eq. (2.1), and a multiplicative model, Eq. (2.2).
In the whole modern financial literature, it is postulated that the relevant variable is not the increment \(\delta x\) itself, but rather the return \(\eta = \delta x/x\). It is therefore interesting to study empirically the variance of \(\delta x\), conditioned to a certain value of the price x itself, which we shall denote \(\langle \delta x^2 \rangle|_x\). If the return \(\eta\) is the natural random variable, one should observe that \(\sqrt{\langle \delta x^2 \rangle|_x} = \sigma_1 x\), where \(\sigma_1\) is constant (and equal to the RMS of \(\eta\)). Now, in many instances (Figs 2.2 and 2.4), one rather finds that \(\sqrt{\langle \delta x^2 \rangle|_x}\) is independent of x, apart from the case of exchange rates between comparable currencies. The case of the CAC 40 is particularly interesting, since during the period 1991-95, the index went from 1400 to 2100, leaving the absolute volatility nearly constant (if anything, it is seen to decrease with x!).
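The conditional RMS \(\sqrt{\langle \delta x^2 \rangle|_x}\) used to discriminate between additive and multiplicative behaviour can be estimated by binning the price axis. The sketch below shows one simple way of doing so, tested on synthetic data only; the binning scheme, function name and parameter values are our own illustrative choices and not the procedure used for the figures.

```python
import numpy as np


def conditional_rms(prices, increments, n_bins=10):
    """RMS of the increments within each price bin.
    Returns (bin centres, RMS per bin)."""
    prices = np.asarray(prices, dtype=float)
    increments = np.asarray(increments, dtype=float)
    edges = np.linspace(prices.min(), prices.max(), n_bins + 1)
    # Interior edges map each price to a bin index in 0 .. n_bins - 1
    index = np.digitize(prices, edges[1:-1])
    centres = 0.5 * (edges[:-1] + edges[1:])
    rms = np.array([np.sqrt(np.mean(increments[index == b] ** 2))
                    for b in range(n_bins)])
    return centres, rms


# Synthetic illustration: prices spread between 50 and 150
rng = np.random.default_rng(0)
x = rng.uniform(50.0, 150.0, size=200_000)
multiplicative = 0.01 * x * rng.standard_normal(x.size)  # delta x = eta * x
additive = 1.0 * rng.standard_normal(x.size)             # delta x independent of x
centres_m, rms_m = conditional_rms(x, multiplicative)
centres_a, rms_a = conditional_rms(x, additive)
```

In the multiplicative case `rms_m / centres_m` is flat and equal to \(\sigma_1\), whereas in the additive case `rms_a` itself is flat; real data would be examined in the same way to see which of the two descriptions applies over the period considered.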

On longer time scales, however, or when the price x rises substantially, the RMS of \(\delta x\) increases significantly, as to become proportional to x (Fig. 2.3). A way to model this crossover from an additive to a multiplicative behaviour is to postulate that the RMS of the increments progressively (over a time scale \(T_\sigma\)) adapts to the changes of price of x. Schematically, for \(T < T_\sigma\), the prices behave additively, whereas for \(T > T_\sigma\), multiplicative effects start playing a significant role:

In the additive regime, where the variance of the increments can be taken as a constant, we shall write \(\langle \delta x^2 \rangle = \sigma_1^2 x_0^2 = D\tau\).

Fig. 2.1. Charts of the studied assets between November 1991 and February 1995. The top chart is the S&P 500, the middle one is the DEM/$, and the bottom one is the long-term German interest rate (Bund).

between 58 and 75 cents (Fig. 2.1 (middle)). Since the interbank settlement prices are not available, we have defined the price as the average between the bid and the ask prices. Finally, the chosen interest rate index is the futures contract on long-term German bonds (Bund), quoted on the London International Financial Futures and Options Exchange (LIFFE). It is typically varying between 85 and 100 points (Fig. 2.1 (bottom)).

There is, on all financial markets, a difference between the bid price and the ask price for a certain asset at a given instant of time. The difference between the two is called the 'bid/ask spread'. The more liquid a market, the smaller the average spread.
Fig. 2.2. RMS of the increments \(\delta x\), conditioned to a certain value of the price x, as a function of x, for the three chosen assets. For the chosen period, only the exchange rate DEM/$ conforms to the idea of a multiplicative model: the straight line corresponds to the best fit \(\langle \delta x^2 \rangle|_x^{1/2} = \sigma_1 x\). The adequacy of the multiplicative model in this case is related to the symmetry $/DEM → DEM/$.

Fig. 2.3. RMS of the increments \(\delta x\), conditioned to a certain value of the price x, as a function of x, for the S&P 500 for the 1985-98 time period.

Fig. 2.4. RMS of the increments \(\delta x\), conditioned to a certain value of the price x, as a function of x, for the CAC 40 index for the 1991-95 period; it is quite clear that during that time period \(\langle \delta x^2 \rangle|_x\) was almost independent of x.
Figure 2.5 shows this correlation function for the three chosen assets, and for \(\tau = 5\) min. For uncorrelated increments, the correlation function should be equal to zero for \(k \ne l\), with an RMS equal to \(\sigma = 1/\sqrt{N}\), where N is the number of independent points used in the computation. Figure 2.5 also shows the \(3\sigma\) error bars. We conclude that beyond 30 min, the two-point correlation function cannot be distinguished from zero. On less liquid markets, however, this correlation time is longer. On the US stock market, for example, this correlation time has significantly decreased between the 1960s and the 1990s.

On very short time scales, however, weak but significant correlations do exist. These correlations are however too small to allow profit making: the potential return is smaller than the transaction costs involved for such a high-frequency trading strategy, even for the operators having direct access to the markets (cf. Section 4.1.2). Conversely, if the transaction costs are high, one may expect significant correlations to exist on longer time scales.

We have performed the same analysis for the daily increments of the three chosen assets (\(\tau = 1\) day). Figure 2.6 reveals that the correlation function is

Fig. 2.5. Normalized correlation function \(C_{kl}\) for the three chosen assets, as a function of the time difference \(|k - l|\tau\), and for \(\tau = 5\) min. Up to 30 min, some weak but significant correlations do exist (of amplitude 0.05). Beyond 30 min, however, the two-point correlations are not statistically significant.

Fig. 2.6. Normalized correlation function \(C_{kl}\) for the three chosen assets, as a function of the time difference \(|k - l|\tau\), now on a daily basis, \(\tau = 1\) day. The two horizontal lines at \(\pm 0.1\) correspond to a \(3\sigma\) error bar. No significant correlations can be measured.
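The comparison of a measured correlation function with the \(3\sigma = 3/\sqrt{N}\) band can be reproduced on synthetic series. The sketch below is illustrative only: it assumes the usual estimator (lagged products of the centred increments divided by their variance), and the white-noise and AR(1) test series, the seed and the function names are our choices.

```python
import numpy as np


def normalized_correlation(increments, max_lag):
    """C(l) = <dx_k dx_{k+l}> / <dx^2> on centred increments, l = 1..max_lag."""
    dx = np.asarray(increments, dtype=float)
    dx = dx - dx.mean()
    variance = np.mean(dx**2)
    return np.array([np.mean(dx[:-lag] * dx[lag:]) / variance
                     for lag in range(1, max_lag + 1)])


def error_bar(n_points, n_sigma=3.0):
    """Width of the band expected for uncorrelated increments."""
    return n_sigma / np.sqrt(n_points)


def ar1_series(phi, n_points, seed=0):
    """Correlated test series x_t = phi * x_{t-1} + noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_points)
    out = np.empty(n_points)
    out[0] = noise[0]
    for t in range(1, n_points):
        out[t] = phi * out[t - 1] + noise[t]
    return out
```

For uncorrelated increments all lags should stay inside the band given by `error_bar`, up to the rare excursions expected statistically, whereas the AR(1) series has \(C(1) \approx \phi\), far outside it.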
Table 2.1. Value of the parameters A and \(\alpha^{-1}\), as obtained by fitting the data with a symmetric TLD \(L_\mu^{(t)}\), of index \(\mu = 3/2\). Note that both A and \(\alpha^{-1}\) have the dimension of a price variation \(\delta x_1\), and therefore directly characterize the nature of the statistical fluctuations. The other columns compare the RMS and the kurtosis of the fluctuations, as directly measured on the data, or via the formulae, Eqs (1.94), (1.95). Note that in the case DEM/$, the studied variable is \(100\,\delta x/x\). In this last case, the fit with \(\mu = 1.5\) is not very good: the calculated kurtosis is found to be too high. A better fit is obtained with \(\mu = 1.2\).

The particular value \(\mu = 3/2\) has a simple theoretical interpretation, which we shall briefly present in Section 2.8.

kth variable (out of N) is then such that:

\[ P_\mu(x_1) P_\mu(x_2) \cdots P_\mu(x_N) \propto e^{N \log\mu + N\mu \log x_0 - (1+\mu) \sum_{i=1}^{N} \log x_i}. \qquad (2.10) \]

The equation fixing \(\mu\) is thus, in this case:

\[ \frac{N}{\mu} + N \log x_0 - \sum_{i=1}^{N} \log x_i = 0, \quad \text{that is} \quad \mu = \frac{N}{\sum_{i=1}^{N} \log(x_i/x_0)}. \]
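Maximizing the exponent of Eq. (2.10) with respect to \(\mu\) gives a closed-form estimate, \(\mu = N/\sum_i \log(x_i/x_0)\), which is straightforward to test on synthetic data. The sketch below assumes a pure power-law density \(\mu x_0^\mu/x^{1+\mu}\) for \(x \ge x_0\), which is the form implied by Eq. (2.10); the function names, parameter values and seed are ours.

```python
import numpy as np


def ml_tail_exponent(samples, x0):
    """Maximum likelihood exponent for a pure power-law above x0:
    mu = N / sum(log(x_i / x0))."""
    samples = np.asarray(samples, dtype=float)
    return samples.size / np.sum(np.log(samples / x0))


def power_law_samples(mu, x0, n_points, seed=0):
    """Inverse-transform sampling of P_>(x) = (x0 / x)**mu for x >= x0."""
    rng = np.random.default_rng(seed)
    uniform = 1.0 - rng.random(n_points)  # in (0, 1], avoids division by zero
    return x0 * uniform ** (-1.0 / mu)
```

With 100 000 points drawn for \(\mu = 1.5\), the estimate is expected to scatter around the true value with a relative error of order \(1/\sqrt{N}\). On real data the result depends on the choice of \(x_0\), since only the tail is expected to follow a power-law.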
Fig. 2.10. Elementary cumulative distribution for the Bund, for \(\tau = 15\) min, and best fit using a symmetric TLD \(L_\mu^{(t)}\), of index \(\mu = 3/2\).

distributions, we have systematically used \(P_<(\delta x)\) for the negative increments, and \(P_>(\delta x)\) for positive \(\delta x\).

Convolutions

The parameterization of \(P_1(\delta x)\) as a TLD allows one to reconstruct the distribution of price increments for all time intervals \(T = N\tau\), if one assumes that the increments are iid random variables. As discussed in Chapter 1, one then has \(P(\delta x, N) = [P_1(\delta x_1)]^{\star N}\). Figure 2.11 shows the cumulative distribution for T = 1 hour, 1 day and 5 days, reconstructed from the one at 15 min, according to the simple iid hypothesis. The symbols show empirical data corresponding to the same time intervals. The agreement is acceptable; one notices in particular the progressive deformation of \(P(\delta x, N)\) towards a Gaussian for large N. The evolution of the variance and of the kurtosis as a function of N is given in Table 2.2, and compared with the results that one would observe if the simple convolution rule was obeyed, i.e. \(\sigma_N^2 = N\sigma_1^2\) and \(\kappa_N = \kappa_1/N\). For these liquid assets, the time scale \(T^* = \kappa_1 \tau\) which sets the convergence towards the Gaussian is on the order of days. However, it is clear from Table 2.2 that this convergence is slower than it ought to be: \(\kappa_N\) decreases much more slowly than the 1/N behaviour predicted by an iid hypothesis. A closer look at Figure 2.11 also reveals systematic deviations: for example the tails at 5 days are distinctively fatter than they should be.
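The reconstruction of \(P(\delta x, N)\) by repeated convolution, and the rules \(\sigma_N^2 = N\sigma_1^2\) and \(\kappa_N = \kappa_1/N\) that follow from it, can be illustrated on a grid. The sketch below is a toy version: the elementary distribution is a discretized two-sided exponential chosen for simplicity rather than the TLD fit, and the grid spacing, range and names are ours.

```python
import numpy as np


def variance_and_kurtosis(pmf, grid):
    """Variance and excess kurtosis of a discrete distribution on a grid."""
    mean = np.sum(grid * pmf)
    var = np.sum((grid - mean) ** 2 * pmf)
    kurt = np.sum((grid - mean) ** 4 * pmf) / var**2 - 3.0
    return var, kurt


def convolve_n(pmf, n):
    """Distribution of the sum of n iid variables: n-fold convolution."""
    out = pmf.copy()
    for _ in range(n - 1):
        out = np.convolve(out, pmf)
    return out


dx = 0.05
half_width = 400                                  # grid covers [-20, 20]
grid_1 = np.arange(-half_width, half_width + 1) * dx
pmf_1 = np.exp(-np.abs(grid_1))
pmf_1 /= pmf_1.sum()                              # normalize to unit weight

n_steps = 4                                       # e.g. 15 min to 1 hour
pmf_n = convolve_n(pmf_1, n_steps)
# The support of the sum is n_steps times wider, with the same spacing
grid_n = np.arange(-half_width * n_steps, half_width * n_steps + 1) * dx

var_1, kurt_1 = variance_and_kurtosis(pmf_1, grid_1)
var_n, kurt_n = variance_and_kurtosis(pmf_n, grid_n)
```

Under the iid hypothesis `var_n` equals `n_steps * var_1` and `kurt_n` equals `kurt_1 / n_steps` up to rounding errors; a measured kurtosis that decays more slowly than this signals a departure from the iid hypothesis.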
! exponents. In some cases, for example the distribution of losses of the S&P
500 (Fig. 2.12). one sees a slight upward bend in the plot of P,(x) versus x
Table 2.2. Variance and kurtosis of the distributions P(S.4.. N) measured cir
computed from the variance and kurtosis at time scale r by assuming a simple
convolution rule, leading to a; = NO; and K Y = K , / NThe kurtosis at scale Ar
is systematically too large, cf. Section 2.4. We have used N = 4 for T = 1 hour,
N = 28 for T = 1 day and N = 140 for T = 5 days
'Asset . . Variance a; Kurtosis K N
Measured Computed Measured Computed
S&P 500 15 rnin S&P 500 (T = 1 h) 1.06 1.12 6.55 3.18
V S&P 500 Ih Bund ( T = 1 h ) 9.49 x 109.8 x 10"0.9 5.88
S&P 500 1 day DEW$ (T = 1 h) 6.03 x loP2 6.56 x 7.20 5.1 1
I. S&P 500 5 days
S&P 500 (T = 1 day) 7.97 7.84 1.79 0.45
Bund ( T = 1 day) 6.80 x loP' 6.76 x 4.24 0.84
DEW$ (T = 1 day) 0.477 0.459 1.68 0.73
S&P 500 ( T = 5 days) 38.6 39.20 1.85 0.09
Bund ( T = 5 days) 0.341 0.338 1.72 0.17
DEW$ ( T = 5 days) 2.52 2.30 0.9 1 0.15
in a linear-log plot. This indeed suggests that the decay could be slower than exponential. Many authors have proposed that the tails of the distribution of price changes follow a stretched exponential $\exp(-|\delta x|^c)$ with $c < 1$,⁷ or even a power-law with an exponent $\mu$ in the range 3-5.⁸ For example, the most likely value of $\mu$ using a Student distribution to fit the daily variations of the S&P in the period 1991-95 is $\mu = 5$. Even if it is rather hard to distinguish empirically between an exponential and a high power-law, this question is very important theoretically. In particular, the existence of a finite kurtosis requires $\mu$ to be larger than 4. As far as applications to risk control, for example, are concerned, the difference between the extrapolated values of the risk using an exponential or a high power-law fit of the tails of the distribution is significant, but not dramatic. For example, fitting the tail of an exponential distribution by a power-law, using 1000 days, leads to an effective exponent $\mu \simeq 4$. An extrapolation to the most probable drop in 10 000 days overestimates the true figure by a factor 1.3. In any case, the amplitude of the very large crashes observed in the century is beyond any reasonable extrapolation of the tails, whether one uses an exponential or a high power-law. The a priori probability of observing a 22% drop in a single day, as happened on the New York Stock Exchange in October 1987, is found in any case to be much smaller than $10^{-4}$ per day, that is, once every 40 years. This suggests that major crashes are governed by a specific amplification mechanism, which drives these events outside the scope of a purely statistical analysis, and that they require a specific theoretical description.⁹

Fig. 2.11. Evolution of the distribution of the price increments of the S&P 500, $P(\delta x, N)$ (symbols), compared with the result obtained by a simple convolution of the elementary distribution $P_1(\delta x_1)$ (dark lines). The width and the general shape of the curves are rather well reproduced within this simple convolution rule. However, systematic deviations can be observed, in particular for large $|\delta x|$. This is also reflected by the fact that the kurtosis $\kappa_N$ decreases more slowly than $\kappa_1/N$, cf. Table 2.2.

⁷ See: J. Laherrère, D. Sornette, Stretched exponential distributions in nature and in economy, European Journal of Physics, B 2, 525 (1998).
⁸ See e.g. M. M. Dacorogna, U. A. Muller, O. V. Pictet, C. G. de Vries, The distribution of extremal exchange rate returns in extremely large data sets, Olsen and Associate working paper (1995), available at http://www.olsen.ch; F. Longin, The asymptotic distribution of extreme stock market returns, Journal of Business 69, 383 (1996); P. Gopikrishnan, M. Meyer, L. A. Amaral, H. E. Stanley, Inverse cubic law for the distribution of stock price variations, European Journal of Physics, B 3, 139 (1998).
⁹ On this point, see A. Johansen, D. Sornette, Stock market crashes are outliers, European Journal of Physics, B 1, 141 (1998), and J.-P. Bouchaud, R. Cont, European Journal of Physics, B 6, 543 (1998).
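The conversion between a daily probability and a return period used in the text (a probability of $10^{-4}$ per day corresponding to once every 40 years) is a one-line computation. The sketch below assumes 250 trading days per year, a figure that is not stated in the text but is consistent with the numbers quoted; the function name is hypothetical.

```python
TRADING_DAYS_PER_YEAR = 250  # assumption: not given explicitly in the text


def return_period_years(daily_probability, days_per_year=TRADING_DAYS_PER_YEAR):
    """Average waiting time, in years, between events of the given daily probability."""
    return 1.0 / (daily_probability * days_per_year)


period = return_period_years(1e-4)
print(period)
```

With these assumptions the result is about 40 years, and 10 000 days likewise correspond to roughly 40 years of data.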
2.3.2 Multiscaling - Hurst exponent (*)

The fact that the autocorrelation function is zero beyond a certain time scale $\tau$ implies that the quantity $\langle [x_N - x_0]^2\rangle$ grows as $D\tau N$. However, measuring the temporal fluctuations using solely this quantity can be misleading. Take for example the case where the price increments $\delta x_k$ are independent, but distributed according to a TLD of index $\mu < 2$. As we have explained in Section 1.6.5, the sum $x_N - x_0 = \sum_{k=0}^{N-1} \delta x_k$ behaves, as long as $N \ll N^* = \kappa$, as a pure 'Lévy' sum, for which the truncation is inessential. Its order of magnitude is therefore $x_N - x_0 \sim A N^{1/\mu}$, where $A^\mu$ is the tail parameter of the Lévy distribution. However, the second moment $\langle [x_N - x_0]^2\rangle = D\tau N$ is dominated by extreme fluctuations, and is therefore governed by the existence of the exponential truncation which gives a finite value to $D\tau$ proportional to $A^\mu \alpha^{\mu-2}$. One can check that as long as $N \ll N^*$, one has $\sqrt{D\tau N} \gg A N^{1/\mu}$. This means that in this case, the second moment overestimates the amplitude of probable fluctuations. One can generalize the above result to the $q$th moment of the price difference, $\langle [x_N - x_0]^q\rangle$. If $q > \mu$, one finds that all moments grow like $N$ in the regime $N \ll N^*$, and like $N^{q/\mu}$ if $q < \mu$. This is to be contrasted with the sum of Gaussian variables, where $\langle [x_N - x_0]^q\rangle$ grows as $N^{q/2}$ for all $q > 0$. More generally, one can define an exponent $\zeta_q$ as $\langle [x_N - x_0]^q\rangle \propto N^{\zeta_q}$. If $\zeta_q/q$ is not constant with $q$, one speaks of multiscaling. It is not always easy to distinguish true multiscaling from apparent multiscaling, induced by crossover or finite size effects. For example, in the case where one sums uncorrelated random variables with a long-range correlation in the variance, one finds that the kurtosis decays slowly, as $\kappa_N \propto N^{-\nu}$, where $\nu < 1$ is the exponent governing the decay of the correlations. This means that the fourth moment of the difference $x_N - x_0$ behaves as:

$$\langle [x_N - x_0]^4\rangle = (3 + \kappa_N)\,(D\tau N)^2.$$

If $\nu$ is small, one can fit the above expression, over a finite range of $N$, using an effective exponent $\zeta_4 < 2$, suggesting multiscaling. Similarly, higher moments can be accurately fitted using an effective exponent $\zeta_q < q/2$.¹⁰ This is certainly a possibility that one should keep in mind, in particular when analysing financial time series (see Mandelbrot, 1998).

Another interesting way to characterize the temporal development of the fluctuations is to study, as suggested by Hurst, the average distance between the 'high' and the 'low' in a window of size $t = n\tau$:

$$\mathcal{H}(n) = \left\langle \max(x_k)_{k=\ell+1,\ldots,\ell+n} - \min(x_k)_{k=\ell+1,\ldots,\ell+n} \right\rangle_\ell.$$

The Hurst exponent $H$ is defined from $\mathcal{H}(n) \propto n^H$. In the case where the increments $\delta x_n$ are Gaussian, one finds that $H = 1/2$ (for large $n$). In the case of a TLD of index $1 < \mu < 2$, one finds:

$$\mathcal{H}(n) \propto n^{1/\mu} \quad (n \ll N^*), \qquad \mathcal{H}(n) \propto n^{1/2} \quad (n \gg N^*).$$

The Hurst exponent therefore evolves from an anomalously high value $H = 1/\mu$ to the 'normal' value $H = 1/2$ as $n$ increases. Figure 2.13 shows the Hurst function $\mathcal{H}(n)$ for the three liquid assets studied here. One clearly sees that the 'local' exponent $H$ slowly decreases from a high value ($\sim 0.7$, quite close to $1/\mu = 2/3$) at small times, to $H \simeq 1/2$ at long times.

Fig. 2.12. Cumulative distribution of the price increments (positive and negative), on the scale of $N = 1$ day, for the three studied assets (Bund, DEM/$, S&P), and in a linear-log representation. One clearly sees the approximate exponential nature of the tails, which are straight lines in this representation.

¹⁰ For more details on this point, see J.-P. Bouchaud, M. Potters, M. Meyer, Apparent multifractality in financial time series, European Journal of Physics, B 13, 595 (2000).
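The Hurst analysis described in Section 2.3.2 can be sketched in a few lines. The code below is an illustration rather than the authors' procedure: it measures the average high-minus-low distance over non-overlapping windows (an assumption, since the text does not say whether windows overlap) and estimates a local exponent from two window sizes, on a synthetic Gaussian random walk. All function names are hypothetical.

```python
import math
import random


def hurst_function(path, n):
    """Average distance between the high and the low over non-overlapping windows of n points."""
    ranges = [max(path[i:i + n]) - min(path[i:i + n])
              for i in range(0, len(path) - n + 1, n)]
    return sum(ranges) / len(ranges)


def local_hurst_exponent(path, n_small, n_large):
    """Slope of log H(n) versus log n between two window sizes."""
    ratio = hurst_function(path, n_large) / hurst_function(path, n_small)
    return math.log(ratio) / math.log(n_large / n_small)


def gaussian_walk(length, seed=0):
    """Synthetic price path with independent Gaussian increments of unit variance."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(length):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path


path = gaussian_walk(50_000)
h_local = local_hurst_exponent(path, 32, 512)
print(h_local)
```

For Gaussian increments the local slope should come out close to 1/2, slightly above it at these window sizes because of finite-size effects. Replacing the increments by draws from a truncated Lévy distribution would be the natural next step to look for the crossover from $1/\mu$ to $1/2$ described in the text.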
Fig. 2.13. Hurst function $\mathcal{H}(n)$ (up to an arbitrary scaling factor) for the three liquid assets studied in this chapter (S&P 500, DEM/$, Bund), in log-log coordinates. The local slope gives the value of the Hurst exponent $H$. One clearly sees that this exponent goes from rather a high value for small $n$ to a value close to $1/2$ when $n$ increases.

Fig. 2.14. Kurtosis $\kappa_N$ at scale $N$ as a function of $N$, for the Bund. In this case, the elementary time scale $\tau$ is 30 min. If the iid assumption were true, one should find $\kappa_N = \kappa_1/N$. The straight line has a slope of $-0.43$, which means that the decay of the kurtosis $\kappa_N$ is much slower, as $\simeq 20/N^{0.43}$.
As mentioned above, one sees in Figure 2.11 that $P(\delta x, N)$ systematically deviates from $[P_1(\delta x_1)]^{*N}$. In particular, the tails of $P(\delta x, N)$ are anomalously 'fat'. Equivalently, the kurtosis $\kappa_N$ of $P(\delta x, N)$ is higher than $\kappa_1/N$, as one can see from Figure 2.14, where $\kappa_N$ is plotted as a function of $N$ in log-log coordinates.

Correspondingly, more complex correlation functions, such as that of the squares of $\delta x_k$, reveal a non-trivial behaviour. An interesting quantity to consider is the 'volatility' $\gamma_k$ of day $k$, defined as the average over the day of the absolute value of the 5-min price increments:

$$\gamma_k = \frac{1}{N_d} \sum_{k'=(k-1)N_d+1}^{kN_d} |\delta x_{k'}|,$$

where $\delta x_k$ is the 5-min increment, and $N_d$ is the number of 5-min intervals within a day. This quantity is clearly strongly correlated in time (Figs 2.15 and 2.16): the periods of strong volatility persist far beyond the day time scale.

Fig. 2.15. Evolution of the 'volatility' $\gamma$ as a function of time (in years), for the S&P 500 in the period 1991-95. One clearly sees periods of large volatility, which persist in time.

[Fig. 2.16 legend: Bund, S&P, DEM/$, and a guide line $\ell^{-0.5}$.]

A simple way to account for these effects is to assume that the elementary distribution $P_1$ also depends on time. One actually observes that the level of activity on a market (measured by the volume of transactions) on a given time interval
can vary quite strongly with time. It is reasonable to think that the scale of the
fluctuations $\gamma$ of the price depends directly on the frequency and volume of the
transactions. A simple hypothesis is that apart from a change of this level of
activity, the mechanisms leading to a change of price are the same, and therefore
that the fluctuations have the same distributions, up to a change of scale. More
precisely, we shall assume that the distribution of price changes is such that:
$$P_{1k}(\delta x_k) = \frac{1}{\gamma_k}\, P_{10}\!\left(\frac{\delta x_k}{\gamma_k}\right), \qquad (2.16)$$

where $P_{10}(u)$ is a certain distribution normalized to 1 and independent of $k$. The factor $\gamma_k$ represents the scale of the fluctuations: one can define $\gamma_k$ and $P_{10}$ such that $\int |u|\, P_{10}(u)\, du = 1$. The variance $D_k\tau$ is then proportional to $\gamma_k^2$.

In the case where $P_{10}$ is Gaussian and in the limit of continuous time, the model defined by Eq. (2.16) is known in the literature as a 'stochastic volatility' model. The model defined by Eq. (2.16) is however more general since $P_{10}$ is a priori arbitrary.

If one assumes that the random variables $\delta x_k/\gamma_k$ are independent and of zero mean, one can show (see Section 1.7.2 and Appendix A) that the average kurtosis of the distribution $P(\delta x, N)$ is given, for $N \geq 1$, by:

$$\kappa_N = \frac{1}{N}\left[\kappa_0 + (3 + \kappa_0)\, g(0) + 6 \sum_{\ell=1}^{N} \left(1 - \frac{\ell}{N}\right) g(\ell)\right], \qquad (2.17)$$

where $\kappa_0$ is the kurtosis of the distribution $P_{10}$ defined above (cf. Eq. (2.16)), and $g$ the correlation function of the variance $D_k\tau$ (or of the $\gamma_k^2$):

$$g(\ell) = \frac{\overline{D_k D_{k+\ell}} - \overline{D}^2}{\overline{D}^2}. \qquad (2.18)$$

The overbar means that one should average over the fluctuations of the $D_k$. It is interesting to notice that even in the absence of 'bare' kurtosis ($\kappa_0 = 0$), volatility fluctuations are enough to induce a non-zero kurtosis $\kappa_1 = \kappa_0 + (3 + \kappa_0)\, g(0)$.

Fig. 2.16. Temporal correlation function, $\langle \gamma_k \gamma_{k+\ell}\rangle - \langle \gamma_k\rangle^2$, normalized by $\langle \gamma_k^2\rangle - \langle \gamma_k\rangle^2$. The value $\ell = 1$ corresponds to an interval of 1 day. For comparison, we have shown a decay as $1/\sqrt{\ell}$.
The empirical data on the kurtosis are well accounted for using the above formula, with the choice $g(\ell) \propto \ell^{-\nu}$, with $\nu = 0.43$ in the case of the Bund. This choice for $g(\ell)$ is also in qualitative agreement with the decay of the correlations of the $\gamma$'s (Fig. 2.16). However, a fit of the data using for $g(\ell)$ the sum of two exponentials $\exp(-\ell/\ell_{1,2})$ is also acceptable. One finds two rather different time scales: $\ell_1$ is shorter than a day, and a long correlation time $\ell_2$ of a few tens of days.

One can thus quite clearly see that the scale of the fluctuations (known in the market as the volatility) changes with time, with a rather long persistence time scale. This slow evolution of the volatility in turn leads to an anomalous decay of the kurtosis $\kappa_N$ as a function of $N$. As we shall see in Section 4.3.4, this has direct consequences for the dynamics of the volatility smile observed on option markets.
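The slow decay of the kurtosis can be illustrated numerically. The sketch below assumes that the kurtosis formula referred to above takes the form $\kappa_N = \frac{1}{N}\left[\kappa_0 + (3+\kappa_0)g(0) + 6\sum_{\ell=1}^{N-1}(1-\ell/N)g(\ell)\right]$, which is what one obtains by expanding the fourth moment of a sum of independent, zero-mean rescaled increments with fluctuating variance. The amplitude of $g$ and the value of $\kappa_0$ are illustrative choices, not fitted values, and the function names are hypothetical.

```python
def kurtosis_at_scale(n, kappa0, g):
    """kappa_N for N = n, given the bare kurtosis kappa0 and a variance
    correlation function g(lag), with g(0) the relative variance of D_k."""
    correlated = sum((1.0 - lag / n) * g(lag) for lag in range(1, n))
    return (kappa0 + (3.0 + kappa0) * g(0) + 6.0 * correlated) / n


def power_law_g(amplitude, nu):
    """Hypothetical choice g(lag) = amplitude * lag**(-nu) for lag >= 1, g(0) = amplitude."""
    return lambda lag: amplitude if lag == 0 else amplitude * lag ** (-nu)


g = power_law_g(amplitude=1.0, nu=0.43)
kappa_1 = kurtosis_at_scale(1, 3.0, g)

# Compare the correlated-volatility prediction with the iid expectation kappa_1 / N.
for n in (1, 10, 100):
    print(n, kurtosis_at_scale(n, 3.0, g), kappa_1 / n)
```

For $N = 1$ the expression reduces to $\kappa_1 = \kappa_0 + (3+\kappa_0)g(0)$, and for $g \equiv 0$ it gives back the iid rule $\kappa_0/N$. With a slowly decaying $g$, the printed values stay well above $\kappa_1/N$ at large $N$.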
[...] because the cumulative distribution of the daily changes of the rate MXP/$ reveals power-law tails, with no obvious truncation, with an exponent $\mu = 1.5$ (Fig. 2.18). This dataset corresponds to the years 1992-94, just before the crash of the peso (December 1994). A similar value of $\mu$ has also been observed, for example, in the fluctuations of the Budapest Stock Exchange.¹¹

Another interesting quantity is the volatility itself which varies with time, as emphasized above. The price of options reflects quite accurately the value of the historical volatility in a recent past (see Section 4.3.4). Therefore, the volatility can be considered as a special type of asset, which one can study as such. We shall define as above the volatility $\gamma_k$ as the average over a day of the absolute value of the 5-min price increments. The autocorrelation function of the $\gamma$'s is shown in Figure 2.16; it is found to decrease slowly, perhaps as a power-law with an exponent $\nu$ in the range 0.1 to 0.5 (Fig. 2.16).¹² The distribution of the measured volatility $\gamma$ is shown in Figure 2.19 for the S&P 500, but other assets lead to similar curves. This distribution decreases slowly for large $\gamma$'s, again as an exponential or a high power-law. Several functional forms have been suggested, such as a log-normal distribution, or an inverse Gamma distribution (see Section 2.9 for a specific model for this behaviour). However, one must keep in mind that the quantity $\gamma$ is only an approximation for the 'true' volatility. The distribution shown in Figure 2.19 is therefore the convolution of the true distribution with a measurement error distribution.

Fig. 2.17. Cumulative distribution $P_{1>}(\delta x)$ (for $\delta x > 0$) and $P_{1<}(\delta x)$ (for $\delta x < 0$), for the US 3-month rate (US T-Bills from 1987 to 1996), with $\tau = 1$ day. The thick line corresponds to the best fit using a symmetric TLD $L_\mu^{(t)}$, of index $\mu = 3/2$. We have also shown the corresponding values of $A$ and $\alpha^{-1}$, which gives a kurtosis equal to 22.6.

Fig. 2.19. Cumulative distribution of the measured volatility $\gamma$ of the S&P, $P_{1>}(\gamma)$, in a linear-log plot. Note that the tail of this distribution decays more slowly than exponentially.

¹¹ I. Rotyis, G. Vattay, Statistical analysis of the stock index of the Budapest Stock Exchange, in [Kondor and Kertesz].
¹² On this point, see, e.g. [Ding, Arneodo], and Y. Liu et al., The statistical properties of volatility of price fluctuations, Physical Review, E 60, 1390 (1999).
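The volatility proxy and its autocorrelation, as described in the text, are straightforward to compute from a list of high-frequency increments. The following is a minimal sketch with hypothetical function names; the normalization follows the caption of Fig. 2.16, and the toy data are only there to show the calling convention.

```python
def daily_volatility(increments, n_per_day):
    """gamma_k: mean absolute high-frequency increment within each day."""
    n_days = len(increments) // n_per_day
    return [
        sum(abs(dx) for dx in increments[d * n_per_day:(d + 1) * n_per_day]) / n_per_day
        for d in range(n_days)
    ]


def volatility_autocorrelation(gammas, lag):
    """(<g_k g_{k+lag}> - <g>^2) / (<g^2> - <g>^2), the normalization used in Fig. 2.16."""
    n = len(gammas)
    mean = sum(gammas) / n
    variance = sum(g * g for g in gammas) / n - mean * mean
    cross = sum(gammas[k] * gammas[k + lag] for k in range(n - lag)) / (n - lag)
    return (cross - mean * mean) / variance


# Toy example: two "days" of four increments each.
gammas = daily_volatility([1, -1, 2, -2, 3, -3, 4, -4], n_per_day=4)
print(gammas)
```

On real data one would evaluate `volatility_autocorrelation` for a range of lags and plot it in log-log coordinates to look for the slow decay discussed above. The measured $\gamma$ remains a noisy estimate of the true volatility, as the text points out.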
[Fig. 2.18 legend: MXP/$ (drops), MXP/$ (rises).]

The case of the interest rate curve is particularly complex and interesting, since it is not the random motion of a point, but rather the consistent history of a whole curve (corresponding to different loan maturities) which is at stake. The need for a consistent description of the whole interest rate curve is furthermore enhanced by the rapid development of interest rate derivatives (options, swaps,¹³ options on swaps, etc.) [Hull].

Present models of the interest rate curve fall into two categories: the first [...]

¹³ A swap is a contract where one exchanges fixed interest rate payments with floating interest rate payments.
¹⁴ For a compilation of the most important theoretical papers on the interest rate curve, see [Hughston].
¹⁵ This section is based on the following papers: J.-P. Bouchaud, N. Sagna, R. Cont, N. El-Karoui, M. Potters, Phenomenology of the interest rate curve, Applied Mathematical Finance, 6, 209 (1999) and idem, Strings attached, Risk Magazine, 11 (7), 56 (1998).

2.6.1 Presentation of the data and notations

The forward interest rate curve (FRC) at time $t$ is fully specified by the collection of all forward rates $f(t, \theta)$, for different maturities $\theta$. It allows us for example to calculate the price $B(t, \theta)$ at time $t$ of a (so-called 'zero-coupon') bond, which by definition pays 1 at time $t + \theta$. The forward rates are by definition such that they compound to give $B(t, \theta)$:

$$B(t, \theta) = \exp\left(-\int_0^\theta du\, f(t, u)\right). \qquad (2.19)$$

$r(t) = f(t, \theta = 0)$ is called the 'spot rate'. Note that in the following $\theta$ is always a time difference; the maturity date $T$ is $t + \theta$.
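The statement that forward rates compound to give the bond price can be made concrete. The sketch below assumes rates expressed as fractions per year on a uniform maturity grid in years (conventions chosen for the example, not taken from the text). It integrates the forward curve with the trapezoidal rule and recovers a forward rate from two neighbouring bond prices by a finite difference of $-\log B$. Function names are hypothetical.

```python
import math


def bond_price(forward_rates, step):
    """B(t, theta) = exp(-integral of f(t, u) du), trapezoidal rule on a uniform grid."""
    integral = sum(0.5 * (a + b) * step
                   for a, b in zip(forward_rates[:-1], forward_rates[1:]))
    return math.exp(-integral)


def forward_rate_from_prices(price_near, price_far, step):
    """Finite-difference estimate of f = -d log B / d theta between two maturities."""
    return -(math.log(price_far) - math.log(price_near)) / step


# Flat 5% forward curve sampled every quarter, out to 2 years (9 grid points).
flat_curve = [0.05] * 9
price_2y = bond_price(flat_curve, 0.25)
price_2y3m = bond_price(flat_curve + [0.05], 0.25)
print(price_2y, forward_rate_from_prices(price_2y, price_2y3m, 0.25))
```

For a flat curve the price is simply $e^{-0.05 \times 2}$ and the recovered forward rate is 5%, which gives a quick sanity check of the sign conventions.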
Our study is based on a dataset of daily prices of Eurodollar futures contracts on interest rates.¹⁶ The interest rate underlying the Eurodollar futures contract is a 90 day rate, earned on dollars deposited in a bank outside the US by another bank. The interest in studying forward rates rather than yield curves is that one has a direct access to a 'derivative' (in the mathematical sense: $f(t, \theta) = -\partial \log B(t, \theta)/\partial\theta$), which obviously contains more precise information than the yield curve (defined from the logarithm of $B(t, \theta)$) itself.

In practice, the futures markets price 3-months forward rates for fixed expiration dates, separated by 3-month intervals. Identifying 3-months futures rates with instantaneous forward rates, we have available a sequence of time series on forward rates $f(t, T_i - t)$, where $T_i$ are fixed dates (March, June, September and December of each year). We can convert these into fixed maturity (multiple of 3-months) forward rates by a simple linear interpolation between the two nearest points such that $T_i - t \leq \theta \leq T_{i+1} - t$. Between 1990 and 1996, one has at least 15 different Eurodollar maturities for each market date. Between 1994 and 1996, the number of available maturities rises to 30 (as time grows, longer and longer maturity forward rates are being traded on future markets); we shall thus often use this restricted dataset. Since we only have daily data, our reference time scale will be $\tau = 1$ day. The variation of $f(t, \theta)$ between $t$ and $t + \tau$ will be denoted as $\delta f(t, \theta)$:

$$\delta f(t, \theta) = f(t + \tau, \theta) - f(t, \theta). \qquad (2.20)$$

Fig. 2.20. The historical time series of the spot rate $r(t)$ from 1990 to 1996 (top curve), actually corresponding to a 3-month future rate (dark line), and of the 'spread' $s(t)$ (bottom curve), defined with the longest maturity available over the whole period 1990-96 on future markets, i.e. $\theta_{\max} = 4$ years.

2.6.2 Quantities of interest and data analysis

The description of the FRC has two, possibly interrelated, aspects:

(i) What is, at a given instant of time, the shape of the FRC as a function of the maturity $\theta$?
(ii) What are the statistical properties of the increments $\delta f(t, \theta)$ between time $t$ and time $t + \tau$, and how are they correlated with the shape of the FRC at time $t$?

The two basic quantities describing the FRC at time $t$ are the value of the short-term interest rate $f(t, \theta_{\min})$ (where $\theta_{\min}$ is the shortest available maturity), and that of the short-term/long-term spread $s(t) = f(t, \theta_{\max}) - f(t, \theta_{\min})$, where $\theta_{\max}$ is the longest available maturity. The two quantities $r(t) \simeq f(t, \theta_{\min})$, $s(t)$ are plotted versus time in Figure 2.20;¹⁷ note that:

• The volatility $\sigma$ of the spot rate $r(t)$ is equal to $0.8\%/\sqrt{\mathrm{year}}$,¹⁸ which is obtained by averaging over the whole period.
• The spread $s(t)$ has varied between 0.53 and 4.34%. Contrary to some European interest rates on the same period, $s(t)$ has always remained positive. (This however does not mean that the FRC is increasing monotonically, see below.)

Figure 2.21 shows the average shape of the FRC, determined by averaging the difference $f(t, \theta) - r(t)$ over time. Interestingly, this function is rather well fitted by a simple square-root law. This means that on average, the difference between the forward rate with maturity $\theta$ and the spot rate is equal to $a\sqrt{\theta}$, with a proportionality constant $a = 0.85\%/\sqrt{\mathrm{year}}$ which turns out to be nearly identical to the spot rate volatility. We shall propose a simple interpretation of this fact below.

¹⁶ In principle forward contracts and futures contracts are not strictly identical (they have different margin requirements) and one may expect slight differences, which we shall neglect in the following.
¹⁷ We shall from now on take the 3-month rate as an approximation to the spot rate $r(t)$.
¹⁸ The dimension of $r$ should really be % per year, but we conform here to the habit of quoting $r$ simply in %. Note that this can sometimes be confusing when checking the correct dimensions of a formula.
Fig. 2.21. The average FRC in the period 1994-96, as a function of the maturity $\theta$. We have shown for comparison a one parameter fit with a square-root law, $a(\sqrt{\theta} - \sqrt{\theta_{\min}})$. The same $\sqrt{\theta}$ behaviour actually extends up to $\theta_{\max} = 10$ years, which is available in the second half of the time period.

Fig. 2.22. Root mean square deviation $\Delta(\theta)$ from the average FRC as a function of $\theta$. Note the maximum for $\theta^* = 1$ year, for which $\Delta \simeq 0.38\%$. We have also plotted the correlation function $C(\theta)$ (defined by Eq. (2.23)) between the daily variation of the spot rate and that of the forward rate at maturity $\theta$, in the period 1994-96. Again, $C(\theta)$ is maximum for $\theta = \theta^*$, and decays rapidly beyond.

Let us now turn to an analysis of the fluctuations around the average shape. These fluctuations are actually similar to that of a vibrating elastic string. The average deviation $\Delta(\theta)$ can be defined as:

[...]

and is plotted in Figure 2.22, for the period 1994-96. The maximum of $\Delta$ is reached for a maturity of $\theta^* = 1$ year.

We now turn to the statistics of the daily increments $\delta f(t, \theta)$ of the forward rates, by calculating their volatility $\sigma(\theta) = \sqrt{\langle \delta f(t, \theta)^2\rangle}$ and their excess kurtosis

$$\kappa(\theta) = \frac{\langle \delta f(t, \theta)^4\rangle}{\sigma^4(\theta)} - 3.$$

A very important quantity will turn out to be the following 'spread' correlation function:

[...]

which measures the influence of the short-term interest fluctuations on the other modes of motion of the FRC, subtracting away any trivial overall translation of the FRC.

Figure 2.23 shows $\sigma(\theta)$ and $\kappa(\theta)$. Somewhat surprisingly, $\sigma(\theta)$, much like $\Delta(\theta)$, has a maximum around $\theta^* = 1$ year. The order of magnitude of $\sigma(\theta)$ is $0.05\%/\sqrt{\mathrm{day}}$, or $0.8\%/\sqrt{\mathrm{year}}$. The daily kurtosis $\kappa(\theta)$ is rather high (on the order of 5), and only weakly decreasing with $\theta$.

Finally, $C(\theta)$ is shown in Figure 2.22; its shape is again very similar to those of $\Delta(\theta)$ and $\sigma(\theta)$, with a pronounced maximum around $\theta^* = 1$ year. This means that the fluctuations of the short-term rate are amplified for maturities around 1 year. We shall come back to this important point below.

2.6.3 Comparison with the Vasicek model

The simplest FRC model is a one-factor model due to Vasicek, where the whole term structure can be ascribed to the short-term interest rate. The latter is assumed to follow a so-called 'Ornstein-Uhlenbeck' (or mean reverting) process defined as:

$$\frac{dr(t)}{dt} = \Omega\,(r_0 - r(t)) + \sigma\,\xi(t), \qquad (2.24)$$

where $r_0$ is an 'equilibrium' reference rate, $\Omega$ describes the strength of the reversion towards $r_0$ (and is the inverse of the mean reversion time), and $\xi(t)$ is a Gaussian noise, of volatility 1. In its simplest version, the Vasicek model prices
a bond maturing at $T$ as the following average:

$$B(t, T) = \left\langle \exp\left(-\int_t^T du\, r(u)\right)\right\rangle, \qquad (2.25)$$

where the averaging is over the possible histories of the spot rate between now and the maturity, where the uncertainty is modelled by the noise $\xi$. The computation of the above average is straightforward when $\xi$ is Gaussian, and leads to (using Eq. (2.19)):

$$f(t, \theta) = r(t) + (r_0 - r(t))\,(1 - e^{-\Omega\theta}) - \frac{\sigma^2}{2\Omega^2}\,(1 - e^{-\Omega\theta})^2. \qquad (2.26)$$

The basic results of this model are as follows:

• Since $\langle r_0 - r(t)\rangle = 0$, the average of $f(t, \theta) - r(t)$ is given by $\langle f(t, \theta) - r(t)\rangle = -\frac{\sigma^2}{2\Omega^2}(1 - e^{-\Omega\theta})^2$, and should thus be negative, at variance with empirical data. Note that in the limit $\Omega\theta \ll 1$, the order of magnitude of this (negative) term is very small: taking $\sigma = 1\%/\sqrt{\mathrm{year}}$ and $\theta = 1$ year, it is found to be equal to 0.005%, much smaller than the typical differences actually observed on forward rates.
• The volatility $\sigma(\theta)$ is monotonically decreasing as $\exp(-\Omega\theta)$, while the kurtosis $\kappa(\theta)$ is identically zero (because $\xi$ is Gaussian).

Fig. 2.23. The daily volatility and kurtosis as a function of maturity $\theta$ (in years). Note the maximum of the volatility for $\theta = \theta^*$, while the kurtosis is rather high, and only very slowly decreasing with $\theta$. The two curves correspond to the periods 1990-96 and 1994-96, the latter period extending to longer maturities.

[...] up to a term of order $\sigma^2$ which turns out to be negligible, exactly for the same reason as explained above. On average, the second term (estimated by taking a finite difference estimate of the partial derivative using the first two points of the FRC) is definitely found to be positive, and equal to 0.8%/year. On the same period (1990-96), however, the spot rate has decreased from 8.1 to 5.9%, instead of growing by $7 \times 0.8\% = 5.6\%$.

In simple terms, both the Vasicek and the Hull-White model mean the following: the FRC should basically reflect the market's expectation of the average evolution of the spot rate (up to a correction of the order of $\sigma^2$, but which turns out to be very small, see above). However, since the FRC is on average increasing with the maturity (situations when the FRC is 'inverted' are comparatively much rarer), this would mean that the market systematically expects the spot rate to rise, which it does not. It is hard to believe that the market persists in error for such a long time. Hence, the upward slope of the FRC is not only related to what the market expects on average: a systematic risk premium is needed to account for this increase.
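The order of magnitude quoted for the Vasicek model (an average spread of about $-0.005\%$ for $\sigma = 1\%/\sqrt{\mathrm{year}}$ and $\theta = 1$ year) can be verified directly from Eq. (2.26). The sketch below assumes rates expressed as fractions per year and picks $\Omega = 0.01$ per year as an illustrative value satisfying $\Omega\theta \ll 1$; that value is an assumption, not taken from the text, and the function names are hypothetical.

```python
import math


def vasicek_forward_rate(r, r0, omega, sigma, theta):
    """Eq. (2.26): forward rate at maturity theta in the one-factor Vasicek model."""
    decay = 1.0 - math.exp(-omega * theta)
    return r + (r0 - r) * decay - sigma ** 2 / (2.0 * omega ** 2) * decay ** 2


def average_spread(omega, sigma, theta):
    """Average of f(t, theta) - r(t), obtained by setting r = r0 in Eq. (2.26)."""
    return vasicek_forward_rate(0.0, 0.0, omega, sigma, theta)


# sigma = 1% per sqrt(year), theta = 1 year, weak mean reversion (omega * theta << 1).
spread = average_spread(omega=0.01, sigma=0.01, theta=1.0)
print(spread)
```

In this limit the spread approaches $-\sigma^2\theta^2/2 = -5 \times 10^{-5}$, i.e. $-0.005\%$. It is negative and tiny compared with the observed average slope of the FRC, which is the discrepancy the text emphasizes.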
2.6.4 Risk-premium and the $\sqrt{\theta}$ law

The average FRC and value-at-risk pricing

The observation that on average the FRC follows a simple $\sqrt{\theta}$-law (i.e. $\langle f(t, \theta) - r(t)\rangle \propto \sqrt{\theta}$) suggests an intuitive, direct interpretation. At any time $t$, the market anticipates either a future rise, or a decrease of the spot rate. However, the average anticipated trend is, in the long run, zero, since the spot rate has bounded fluctuations. Hence, the average market's expectation is that the future spot rate $r(t)$ will be close to its present value $r(t = 0)$. In this sense, the average FRC should thus be flat. However, even in the absence of any trend in the spot rate, its probable change between now and $t = \theta$ is (assuming the simplest random walk behaviour) of the order of $\sigma\sqrt{\theta}$, where $\sigma$ is the volatility of the spot rate. Money lenders agree at time $t$ on a loan at rate $f(t, \theta)$, which will run between time $t + \theta$ and $t + \theta + d\theta$. These money lenders will themselves borrow money from central banks at the short-term rate prevailing at that date, i.e. $r(t + \theta)$. They will therefore lose money whenever $r(t + \theta) > f(t, \theta)$. Hence, money lenders take a bet on the future value of the spot rate and want to be sure not to lose their bet more frequently than, say, once out of five. Thus their price for the forward rate is such that the probability that the spot rate at time $t + \theta$, $r(t + \theta)$, actually exceeds $f(t, \theta)$ is equal to a certain number $p$:

$$\int_{f(t,\theta)}^{\infty} dr'\, P(r', t + \theta \,|\, r, t) = p, \qquad (2.29)$$

where $P(r', t' | r, t)$ is the probability that the spot rate is equal to $r'$ at time $t'$ knowing that it is $r$ now (at time $t$). Assuming that $r'$ follows a simple random walk centred around $r(t)$ then leads to:¹⁹

$$f(t, \theta) = r(t) + a\,\sigma(0)\sqrt{\theta}, \qquad a = \sqrt{2}\,\mathrm{erfc}^{-1}(2p), \qquad (2.30)$$

which indeed matches the empirical data, with $p \simeq 0.16$.

Hence, the shape of today's FRC can be thought of as an envelope for the probable future evolutions of the spot rate. The market appears to price future rates through a Value at Risk procedure (Eqs. (2.29) and (2.30), see Chapter 3 below) rather than through an averaging procedure.

The anticipated trend and the volatility hump

Let us now discuss, along the same lines, the shape of the FRC at a given instant of time, which of course deviates from the average square root law. For a given instant of time $t$, the market actually expects the spot rate to perform a biased random walk. We shall argue that a consistent interpretation is that the market estimates the trend $m(t)$ by extrapolating the [...]

where $m(t, t')$ can be called the anticipated bias at time $t'$, seen from time $t$.

It is reasonable to think that the market estimates $m$ by extrapolating the recent past to the nearby future. Mathematically, this reads:

$$m(t, t + u) = m_1(t)\, Z(u) \quad \text{where} \quad m_1(t) = \int_0^\infty K(v)\, \delta r(t - v)\, dv, \qquad (2.32)$$

and where $K(u)$ is an averaging kernel of the past variations of the spot rate. One may call $Z(u)$ the trend persistence function; it is normalized such that $Z(0) = 1$, and describes how the present trend is expected to persist in the future. Equation (2.29) then gives:

$$f(t, \theta) = r(t) + a\,\sigma(0)\sqrt{\theta} + m_1(t) \int_0^\theta Z(u)\, du. \qquad (2.33)$$

This mechanism is a possible explanation of why the three functions introduced above, namely $\Delta(\theta)$, $\sigma(\theta)$ and the correlation function $C(\theta)$, have similar shapes. Indeed, taking for simplicity an exponential averaging kernel $K(u)$ of the form $\epsilon \exp[-\epsilon u]$, one finds:

[...]

where $\xi(t)$ is an independent noise of strength $\sigma_\xi^2$, added to introduce some extra noise in the determination of the anticipated bias. In the absence of temporal correlations, one can compute from the above equation the average value of $m_1^2$. It is given by:

[...]

In the simple model defined by Eq. (2.33) above, one finds that the correlation function $C(\theta)$ is given by:²⁰

[...] (2.36)

Using the above result for $\langle m_1^2\rangle$, one also finds:

[...]

thus showing that $\Delta(\theta)$ and $C(\theta)$ are in this model simply proportional.

Turning now to the volatility $\sigma(\theta)$, one finds that it is given by:

[...]

¹⁹ This assumption is certainly inadequate for small times, where large kurtosis effects are present. However, on the scale of months, these non-Gaussian effects can be considered as small.
²⁰ In reality, one should also take into account the fact that $a\sigma(0)$ can vary with time. This brings an extra contribution both to $C(\theta)$ and to $\sigma(\theta)$.

We thus see that the maximum of $\sigma(\theta)$ is indeed related to that of $C(\theta)$. Intuitively, the reason for the volatility maximum is as follows: a variation in the spot rate changes that
compared to M , one should expect that the determination of the oov*i'.
~ I I ~ I ~ C L ~ S
1
I
1 is noisy, and therefore that the empirical correiation matrix is to a large extent
/>Data
0.08     Theory. usithout sprzad voi. random. i.e. the structure of the matrix is dominated by 'measurement' noise. IF
 Theory. with sprzad vol. I
 this is the case, one should be very careful when using this correlation matrix in
I
0.07 1
?. I i applications. Fro111 this point of view, ir is interesting to compare the properties of
e I I an empirical correlation matrix C to a 'null hypothesis' purely rcrndonl matrix as
b
I i one could obtain from a finite time series of strictly uncorrelated assets. Deviations
1 I
0.06 1
from the random matrix case might then suggest the presence of true information."
005
_ 
The empirical correlation matrix C is constructed from the time series of price
 I
I
changes Sx; (where i labels the asset and k the time) through the equation:
r 4 I
I
0.04
0 2 4 6 8
Q (years)
In the following we assume that the average value of the &x's has been subtracted
Fig. 2.24. Comparison between the theoretical prediction and the observed daily volatility
of the forward rate at maturity 8 , in the period 199446. The dotted line corresponds to Eq. off, and that the 6x's are rescaled to have a constant unit volatility. The null
(2.38) with 0~= a ( 0 ) ,and the full line is obtained by adding the effect of the variation of 1 hypothesis of independent assets, which we consider now, translates itself in
I
[...] the coefficient $a_0(\theta)$ in Eq. (2.33), which adds a contribution proportional to $\theta$.

[...] market anticipation for the trend $m_1(t)$. But this change of trend obviously has a larger effect when multiplied by a longer maturity. For maturities beyond 1 year, however, the decay of the persistence function comes into play and the volatility decreases again. The relation Eq. (2.38) is tested against real data in Figure 2.24. An important prediction of the model is that the deformation of the FRC should be strongly correlated with the past trend of the spot rate, averaged over a time scale $1/\epsilon$ (see Eq. (2.34)). This correlation has been convincingly established recently, with $1/\epsilon \simeq 100$ days.²¹

2.7 Correlation matrices (*)

As we shall see in Chapter 3, an important aspect of risk management is the estimation of the correlations between the price movements of different assets. The probability of large losses for a certain portfolio or option book is dominated by correlated moves of its different constituents; for example, a position which is simultaneously long in stocks and short in bonds will be risky because stocks and bonds move in opposite directions in crisis periods. The study of correlation (or covariance) matrices thus has a long history in finance, and is one of the cornerstones of Markowitz's theory of optimal portfolios (see Section 3.3). However, a reliable empirical determination of a correlation matrix turns out to be difficult: if one considers $M$ assets, the correlation matrix contains $M(M-1)/2$ entries, which [...]

[...] the assumption that the coefficients $\delta x_k^i$ are independent, identically distributed, random variables.²³ The theory of random matrices, briefly expounded in Section 1.8, allows one to compute the density of eigenvalues of $\mathbf{C}$, $\rho_C(\lambda)$, in the limit of very large matrices: it is given by Eq. (1.120), with $Q = N/M$.

Now, we want to compare the empirical distribution of the eigenvalues of the correlation matrix of stocks corresponding to different markets with the theoretical prediction given by Eq. (1.120), based on the assumption that the correlation matrix is random. We have studied numerically the density of eigenvalues of the correlation matrix of $M = 406$ assets of the S&P 500, based on daily variations during the years 1991-96, for a total of $N = 1309$ days (the corresponding value of $Q$ is 3.22). An immediate observation is that the highest eigenvalue $\lambda_1$ is 25 times larger than the predicted $\lambda_{\max}$ (Fig. 2.25, inset). The corresponding eigenvector is, as expected, the 'market' itself, i.e. it has roughly equal components on all the $M$ stocks. The simplest 'pure noise' hypothesis is therefore inconsistent with the value of $\lambda_1$. A more reasonable idea is that the components of the correlation matrix which are orthogonal to the 'market' are pure noise. This amounts to subtracting the contribution of $\lambda_{\max}$ from the nominal value $\sigma^2 = 1$, leading to $\sigma^2 = 1 - \lambda_{\max}/M = 0.85$. The corresponding fit of the empirical distribution is shown as a dotted line in Figure 2.25. Several eigenvalues are still above $\lambda_{\max}$ and might contain some information, thereby reducing the variance of the effectively [...]
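The comparison with the 'pure noise' hypothesis described in this section can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses the standard Marchenko-Pastur band edges for $Q = N/M \geq 1$ (assumed here to be what Eq. (1.120) implies), replaces the S&P 500 data by simulated iid Gaussian series, and takes $\lambda_1 \approx 25\,\lambda_{\max}$ from the text. The function names are hypothetical.

```python
import numpy as np

def noise_band_edges(Q, sigma2=1.0):
    """Edges of the eigenvalue density of a pure-noise correlation matrix.

    Standard Marchenko-Pastur result for Q = N/M >= 1 (assumed to match
    Eq. (1.120)): lambda_max/min = sigma2 * (1 + 1/Q +/- 2*sqrt(1/Q)).
    """
    root = 2.0 * np.sqrt(1.0 / Q)
    return sigma2 * (1.0 + 1.0 / Q - root), sigma2 * (1.0 + 1.0 / Q + root)

def noise_correlation_spectrum(N, M, seed=0):
    """Eigenvalues of the empirical correlation matrix of M iid series of length N."""
    rng = np.random.default_rng(seed)
    dx = rng.standard_normal((N, M))
    dx = (dx - dx.mean(axis=0)) / dx.std(axis=0)  # zero mean, unit variance per series
    C = dx.T @ dx / N                             # M x M empirical correlation matrix
    return np.linalg.eigvalsh(C)

N, M = 1309, 406            # number of days and of assets quoted in the text
Q = N / M
lam_min, lam_max = noise_band_edges(Q)
eigs = noise_correlation_spectrum(N, M)   # simulated noise, not market data

# The text reports lambda_1 of about 25 times the predicted upper edge.
lambda_1 = 25.0 * lam_max
sigma2_residual = 1.0 - lambda_1 / M      # variance left for the 'noise' part

print(f"Q = {Q:.2f}, predicted noise band = [{lam_min:.3f}, {lam_max:.3f}]")
print(f"simulated noise spectrum = [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"residual variance after removing the market mode = {sigma2_residual:.2f}")
```

With real data, eigenvalues lying well above the upper edge of the band are the candidates for genuine information; the simulated spectrum only shows what pure noise looks like for the same $N$ and $M$.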
$$ \delta x_k = \frac{1}{\lambda} \sum_A N(A)\,\varphi(A), \qquad (2.41) $$

where $\varphi(A)$ is the common opinion of all operators belonging to $A$. The statistics of the price increments $\delta x_k$ therefore reduces to the statistics of the size of clusters, a classical problem in percolation theory [Stauffer]. One finds that as long as $p < 1$ (less than one 'neighbour' on average with whom one can exchange information), then all $N(A)$'s are small compared with the total number of traders $M$. More precisely, the distribution of cluster sizes takes the following form in the limit where $1 - p = \epsilon \ll 1$:

[...] (2.42)

When $p = 1$ (percolation threshold), the distribution becomes a pure power-law with an exponent $1 + \mu = 5/2$, and the CLT tells us that in this case, the distribution of the price increments $\delta x$ is precisely a pure symmetric Lévy distribution of index $\mu = 3/2$ (assuming that $\varphi = \pm 1$ play identical roles, that is if there is no global bias pushing the price up or down). If $p < 1$, on the other hand, one finds that the Lévy [...]

[...] Eq. (2.42) is valid, and non-trivial statistics follows. One should thus explain why the value of $p$ spontaneously stabilizes in the neighbourhood of the critical value $p = 1$. Certain models do actually have this property, of being close to or at a critical point without having to fine tune any of their parameters. These models are called 'self-organized critical' [Bak et al.]. In this spirit, let us mention a very recent model of Sethna et al. [Dahmen and Sethna], meant to describe the behaviour of magnets in a time dependent magnetic field. Transposed to the present problem, this model describes the collective behaviour of a set of traders exchanging information, but having all different a priori opinions. One trader can [...]

2.9 A simple model with volatility correlations and tails (*)

In this section, we show that a very simple feedback model where past high values of the volatility influence the present market activity does lead to tails in the probability distribution and, by construction, to volatility correlations. The present model is close in spirit to the ARCH models which have been much discussed in this context. The idea is to write:

$$ x_{k+1} = x_k + \sigma_k \xi_k, \qquad (2.43) $$

where $\xi_k$ is a random variable of unit variance, and to postulate that the present day volatility $\sigma_k$ depends on how the market feels the past market volatility. If the past price variations happened to be high, the market interprets this as a reason to be more nervous and increases its activity, thereby increasing $\sigma_k$. One could therefore consider, as a toy-model:²⁵

$$ \sigma_{k+1} - \sigma_0 = (1 - \epsilon)(\sigma_k - \sigma_0) + \lambda\epsilon\,|\sigma_k \xi_k|, \qquad (2.44) $$

which means that the market takes as an indicator of the past day activity the [...]

[...] and going to a continuous-time formulation, one finds that the volatility probability distribution $P(\sigma, t)$ obeys the following 'Fokker-Planck' equation:

$$ \frac{\partial P(\sigma, t)}{\partial t} = \epsilon\,\frac{\partial\,(\sigma - \tilde\sigma_0)\,P(\sigma, t)}{\partial \sigma} + c^2 \epsilon^2\,\frac{\partial^2\,\sigma^2 P(\sigma, t)}{\partial \sigma^2}, \qquad (2.46) $$

where $\tilde\sigma_0 = \sigma_0 - \lambda\epsilon\langle|\sigma\xi|\rangle$, and where $c^2$ is the variance of the noise $|\xi|$. The equilibrium solution of this equation, $P_e(\sigma)$, is obtained by setting the left-hand
side to zero. One finds:

$$ P_e(\sigma) \propto \frac{\exp\left(-\tilde\sigma_0 / (c^2 \epsilon\,\sigma)\right)}{\sigma^{1+\mu}}, $$

with $\mu = 1 + (c^2\epsilon)^{-1} > 1$. Now, for a large class of distributions for the random noise $\xi_k$, for example Gaussian, it is easy to show, using a saddle-point calculation, that the tails of the distribution of $\delta x$ are power-laws, with the same exponent $\mu$. Interestingly, a short-memory market, corresponding to $\epsilon \simeq 1$, has much wilder tails than a long-memory market: in the limit $\epsilon \to 0$, one indeed has $\mu \to \infty$. In other words, over-reaction is a potential cause for power-law tails.

²⁵ In the simplest ARCH model, the following equation is rather written in terms of the variance, and second [...]

2.10 Conclusion

The above statistical analysis reveals very important differences between the simple model usually adopted to describe price fluctuations, namely the geometric (continuous-time) Brownian motion, and the rather involved statistics of real price changes. The geometric Brownian motion description is at the heart of most theoretical work in mathematical finance, and can be summarized as follows:

• One assumes that the relative returns (rather than the absolute price increments) are independent random variables.
• One assumes that the elementary time scale $\tau$ tends to zero; in other words that the price process is a continuous-time process. It is clear that in this limit, the number of independent price changes in an interval of time $T$ is $N = T/\tau \to \infty$. One is thus in the limit where the CLT applies whatever the time scale $T$.

If the variance of the returns is finite, then according to the CLT, the only possibility is that price changes obey a log-normal distribution. The process is also scale invariant, that is, its statistical properties do not depend on the chosen time scale (up to a multiplicative factor; see Section 1.5.3).²⁶

The main difference between this model and real data is not only that the tails of the distributions are very poorly described by a Gaussian law, but also that several important time scales appear in the analysis of price changes:

• A 'microscopic' time scale $\tau$ below which price changes are correlated. This time scale is of the order of several minutes even on very liquid markets.
• A time scale $T^* = N^*\tau$, which corresponds to the time where non-Gaussian effects begin to smear out, beyond which the CLT begins to operate. This time scale $T^*$ depends much on the initial kurtosis on scale $\tau$. As a first approximation, one has: $T^* = \kappa_1 \tau$, which is equal to several days, even on very liquid markets.
• A time scale corresponding to the correlation time of the volatility fluctuations, which is of the order of 10 days to a month or even longer.
• And finally a time scale $T_\sigma$ governing the crossover from an additive model, where absolute price changes are the relevant random variables, to a multiplicative model, where relative returns become relevant. This time scale is also of the order of months.

It is clear that the existence of all these time scales is extremely important to take into account in a faithful representation of price changes, and plays a crucial role both in the pricing of derivative products, and in risk control. Different assets differ in the value of their kurtosis, and in the value of these different time scales. For this reason, a description where the volatility is the only parameter (as is the case for Gaussian models) is bound to miss a great deal of the reality.

²⁶ This scale invariance is more general than the Gaussian model discussed here, and is the basic assumption underlying all 'fractal' descriptions of financial markets. These descriptions fail to capture the existence of several important time scales that we discuss here.

2.11 References

Scaling and Fractals in Financial Markets

B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, San Francisco, 1982.
B. B. Mandelbrot, Fractals and Scaling in Finance, Springer, New York, 1997.
J. Lévy-Véhel, Ch. Walter, Les marchés fractals : efficience, ruptures et tendances sur les marchés financiers, P.U.F., to appear.
U. A. Müller, M. M. Dacorogna, R. B. Olsen, O. V. Pictet, M. Schwartz, C. Morgenegg, Statistical study of foreign exchange rates, empirical evidence of a price change scaling law, and intraday analysis, Journal of Banking and Finance, 14, 1189 (1990).
Z. Ding, C. W. J. Granger, R. F. Engle, A long memory property of stock market returns and a new model, Journal of Empirical Finance, 1, 83 (1993).
D. M. Guillaume et al., From the birds eye to the microscope, Finance and Stochastics, 1, 2 (1997).
S. Ghashghaie, W. Breymann, J. Peinke, P. Talkner, Y. Dodge, Turbulent cascades in foreign exchange markets, Nature, 381, 767 (1996).
A. Arnéodo, J. F. Muzy, D. Sornette, Causal cascade in the stock market from the 'infrared' to the 'ultraviolet', European Journal of Physics, B 2, 277 (1998).
R. Mantegna, H. E. Stanley, An Introduction to Econophysics, Cambridge University Press, Cambridge, 1999.
I. Kondor, J. Kertész (eds), Econophysics, an Emerging Science, Kluwer, Dordrecht, 1999.

Other descriptions (ARCH, etc.)

R. Engle, ARCH with estimates of the variance of the United Kingdom inflation, Econometrica, 50, 987 (1982).
T. Bollerslev, Generalized ARCH, Journal of Econometrics, 31, 307 (1986).
C. Gourieroux, A. Montfort, Statistics and Econometric Models, Cambridge University Press, Cambridge, 1996.
E. Eberlein, U. Keller, Option pricing with hyperbolic distributions, Bernoulli, 1, 281 (1995).

The interest rate curve

L. Hughston (ed.), Vasicek and Beyond, Risk Publications, London, 1996.
J. C. Hull, Options, Futures and Other Derivatives, Prentice-Hall, Upper Saddle River, NJ, 1997.
Percolation, collective models and self-organized criticality

D. Stauffer, Introduction to Percolation, Taylor & Francis, London, 1985.
A. Orléan, Bayesian interactions and collective dynamics of opinion, Journal of Economic Behaviour and Organization, 28, 257 (1995).
P. Bak, How Nature Works: The Science of Self-Organized Criticality, Copernicus, Springer, New York, 1996.
K. Dahmen, J. P. Sethna, Hysteresis, avalanches, and disorder induced critical scaling, Physical Review B, 53, 14872 (1996).

Other recent market models

H. Takayasu, H. Miura, T. Hirabayashi, K. Hamada, Statistical properties of deterministic threshold elements: the case of market prices, Physica, A184, 127-134 (1992).
M. Lévy, H. Lévy, S. Solomon, Microscopic simulation of the stock market, Journal de Physique, I5, 1087 (1995).
D. Challet, Y. C. Zhang, Emergence of cooperation and organization in an evolutionary game, Physica, A246, 407 (1997).
P. Bak, M. Paczuski, M. Shubik, Price variations in a stock market with many agents, Physica, A246, 430 (1997).
J.-P. Bouchaud, R. Cont, A Langevin approach to stock market fluctuations and crashes, European Journal of Physics, B 6, 543 (1998).
T. Lux, M. Marchesi, Scaling and criticality in a stochastic multiagent model, Nature, 397, 498 (1999).

Extreme risks and optimal portfolios

Il n'est plus grande folie que de placer son salut dans l'incertitude. [There is no greater folly than to stake one's salvation on uncertainty.]
(Madame de Sévigné, Lettres.)

3.1 Risk measurement and diversification

Measuring and controlling risks is now one of the major concerns across all modern human activities. The financial markets, which act as highly sensitive economical and political thermometers, are no exception. One of their roles is actually to allow the different actors in the economic world to trade their risks, to which a price must therefore be given.

The very essence of the financial markets is to fix thousands of prices all day long, thereby generating enormous quantities of data that can be analysed statistically. An objective measure of risk therefore appears to be easier to achieve in finance than in most other human activities, where the definition of risk is vaguer, and the available data often very poor. Even if a purely statistical approach to financial risks is itself a dangerous scientists' dream (see e.g. Fig. 1.1), it is fair to say that this approach has not been fully exploited until the very recent years, and that many improvements can be expected in the future, in particular concerning the control of extreme risks. The aim of this chapter is to introduce some classical ideas on financial risks, to illustrate their weaknesses, and to propose several theoretical ideas devised to handle more adequately the 'rare events' where the true financial risk resides.
The volatility is in general chosen as an adequate measure of risk associated to a given investment. We notice however that this definition includes in a symmetrical way both abnormal gains and abnormal losses. This fact is a priori curious. The theoretical foundations behind this particular definition of risk are numerous:

• First, operational; the computations involving the variance are relatively simple and can be generalized easily to multi-asset portfolios.
• Second, the Central Limit Theorem (CLT) presented in Chapter 1 seems to provide a general and solid justification: by decomposing the motion from $x_0$ to $x(T)$ in $N = T/\tau$ increments, one can write:

$$ x(T) = x_0 + \sum_{k=0}^{N-1} \delta x_k \quad \text{with} \quad \delta x_k = x_k \eta_k, \qquad (3.3) $$

where $x_k = x(t = k\tau)$ and $\eta_k$ is by definition the instantaneous return. Therefore, we have:

$$ R(T) = \log\left[\frac{x(T)}{x_0}\right] = \sum_{k=0}^{N-1} \log(1 + \eta_k). $$

In the classical approach one assumes that the returns $\eta_k$ are independent variables. From the CLT we learn that in the limit where $N \to \infty$, $R(T)$ becomes a Gaussian random variable centred on a given average return $\tilde m T$, with $\tilde m = \langle \log(1 + \eta_k) \rangle / \tau$, and whose standard deviation is given by $\sigma\sqrt{T}$.¹ Therefore, in this limit, the entire probability distribution of $R(T)$ is parameterized by two quantities only, $\tilde m$ and $\sigma$: any reasonable measure of risk must therefore be based on $\sigma$.

¹ To second order in $\eta_k \ll 1$, we find: $\sigma^2 = \langle \eta^2 \rangle / \tau$ and $\tilde m = \langle \eta \rangle / \tau - \sigma^2/2$.

[...] quantity $\sigma\sqrt{T}$ gives us the order of magnitude of the deviation from the expected return. By comparing the two terms in the exponential, one finds that when $T \gg \hat T \equiv \sigma^2/\tilde m^2$, the expected return becomes more important than the fluctuations, which means that the probability that $x(T)$ is smaller than $x_0$ (and that the actual rate of return over that period is negative) becomes small. The 'security horizon' $\hat T$ increases with $\sigma$. For a typical individual stock, one has $\tilde m = 10\%$ per year and $\sigma = 20\%$ per year, which leads to a $\hat T$ as long as 4 years!

The quality of an investment is often measured by its 'Sharpe ratio' $S$, that is, the 'signal-to-noise' ratio of the mean return $\tilde m T$ to the fluctuations $\sigma\sqrt{T}$:²

$$ S = \frac{\tilde m \sqrt{T}}{\sigma} \equiv \sqrt{\frac{T}{\hat T}}. $$

The Sharpe ratio increases with the investment horizon and is equal to 1 precisely when $T = \hat T$. Practitioners usually define the Sharpe ratio for a 1-year horizon.

Note that the most probable value of $x(T)$, as given by Eq. (3.5), is equal to $x_0 \exp(\tilde m T)$, whereas the mean value of $x(T)$ is higher: assuming that $\xi$ is Gaussian, one finds: $x_0 \exp[\tilde m T + \sigma^2 T/2]$. This difference is due to the fact that the returns $\eta_k$, rather than the absolute increments $\delta x_k$, are iid random variables. However, if $T$ is short (say up to a few months), the difference between the two descriptions is hard to detect. As explained in Section 2.2.1, a purely additive description is actually more adequate at short times. In other words, we shall often in the following write $x(T)$ as:

$$ x(T) = x_0 \exp\left[\tilde m T + \sigma\sqrt{T}\,\xi\right] \simeq x_0 + mT + \sqrt{DT}\,\xi, $$

where we have introduced the following notations: $m \equiv \tilde m x_0$, $D \equiv \sigma^2 x_0^2$, which we shall use throughout the following. The non-Gaussian nature of the random variable $\xi$ is therefore the most important factor determining the probability for extreme risks.

² It is customary to subtract from the mean return $\tilde m$ the risk-free rate in the definition of the Sharpe ratio.
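The 'security horizon' and the Sharpe ratio discussed above are simple enough to check by hand, but a short sketch makes the orders of magnitude explicit. The function names are hypothetical, and the probability of a negative return assumes that $R(T)$ is exactly Gaussian with mean $\tilde m T$ and standard deviation $\sigma\sqrt{T}$, which the text only justifies in the centre of the distribution.

```python
import math

def security_horizon(m_tilde, sigma):
    """T_hat = sigma^2 / m_tilde^2 (in years if both inputs are annualised)."""
    return sigma ** 2 / m_tilde ** 2

def sharpe_ratio(m_tilde, sigma, T):
    """Signal-to-noise ratio m_tilde*T / (sigma*sqrt(T)) over a horizon T."""
    return m_tilde * math.sqrt(T) / sigma

def prob_negative_return(m_tilde, sigma, T):
    """P[R(T) < 0] under a Gaussian assumption for R(T): Phi(-S)."""
    s = sharpe_ratio(m_tilde, sigma, T)
    return 0.5 * math.erfc(s / math.sqrt(2.0))

m_tilde, sigma = 0.10, 0.20                 # typical stock quoted in the text
T_hat = security_horizon(m_tilde, sigma)    # about 4 years
S_one_year = sharpe_ratio(m_tilde, sigma, 1.0)
S_at_T_hat = sharpe_ratio(m_tilde, sigma, T_hat)
p_loss_at_T_hat = prob_negative_return(m_tilde, sigma, T_hat)

print(f"T_hat = {T_hat:.1f} years, 1-year Sharpe ratio = {S_one_year:.2f}")
print(f"P[negative return at T_hat] = {p_loss_at_T_hat:.3f} (Gaussian assumption)")
```

Even at $T = \hat T$, where the Sharpe ratio equals 1, the Gaussian estimate still leaves a sizeable probability of a negative return; the horizon must be several times $\hat T$ before it becomes small.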
3.1.2 Risk of loss and 'Value at Risk' (VaR)

The fact that financial risks are often described using the volatility is actually intimately related to the idea that the distribution of price changes is Gaussian. In the extreme case of 'Lévy fluctuations', for which the variance is infinite, this definition of risk would obviously be meaningless. Even in a less risky world, this measure of risk has three major drawbacks:

• The financial risk is obviously associated to losses and not to profits. A definition of risk where both events play symmetrical roles is thus not in conformity with the intuitive notion of risk, as perceived by professionals.
• As discussed at length in Chapter 1, a Gaussian model for the price fluctuations is never justified for the extreme events, since the CLT only applies in the centre of the distributions. Now, it is precisely these extreme risks that are of most concern for all financial houses, and thus those which need to be controlled in priority. In recent years, international regulators have tried to impose some rules to limit the exposure of banks to these extreme risks.
• The presence of extreme events in the financial time series can actually lead to a very bad empirical determination of the variance: its value can be substantially changed by a few 'big days'. A bold solution to this problem is simply to remove the contribution of these so-called aberrant events! This rather absurd solution is actually quite commonly used.

Both from a fundamental point of view, and for a better control of financial risks, another definition of risk is thus needed. An interesting notion that we shall develop now is the probability of extreme losses, or, equivalently, the 'value-at-risk' (VaR).

The probability to lose an amount $-\delta x$ larger than a certain threshold $\Lambda$ on a given time horizon $\tau$ is defined as:

$$ \mathcal{P}[\delta x < -\Lambda] = \int_{-\infty}^{-\Lambda} P_\tau(\delta x)\, d\delta x, $$

where $P_\tau(\delta x)$ is the probability density for a price change on the time scale $\tau$. One can alternatively define the risk as a level of loss (the 'VaR') $\Lambda_{\rm VaR}$ corresponding to a certain probability of loss $P_{\rm VaR}$ over the time interval $\tau$ (for example, $P_{\rm VaR} =$ [...]

[...] exceed $\Lambda_{\rm VaR}$. Similarly, this definition does not take into account the value of the maximal loss 'inside' the period $\tau$. In other words, only the closing price over the period $[k\tau, (k+1)\tau]$ is considered, and not the lowest point reached during this time interval: we shall come back on these temporal aspects of risk in Section 3.1.3.

More precisely, one can discuss the probability distribution $P(\Lambda, N)$ for the worst daily loss $\Lambda$ (we choose $\tau = 1$ day to be specific) on a temporal horizon $T_{\rm VaR} = N\tau = \tau/P_{\rm VaR}$. Using the results of Section 1.4, one has:

$$ P(\Lambda, N) = N \left[1 - \mathcal{P}[\delta x < -\Lambda]\right]^{N-1} P_\tau(-\Lambda). $$

For $N$ large, this distribution takes a universal shape that only depends on the asymptotic behaviour of $P_\tau(\delta x)$ for $\delta x \to -\infty$. In the important case for practical applications where $P_\tau(\delta x)$ decays faster than any power-law, one finds that $P(\Lambda, N)$ is given by Eq. (1.40), which is represented in Figure 3.1. This [...]

Fig. 3.1. Extreme value distribution (the so-called Gumbel distribution) $P(\Lambda, N)$ when $P_\tau(\delta x)$ decreases faster than any power-law. The most probable value, $\Lambda_{\max}$, has a probability equal to 0.63 to be exceeded.
[...] worst day observed the following year. It is clear that the Gaussian prediction is systematically over-optimistic. The exponential model leads to a number which is seven times out of 11 below the observed result, which is indeed the expected result (Fig. 3.1).

In the general case, however, there is no useful link between $\sigma$ and $\Lambda_{\rm VaR}$. In some cases, decreasing one actually increases the other (see below, Sections 3.2.3 and 3.4). Let us nevertheless mention the Chebyshev inequality, often invoked by the volatility fans, which states that if $\sigma$ exists,

$$ \Lambda_{\rm VaR}^2 \leq \frac{\sigma^2 x_0^2 \tau}{P_{\rm VaR}}. $$

[...]

Note finally that the measure of risk as a loss probability keeps its meaning even if the variance is infinite, as for a Lévy process. Suppose indeed that $P_\tau(\delta x)$ decays very slowly when $\delta x$ is very large, as:

$$ P_\tau(\delta x) \simeq \frac{\mu A^\mu}{|\delta x|^{1+\mu}} \quad (\delta x \to -\infty), $$

with $\mu < 2$, such that $\langle \delta x^2 \rangle = \infty$. $A^\mu$ is the 'tail amplitude' of the distribution $P_\tau$; $A$ gives the order of magnitude of the probable values of $\delta x$. The calculation of $\Lambda_{\rm VaR}$ is immediate and leads to:

$$ \Lambda_{\rm VaR} = A\, P_{\rm VaR}^{-1/\mu}, $$

which shows that in order to minimize the VaR one should minimize $A^\mu$, independently of the probability level $P_{\rm VaR}$.
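The relation between the tail amplitude and the value-at-risk for a power-law tail can be illustrated as follows. This is a minimal sketch, assuming that the pure power-law form $\mu A^\mu / |\delta x|^{1+\mu}$ holds all the way up to $\Lambda_{\rm VaR}$; the function names and the numerical values of $A$ and $\mu$ are hypothetical.

```python
def tail_probability(A, mu, threshold):
    """P[dx < -threshold]: integral of mu*A^mu/|dx|^(1+mu) from -inf to -threshold."""
    return (A / threshold) ** mu

def var_power_law(A, mu, p_var):
    """Loss level exceeded with probability p_var: Lambda_VaR = A * p_var^(-1/mu)."""
    return A * p_var ** (-1.0 / mu)

A, mu = 1.0, 1.5          # illustrative tail scale (e.g. in %) and tail exponent
var_1pct = var_power_law(A, mu, 0.01)
var_5pct = var_power_law(A, mu, 0.05)

# Round trip: the tail probability beyond the VaR is the chosen level.
p_back = tail_probability(A, mu, var_1pct)

# Halving A halves the VaR at every confidence level, which is why minimising
# the tail amplitude minimises the VaR independently of P_VaR.
ratio_1pct = var_power_law(A / 2, mu, 0.01) / var_1pct
ratio_5pct = var_power_law(A / 2, mu, 0.05) / var_5pct

print(f"VaR(1%) = {var_1pct:.2f}, VaR(5%) = {var_5pct:.2f}, check P = {p_back:.4f}")
```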
As we have noted in Chapter 1, $A^\mu$ is actually the natural generalization of the variance in this case. Indeed, the Fourier transform $\hat P_\tau(z)$ of $P_\tau$ behaves, for small $z$, as $\exp(-b_\mu A^\mu |z|^\mu)$ for $\mu < 2$ ($b_\mu$ is a certain numerical factor), and simply as $\exp(-D\tau z^2/2)$ for $\mu > 2$, where $A^2 = D\tau$ is precisely the variance of $P_\tau$.

Fig. 3.3. Cumulative probability distribution of losses (Rank histogram) for the S&P 500 for the period 1989-98. Shown is the daily loss $(X_{\rm cl} - X_{\rm op})/X_{\rm op}$ (thin line, left axis) and the intraday loss $(X_{\rm lo} - X_{\rm op})/X_{\rm op}$ (thick line, right axis). Note that the right axis is shifted downwards by a factor of two with respect to the left one, so in theory the two lines should fall on top of one another.

3.1.3 Temporal aspects: drawdown and cumulated loss

Worst low

A first problem which arises is the following: we have defined $\mathcal{P}$ as the probability for the loss observed at the end of the period $[k\tau, (k+1)\tau]$ to be at least equal to $\Lambda$. However, in general, a worse loss still has been reached within this time interval. What is then the probability that the worst point reached within the interval (the 'low') is at least equal to $\Lambda$? The answer is easy for symmetrically distributed increments (we thus neglect the average return $m\tau \ll \Lambda$, which is justified for small enough time intervals): this probability is simply equal to $2\mathcal{P}$,

$$ \mathcal{P}[X_{\rm lo} - X_{\rm op} < -\Lambda] = 2\,\mathcal{P}[X_{\rm cl} - X_{\rm op} < -\Lambda], \qquad (3.15) $$

where $X_{\rm op}$ is the value at the beginning of the interval (open), $X_{\rm cl}$ at the end (close) and $X_{\rm lo}$ the lowest value in the interval (low). The reason for this is that for each trajectory just reaching $-\Lambda$ between $k\tau$ and $(k+1)\tau$, followed by a path which ends up above $-\Lambda$ at the end of the period, there is a mirror path with precisely the same weight which reaches a 'close' value beyond $-\Lambda$.† This factor of 2 in cumulative probability is illustrated in Figure 3.3. Therefore, if one wants to take into account the possibility of further loss within the time interval of interest in the computation of the VaR, one should simply divide by a factor 2 the probability level $P_{\rm VaR}$ appearing in Eq. (3.9).

† In fact, this argument, which dates back to Bachelier himself (1900), assumes that the moment where the trajectory reaches the point $-\Lambda$ can be precisely identified. This is not the case for a discontinuous process, for which the doubling of $\mathcal{P}$ is only an approximate result.

A simple consequence of this 'factor 2' rule is the following: the average value of the maximal excursion of the price during time $\tau$, given by the high minus the low over that period, is equal to twice the average of the absolute value of the variation from the beginning to the end of the period ($\langle |\text{open} - \text{close}| \rangle$). This relation can be tested on real data; the corresponding empirical factor is reported in Table 3.2.

Table 3.2. Average value of the absolute value of the open/close daily returns and maximum daily range (high-low) over open for S&P 500, DEM/$ and Bund. Note that the ratio of these two quantities is indeed close to 2. Data from 1989 to 1998.

Asset      |open-close| / open   (high-low) / open   Ratio
S&P 500    0.472%                0.924%              1.96
DEM/$      0.392%                0.804%              2.05
Bund       0.250%                0.496%              1.98

Cumulated losses

Another very important aspect of the problem is to understand how losses can accumulate over successive time periods. For example, a bad day can be followed by several other bad days, leading to a large overall loss. One would thus like to estimate the most probable value of the worst week, month, etc. In other words, one would like to construct the graph of $\Lambda_{\rm VaR}(N\tau)$, for a fixed overall investment horizon $T$.

The answer is straightforward in the case where the price increments are independent random variables. When the elementary distribution $P_\tau$ is Gaussian, $P_{N\tau}$ is also Gaussian, with a variance multiplied by $N$. At the same time, the number of different intervals of size $N\tau$ for a fixed investment horizon $T$ decreases by a factor $N$. For large enough $T$, one then finds:

[...] (3.16)

where the notation $|_T$ means that the investment period is fixed.⁵ The main effect is then the increase of the volatility, up to a small logarithmic correction.
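The 'factor 2' rule for the worst low, Eq. (3.15), and its consequence for the high-low range can be checked on simulated paths. The sketch below assumes driftless Gaussian increments and approximates each 'day' by a discrete random walk of `n_steps` steps (both are assumptions, as are the function name and parameters). Because the simulated path is discrete, the low is slightly underestimated and both ratios come out a little below 2, in line with the remark that the doubling is only approximate for discontinuous processes.

```python
import numpy as np

def simulate_days(n_days=10000, n_steps=400, seed=1):
    """Driftless Gaussian intraday paths with unit daily variance, open at 0."""
    rng = np.random.default_rng(seed)
    steps = rng.standard_normal((n_days, n_steps)) / np.sqrt(n_steps)
    paths = np.cumsum(steps, axis=1)
    close = paths[:, -1]
    low = np.minimum(paths.min(axis=1), 0.0)    # the open itself counts as a point
    high = np.maximum(paths.max(axis=1), 0.0)
    return close, low, high

close, low, high = simulate_days()

threshold = 1.0                                  # loss level, in daily standard deviations
p_close = np.mean(close < -threshold)            # P[X_cl - X_op < -Lambda]
p_low = np.mean(low < -threshold)                # P[X_lo - X_op < -Lambda]
prob_ratio = p_low / p_close

# <high - low> compared with <|close - open|>
range_ratio = np.mean(high - low) / np.mean(np.abs(close))

print(f"P[low] / P[close] = {prob_ratio:.2f}")
print(f"<high-low> / <|close-open|> = {range_ratio:.2f}")
```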
The case where $P_\tau(\delta x)$ decreases as a power-law has been discussed in Chapter 1: for any $N$, the far tail remains a power-law (that is progressively 'eaten up' by the Gaussian central part of the distribution if the exponent $\mu$ of the power-law is greater than 2, but keeps its integrity whenever $\mu < 2$). For finite $N$, the largest moves will always be described by the power-law tail. Its amplitude $A^\mu$ is simply multiplied by a factor $N$ (cf. Section 1.5.2). Since the number of independent intervals is divided by the same factor $N$, one finds⁶

$$ \Lambda_{\rm VaR}(N\tau)|_T = A \left(\frac{N T}{N \tau}\right)^{1/\mu} = A \left(\frac{T}{\tau}\right)^{1/\mu}, $$

independently of $N$. Note however that for $\mu > 2$, the above result is only valid if $\Lambda_{\rm VaR}(N\tau)$ is located outside of the Gaussian central part, the size of which grows as $\sigma\sqrt{N}$ (Fig. 3.4). In the opposite case, the Gaussian formula (3.16) should be used.

One can of course ask a slightly different question, by fixing not the investment horizon $T$ but rather the probability of occurrence. This amounts to multiplying both the period over which the loss is measured and the investment horizon by the same factor $N$. In this case, one finds that:

$$ \Lambda_{\rm VaR}(N\tau) = N^{1/\mu}\, \Lambda_{\rm VaR}(\tau). $$

The growth of the value-at-risk with $N$ is thus faster for small $\mu$, as expected. The case of Gaussian fluctuations corresponds to $\mu = 2$.

Fig. 3.4. Distribution of cumulated losses for finite $N$: the central region, of width $\sim \sigma N^{1/2}$, is Gaussian. The tails, however, remember the specific nature of the elementary distribution $P_\tau(\delta x)$, and are in general fatter than Gaussian. The point is then to determine whether the confidence level $P_{\rm VaR}$ puts $\Lambda_{\rm VaR}(N\tau)$ within the Gaussian part or far in the tails.

⁵ One can also take into account the average return, $m_1 = \langle \delta x \rangle$. In this case, one must subtract from $\Lambda_{\rm VaR}(N\tau)|_T$ the quantity $m_1 N$ (the potential losses are indeed smaller if the average return is positive).
⁶ cf. previous footnote.

Drawdowns

One can finally analyse the amplitude of a cumulated loss over a period that is not a priori limited. More precisely, the question is the following: knowing that the price of the asset today is equal to $x_0$, what is the probability that the lowest price ever reached in the future will be $x_{\min}$; and how long will such a drawdown last? This is a classical problem in probability theory [Feller, vol. II, p. 404]. Let us denote as $\Lambda_{\max} = x_0 - x_{\min}$ the maximum amplitude of the loss. If $P_\tau(\delta x)$ decays at least exponentially when $\delta x \to -\infty$, the result is that the tail of the distribution of $\Lambda_{\max}$ behaves as an exponential:

$$ P(\Lambda_{\max}) \propto \exp\left(-\frac{\Lambda_{\max}}{\Lambda_0}\right), $$

where $\Lambda_0 > 0$ is the finite solution of the following equation:

$$ \int d\delta x\, \exp\left(-\frac{\delta x}{\Lambda_0}\right) P_\tau(\delta x) = 1. \qquad (3.20) $$

Note that for this last equation to be meaningful, $P_\tau(\delta x)$ should decrease at least as $\exp(-|\delta x|/\Lambda_0)$ for large negative $\delta x$'s. It is clear (see below) that if $P_\tau(\delta x)$ decreases as a power-law, say, the distribution of cumulated losses cannot decay exponentially, since it would then decay faster than that of individual losses!
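Equation (3.20) for the drawdown scale $\Lambda_0$ can be solved numerically for any sufficiently fast-decaying $P_\tau$. The following is a minimal sketch, assuming Gaussian increments with illustrative values $m\tau = 0.05$ and $D\tau = 1$; the function names, the integration grid and the bisection bracket are all hypothetical choices. For the Gaussian case the root can be compared with the closed form $\Lambda_0 = D/2m$, which follows from the Gaussian moment generating function.

```python
import numpy as np

def drawdown_condition(lam0, x, pdf):
    """Left-hand side of Eq. (3.20) minus 1, evaluated on a uniform grid."""
    dx = x[1] - x[0]
    return np.sum(np.exp(-x / lam0) * pdf) * dx - 1.0

def solve_lambda0(x, pdf, lo, hi, tol=1e-9):
    """Bisection for the finite root; the bracket must exclude lam0 -> infinity,
    where the condition is trivially satisfied."""
    f_lo = drawdown_condition(lo, x, pdf)
    f_hi = drawdown_condition(hi, x, pdf)
    if f_lo * f_hi > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = drawdown_condition(mid, x, pdf)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

m1, var = 0.05, 1.0                       # m*tau and D*tau (illustrative values)
s = np.sqrt(var)
x = np.linspace(m1 - 12 * s, m1 + 12 * s, 20001)
pdf = np.exp(-(x - m1) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

lam0 = solve_lambda0(x, pdf, lo=2.0, hi=100.0)
lam0_exact = var / (2 * m1)               # D / 2m for Gaussian increments

print(f"numerical Lambda_0 = {lam0:.4f}, closed form D/2m = {lam0_exact:.4f}")
```

The same solver can be fed an empirical histogram of increments in place of the Gaussian `pdf`, provided the negative tail decays fast enough for the integral to converge.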
It is interesting to show how this result can be obtained. Let us introduce the cumulative distribution $\mathcal{P}_>(\Lambda) = \int_\Lambda^\infty P(\Lambda_{\max})\, d\Lambda_{\max}$. The first step of the walk, of size $\delta x$, can either exceed $-\Lambda$, or stay above it. In the first case, the level $\Lambda$ is immediately reached. In the second case, one recovers the very same problem as initially, but with $-\Lambda$ shifted to $-\Lambda - \delta x$. Therefore, $\mathcal{P}_>(\Lambda)$ obeys the following equation:

$$ \mathcal{P}_>(\Lambda) = \int_{-\infty}^{-\Lambda} P_\tau(\delta x)\, d\delta x + \int_{-\Lambda}^{+\infty} P_\tau(\delta x)\, \mathcal{P}_>(\Lambda + \delta x)\, d\delta x. $$

If one can neglect the first term in the right-hand side for large $\Lambda$ (which should be self-consistently checked), then, asymptotically, $\mathcal{P}_>(\Lambda)$ should obey the following equation:

$$ \mathcal{P}_>(\Lambda) = \int_{-\Lambda}^{+\infty} P_\tau(\delta x)\, \mathcal{P}_>(\Lambda + \delta x)\, d\delta x. \qquad (3.22) $$

Inserting an exponential shape for $\mathcal{P}_>(\Lambda)$ then leads to Eq. (3.20) for $\Lambda_0$, in the limit $\Lambda \to \infty$. This result is however only valid if $P_\tau(\delta x)$ decays sufficiently fast for large negative $\delta x$'s, such that Eq. (3.20) has a non-trivial solution.

Let us study two simple cases:

• For Gaussian fluctuations, one has:

$$ P_\tau(\delta x) = \frac{1}{\sqrt{2\pi D\tau}} \exp\left(-\frac{(\delta x - m\tau)^2}{2D\tau}\right). $$

Equation (3.20) thus becomes:

$$ -\frac{m}{\Lambda_0} + \frac{D}{2\Lambda_0^2} = 0 \quad \longrightarrow \quad \Lambda_0 = \frac{D}{2m}. \qquad (3.24) $$

$\Lambda_0$ gives the order of magnitude of the worst drawdown. One can understand the above result as follows: $\Lambda_0$ is the amplitude of the probable fluctuation over the characteristic time scale $\hat T = D/m^2$ introduced above. By definition, for times shorter than $\hat T$, the average return $m$ is negligible. Therefore, one has: $\Lambda_0 \propto \sqrt{D\hat T} = D/m$.

If $m = 0$, the very idea of worst drawdown loses its meaning: if one waits a long enough time, the price can then reach arbitrarily low values. It is natural to introduce a quality factor $Q$ that compares the average return $m_1 = m\tau$ to the amplitude of the worst drawdown $\Lambda_0$. One thus has $Q = m_1/\Lambda_0 = 2m^2\tau/D \equiv 2\tau/\hat T$. The larger the quality factor, the smaller the time needed to end a drawdown period.

• The case of exponential tails is important in practice (cf. Chapter 2). Choosing for simplicity $P_\tau(\delta x) = (\alpha/2) \exp(-\alpha|\delta x - m\tau|)$, the equation for $\Lambda_0$ is found to be:

$$ \frac{\alpha^2}{\alpha^2 - \Lambda_0^{-2}} \exp\left(-\frac{m\tau}{\Lambda_0}\right) = 1. $$

In the limit where $m\tau\alpha \ll 1$ (i.e. that the potential losses over the time interval $\tau$ are much larger than the average return over the same period), one finds: $\Lambda_0 = 1/m\tau\alpha^2$.

More generally, if the width of $P_\tau(\delta x)$ is much smaller than $\Lambda_0$, one can expand the exponential and find that the above equation $-(m/\Lambda_0) + (D/2\Lambda_0^2) = 0$ is still valid.

How long will these drawdowns last? The answer can only be statistical. The probability for the time of first return $T$ to the initial investment level $x_0$, which one can take as the definition of a drawdown (although it could of course be immediately followed by a second drawdown), has, for large $T$, the following form:

$$ P(T) \simeq \frac{\tau^{1/2}}{T^{3/2}} \exp\left(-\frac{T}{\hat T}\right). $$

The probability for a drawdown to last much longer than $\hat T$ is thus very small. In this sense, $\hat T$ appears as the characteristic drawdown time. Note that it is not equal to the average drawdown time, which is on the order of $\sqrt{\tau \hat T}$, and thus much smaller than $\hat T$. This is related to the fact that short drawdowns have a large probability: as $\tau$ decreases, a large number of drawdowns of order $\tau$ appear, thereby reducing their average size.

It is intuitively clear that one should not put all one's eggs in the same basket. A diversified portfolio, composed of different assets with small mutual correlations, is less risky because the gains of some of the assets more or less compensate the loss of the others. Now, an investment with a small risk and small return must sometimes be preferred to a high yield, but very risky, investment.

The theoretical justification for the idea of diversification comes again from the CLT. A portfolio made up of $M$ uncorrelated assets and of equal volatility, with weight $1/M$, has an overall volatility reduced by a factor $\sqrt{M}$. Correspondingly, the amplitude (and duration) of the worst drawdown is divided by a factor $M$ (cf. Eq. (3.24)), which is obviously satisfying.

This qualitative idea stumbles over several difficulties. First of all, the fluctuations of financial assets are in general strongly correlated; this substantially decreases the possibility of true diversification, and requires a suitable theory to deal with these correlations and construct an 'optimal' portfolio: this is Markowitz's theory, to be detailed below. Furthermore, since price fluctuations can be strongly non-Gaussian, a volatility-based measure of risk might be unadapted: one should rather try to minimize the value-at-risk of the portfolio. It is therefore interesting to look for an extension of the classical formalism, allowing one to devise minimum VaR portfolios. This will be presented in the next sections.
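The effect of diversification, and the way correlations limit it, can be made concrete with a short calculation. This is a sketch under simple assumptions: $M$ assets of equal volatility, equal weights $1/M$, and a single common pairwise correlation `rho` (an illustrative parametrisation, not one taken from the text); the function names are hypothetical.

```python
import numpy as np

def equicorrelated_cov(M, sigma, rho):
    """Covariance matrix of M assets with volatility sigma and pairwise correlation rho."""
    return sigma ** 2 * ((1.0 - rho) * np.eye(M) + rho * np.ones((M, M)))

def portfolio_volatility(weights, cov):
    """Standard deviation of the portfolio return, sqrt(w^T C w)."""
    return float(np.sqrt(weights @ cov @ weights))

M, sigma = 100, 0.20
w = np.full(M, 1.0 / M)

vol_uncorrelated = portfolio_volatility(w, equicorrelated_cov(M, sigma, 0.0))
vol_correlated = portfolio_volatility(w, equicorrelated_cov(M, sigma, 0.3))

# Uncorrelated assets: volatility reduced by sqrt(M), so the variance D and the
# drawdown scale D/2m of Eq. (3.24) are reduced by M at fixed average return m.
drawdown_reduction = (sigma / vol_uncorrelated) ** 2

print(f"single asset: {sigma:.3f}, uncorrelated portfolio: {vol_uncorrelated:.3f}")
print(f"with rho = 0.3: {vol_correlated:.3f} (floor sigma*sqrt(rho) = {sigma * np.sqrt(0.3):.3f})")
```

With a common correlation the portfolio volatility cannot fall below $\sigma\sqrt{\rho}$ however large $M$ is, which is one way to see why strongly correlated assets leave little room for true diversification.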
Now, one is immediately confronted with the problem of defining properly an 'optimal' portfolio. Usually, one invokes the rather abstract concept of 'utility functions', on which we shall briefly comment in this section, in particular to show that it does not naturally accommodate the notion of value-at-risk.
We will call $W_T$ the wealth of a given operator at time $t = T$. One argues that the level of satisfaction of this operator is quantified by a certain function of $W_T$ only,7 which one usually calls the 'utility function' $U(W_T)$. This function is furthermore taken to be continuous and even twice differentiable. The postulated 'rational' behaviour for the operator is then to look for investments which maximize his expected utility, averaged over all possible histories of price changes:
$$\langle U(W_T)\rangle = \int P(W_T)\,U(W_T)\,dW_T.$$
The utility function should be non-decreasing: a larger profit is clearly always more satisfying. One can furthermore consider the case where the distribution $P(W_T)$ is sharply peaked around its mean value $\langle W_T\rangle = W_0 + mT$. Performing a Taylor expansion of $\langle U(W_T)\rangle$ around $U(W_0 + mT)$ to second order, one deduces that the utility function must be such that:
$$\frac{d^2U}{dW_T^2} < 0.$$
This property reflects the fact that for the same average return, a less risky investment should always be preferred.
A simple example of utility function compatible with the above constraints is the exponential function $U(W_T) = -\exp[-W_T/w_0]$. Note that $w_0$ has the dimensions of a wealth, thereby fixing a wealth scale in the problem. A natural candidate is the initial wealth of the operator, $w_0 \propto W_0$. If $P(W_T)$ is Gaussian, of mean $W_0 + mT$ and variance $DT$, one finds that the expected utility is given by:
$$\langle U\rangle = -\exp\left[-\frac{W_0}{w_0} - \frac{T}{w_0}\left(m - \frac{D}{2w_0}\right)\right].$$
One could think of constructing a utility function with no intrinsic wealth scale by choosing a power-law: $U(W_T) = (W_T/w_0)^{\alpha}$ with $\alpha < 1$ to ensure the correct convexity. Indeed, in this case a change of $w_0$ can be reabsorbed in a change of scale of the utility function itself. However, this definition cannot allow for negative final wealths, and is thus problematic.

Fig. 3.5. Example of a 'utility function' with thresholds, where the utility function is non-continuous. These thresholds correspond to special values for the profit, or the loss, and are often of purely psychological origin.

Despite the fact that these postulates sound reasonable, and despite the very large number of academic studies based on the concept of utility function, this axiomatic approach suffers from a certain number of fundamental flaws. For example, it is not clear that one could ever measure the utility function used by a given agent on the markets.8 The theoretical results are thus:
* Either relatively weak, because independent of the special form of the utility function, and only based on its general properties.
* Or rather arbitrary, because based on a specific, but unjustified, form for $U(W_T)$.
On the other hand, the idea that the utility function is regular is probably not always realistic. The satisfaction of an operator is often governed by rather sharp thresholds, separated by regions of indifference (Fig. 3.5). For example, one can be in a situation where a specific project can only be achieved if the profit $\Delta W = W_T - W_0$ exceeds a certain amount. Symmetrically, the clients of a fund manager will take their money away as soon as the losses exceed a certain value: this is the strategy of 'stop-losses', which fix a level for acceptable losses, beyond which the position is closed. The existence of option markets (which allow one to limit the potential losses below a certain level, see Chapter 4), or of items the price of which is $99 rather than $100, are concrete examples of the existence of these thresholds where 'satisfaction' changes abruptly. Therefore, the utility function $U$ is not necessarily continuous. This remark is actually intimately related to the fact that the value-at-risk is often a better measure of risk than the variance. Let us indeed assume that the operator is 'happy' if $\Delta W > -\Lambda$ and 'unhappy' whenever $\Delta W < -\Lambda$. This corresponds formally to the following utility function:
$$U_\Lambda(\Delta W) = \begin{cases} U_1 & (\Delta W > -\Lambda) \\ U_2 & (\Delta W < -\Lambda) \end{cases}$$
with $U_2 - U_1 < 0$.
The expected utility is then simply related to the loss probability:
$$\langle U_\Lambda\rangle = U_1 + (U_2 - U_1)\int_{-\infty}^{-\Lambda} P(\Delta W)\,d\Delta W.$$
…
$$m_\lambda = mT - \lambda\sqrt{DT},$$
where $\lambda$ is an arbitrary coefficient that measures the pessimism (or the risk aversion) of the operator. A rather natural procedure would then be to look for the optimal portfolio which maximizes the risk-corrected return $m_\lambda$. However, this optimal portfolio cannot be obtained using the standard utility function formalism. For example, Eq. (3.29) shows that the object which should be maximized is not $mT - \lambda\sqrt{DT}$ but rather $mT - DT/2w_0$. This amounts to comparing the average profit to the square of the potential losses, divided by the reference wealth scale $w_0$, a quantity that depends a priori on the operator.9 On the other hand, the quantity $mT - \lambda\sqrt{DT}$ is directly related (at least in a Gaussian world) to the value-at-risk $\Lambda_{\mathrm{VaR}}$, cf. Eq. (3.16).
This can be expressed slightly differently: a reasonable objective could be to maximize the value of the 'probable gain' $G_p$, such that the probability of earning more is equal to a certain probability $p$:10
$$\int_{G_p}^{+\infty} P(\Delta W)\,d\Delta W = p.$$
In the case where $P(\Delta W)$ is Gaussian, this amounts to maximizing $mT - \lambda\sqrt{DT}$, where $\lambda$ is related to $p$ in a simple manner. Now, one can show that it is impossible to construct a utility function such that, in the general case, different strategies can be ordered according to their probable gain $G_p$. Therefore, the concepts of loss probability, value-at-risk or probable gain cannot be accommodated naturally within the framework of utility functions. Still, the idea that the quantity which is of most concern and that should be optimized is the value-at-risk sounds perfectly rational. This is at least the conceptual choice that we make in the present monograph.

… other hand, precisely focuses on the tails. Extreme events are considered as the true source of risk, whereas the small fluctuations, which contribute to the 'centre' of the distributions (and to the volatility), can be seen as a background noise, inherent to the very activity of financial markets, but not relevant for risk assessment.
From a theoretical point of view, this definition of risk (based on extreme events) does not easily fit into the classical 'utility function' framework. The minimization of a loss probability rather assumes that there exist well-defined thresholds (possibly different for each operator) where the 'utility function' is discontinuous.11 The concept of 'value-at-risk', or probable gain, cannot be naturally dealt with by using utility functions.

7 But not of the whole 'history' of his wealth between $t = 0$ and $T$. One thus assumes that the operator is insensitive to what can happen between these two dates; this is not very realistic. One could however generalize the concept of utility function and deal with utility functionals $U(\{W(t)\}_{0 \le t \le T})$.
8 Even the idea that an operator would really optimize his expected utility, and not take decisions partly based on 'non-rational' arguments, is far from being obvious. On this point, see: M. Marsili, Y.-C. Zhang, Fluctuations around Nash equilibria in Game Theory, Physica A245, 181 (1997).
9 This comparison is actually meaningful, since it corresponds to comparing the reference wealth $w_0$ to the order of magnitude of the worst drawdown $D/m$, cf. Eq. (3.24).
10 Maximizing $G_p$ is thus equivalent to minimizing $\Lambda_{\mathrm{VaR}}$ such that $\mathcal{P}_{\mathrm{VaR}} = 1 - p$.
11 It is possible that the presence of these thresholds actually plays an important role in the fact that the price fluctuations are strongly non-Gaussian.
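For a Gaussian $P(\Delta W)$ of mean $mT$ and variance $DT$, the 'simple manner' in which $\lambda$ is related to $p$ can be made explicit: the condition that the probability of earning more than $G_p$ equals $p$ gives $G_p = mT - \lambda\sqrt{DT}$ with $\lambda = \Phi^{-1}(p)$, where $\Phi$ is the standard normal cumulative distribution. The sketch below assumes this Gaussian setting; the function name `probable_gain` is illustrative.

```python
import math
from statistics import NormalDist


def probable_gain(m: float, D: float, T: float, p: float) -> float:
    """Probable gain G_p for a Gaussian wealth change.

    Delta W ~ N(m*T, D*T) and P(Delta W > G_p) = p, hence
    G_p = m*T - lambda * sqrt(D*T) with lambda = Phi^{-1}(p).
    """
    lam = NormalDist().inv_cdf(p)
    return m * T - lam * math.sqrt(D * T)


if __name__ == "__main__":
    # With p = 0.95 the operator earns more than G_p in 95% of cases;
    # G_p is negative here, i.e. it is (minus) a value-at-risk.
    print(probable_gain(m=0.1, D=0.04, T=1.0, p=0.95))
```

Maximizing `probable_gain` at fixed `p` is the same as maximizing the risk-corrected return $m_\lambda$ for the corresponding $\lambda$.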
3.2 Portfolios of uncorrelated assets

The aim of this section is to explain, in the very simple case where all assets that can be mixed in a portfolio are uncorrelated, how the trade-off between risk and return can be dealt with. (The case where some correlations between the asset fluctuations exist will be considered in the next section.) One thus considers a set of $M$ different risky assets $X_i$, $i = 1, \ldots, M$ and one riskless asset $X_0$. The number of asset $i$ in the portfolio is $n_i$, and its present value is $x_i^0$. If the total wealth to be invested in the portfolio is $W$, then the $n_i$'s are constrained to be such that $\sum_{i=0}^{M} n_i x_i^0 = W$. We shall rather use the weight of asset $i$ in the portfolio, defined as $p_i = n_i x_i^0/W$, which therefore must be normalized to one: $\sum_{i=0}^{M} p_i = 1$. The $p_i$'s can be negative (short positions). The value of the portfolio at time $T$ is given by $S = \sum_{i=0}^{M} n_i x_i(T) = W\sum_{i=0}^{M} p_i x_i(T)/x_i^0$. In the following, we will set the initial wealth $W$ to 1, and redefine each asset $i$ in such a way that all initial prices are equal to $x_i^0 = 1$. (Therefore, the average return $m_i$ and variance $D_i$ that we will consider below must be understood as relative, rather than absolute.)

One furthermore assumes that the average return $m_i$ is known. This hypothesis is actually very strong, since it assumes for example that past returns can be used as estimators of future returns, i.e. that time series are to some extent stationary. However, this is very far from the truth: the life of a company (in particular high-tech ones) is very clearly non-stationary; a whole sector of activity can be booming or collapsing, depending upon global factors, not graspable within a purely statistical framework. Furthermore, the markets themselves evolve with time, and it is clear that some statistical parameters do depend on time, and have significantly shifted over the past 20 years. This means that the empirical determination of the average return is difficult: volatilities are such that at least several years are needed to obtain a reasonable signal-to-noise ratio; this time must indeed be large compared to the 'security time' $\hat{T}$. But as discussed above, several years is also the time scale over which the intrinsically non-stationary nature of the markets starts being important.

One should thus rather understand $m_i$ as an 'expected' (or anticipated) future return, which includes some extra information (or intuition) available to the investor. These $m_i$'s can therefore vary from one investor to the next. The relevant …

… to be, to a certain extent, predictive for future price fluctuations. It however sometimes happens that the correlations between two assets change abruptly.

3.2.1 Uncorrelated Gaussian assets

Let us suppose that the variation of the value of the $i$th asset $X_i$ over the time interval $T$ is Gaussian, centred around $m_i T$ and of variance $D_i T$. The portfolio $\mathbf{p} = \{p_0, p_1, \ldots, p_M\}$ as a whole also obeys Gaussian statistics (since the Gaussian is stable). The average return $m_p$ of the portfolio is given by:
$$m_p = \sum_{i=0}^{M} p_i m_i = m_0 + \sum_{i=1}^{M} p_i (m_i - m_0),$$
where we have used the constraint $\sum_{i=0}^{M} p_i = 1$ to introduce the excess return $m_i - m_0$, as compared to the risk-free asset ($i = 0$). If the $X_i$'s are all independent, the total variance of the portfolio is given by:
$$D_p = \sum_{i=1}^{M} p_i^2 D_i$$
(since $D_0$ is zero by assumption). If one tries to minimize the variance without any constraint on the average return $m_p$, one obviously finds the trivial solution where all the weight is concentrated on the risk-free asset:
$$p_i = 0 \quad (i \neq 0); \qquad p_0 = 1.$$
On the opposite, the maximization of the return without any risk constraint leads to a full concentration of the portfolio on the asset with the highest return.
More realistically, one can look for a trade-off between risk and return, by imposing a certain average return $m_p$, and by looking for the less risky portfolio (for Gaussian assets, risk and variance are identical). This can be achieved by introducing a Lagrange multiplier $\zeta$ in order to enforce the constraint on the average return:
$$\left.\frac{\partial (D_p - \zeta m_p)}{\partial p_i}\right|_{p_i = p_i^*} = 0 \quad (i \neq 0).$$
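As an illustration of the Lagrange-multiplier construction for uncorrelated Gaussian assets, the stationarity condition gives $2 p_i^* D_i = \zeta (m_i - m_0)$, and $\zeta$ follows from the imposed return. The sketch below is a minimal implementation under that assumption; the function name and the sample numbers are hypothetical and not taken from the book.

```python
def markowitz_uncorrelated(m, D, m0, target):
    """Minimum-variance weights for uncorrelated assets at fixed return.

    m, D   : average returns and variances of the risky assets (i = 1..M)
    m0     : return of the risk-free asset
    target : imposed average portfolio return m_p
    Returns (p0, weights, variance), with p0 the risk-free weight.
    Assumes at least one asset has a non-zero excess return.
    """
    excess = [mi - m0 for mi in m]
    # m_p - m0 = (zeta / 2) * sum_i (m_i - m0)^2 / D_i fixes zeta.
    s = sum(e * e / d for e, d in zip(excess, D))
    zeta = 2.0 * (target - m0) / s
    weights = [zeta * e / (2.0 * d) for e, d in zip(excess, D)]
    variance = sum(p * p * d for p, d in zip(weights, D))
    return 1.0 - sum(weights), weights, variance


if __name__ == "__main__":
    p0, w, var = markowitz_uncorrelated(
        m=[0.10, 0.20], D=[0.04, 0.16], m0=0.02, target=0.10
    )
    print(p0, w, var)
```

With these conventions the minimal variance is $(m_p - m_0)^2 / \sum_i (m_i - m_0)^2/D_i$, i.e. a parabola in the return/variance plane.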
Fig. 3.6. 'Efficient frontier' in the return/risk plane ($m_p$, $D_p$). In the absence of constraints, this line is a parabola (dark line). If some constraints are imposed (for example, that all the weights $p_i$ should be positive), the boundary moves downwards (dotted line).

…
$$p_i^* = \frac{\zeta (m_i - m_0)}{2 D_i} \quad (i \neq 0),$$
and the equation for $\zeta$:
$$m_p - m_0 = \frac{\zeta}{2}\sum_{i=1}^{M}\frac{(m_i - m_0)^2}{D_i}.$$
The least risky portfolio corresponds to the one such that $\zeta = 0$ (no constraint on …

… Its total variance is given by $D_p^* = 1/Z$. If all the $D_i$'s are of the same order of magnitude, one has $Z \sim M/D$; therefore, one finds the result, expected from the CLT, that the variance of the portfolio is $M$ times smaller than the variance of the individual assets.
In practice, one often adds extra constraints to the weights $p_i^*$ in the form of linear inequalities, such as $p_i^* \geq 0$ (no short positions). The solution is then more involved, but is still unique. Geometrically, this amounts to looking for the restriction of a paraboloid to a hyperplane, which remains a paraboloid. The efficient border is then shifted downwards (Fig. 3.6). A much richer case is when the constraint is non-linear. For example, on futures markets, margin calls require that a certain amount of money is left as a deposit, whether the position is long ($p_i > 0$) or short ($p_i < 0$). One can then impose a leverage constraint, such that $\sum_{i=1}^{M} |p_i| = f$, where $f$ is the fraction of wealth invested as a deposit. This constraint leads to a much more complex problem, similar to the one encountered in hard optimization problems, where an exponentially large (in $M$) number of quasi-degenerate solutions can be found.12

12 On this point, see S. Galluccio, J.-P. Bouchaud, M. Potters, Portfolio optimisation, spin-glasses and random matrix theory, Physica A259, 449 (1998), and references therein.
Fig. 3.7. Example of a standard efficient border $\zeta'' = 0$ (thick line) with four risky assets. If one imposes that the effective number of assets is equal to 2, one finds the sub-efficient border drawn in dotted line, which touches the efficient border at $r_1$, $r_2$. The inset shows the effective asset number of the unconstrained optimal portfolio ($\zeta'' = 0$) as a function of average return. The optimal portfolio satisfying $M_{\mathrm{eff}} \geq 2$ is therefore given by the standard portfolio for returns between $r_1$ and $r_2$ and by the $M_{\mathrm{eff}} = 2$ portfolios otherwise.

… $Y_2$. The equation for $p_i^*$ then becomes:

An example of the modified efficient border is given in Figure 3.7.
More generally, one could have considered the quantity $Y_q$ defined as:
$$Y_q = \sum_{i=1}^{M} (p_i^*)^q,$$
and used it to define the effective number of assets via $Y_q = M_{\mathrm{eff}}^{1-q}$. It is interesting to note that the quantity of missing information (or entropy) $\mathcal{I}$ associated to the very choice of the $p_i^*$'s is related to $Y_q$ when $q \to 1$. Indeed, one has:
$$\mathcal{I} = -\sum_{i=1}^{M} p_i^* \log p_i^* = -\left.\frac{\partial Y_q}{\partial q}\right|_{q=1}.$$
Approximating $Y_q$ as a function of $q$ by a straight line thus leads to $\mathcal{I} \simeq 1 - Y_2$.

3.2.2 Uncorrelated 'power-law' assets

As we have already underlined, the tails of the distributions are often non-Gaussian. In this case, the minimization of the variance is not necessarily equivalent to an optimal control of the large fluctuations. The case where these distribution tails are power-laws is interesting because one can then explicitly solve the problem of the minimization of the value-at-risk of the full portfolio. Let us thus assume that the fluctuations of each asset $X_i$ are described, in the region of large losses, by a probability density that decays as a power-law:
$$P_T(\delta x_i) \simeq \frac{\mu A_i^{\mu}}{|\delta x_i|^{1+\mu}} \quad (\delta x_i \to -\infty),$$
with an arbitrary exponent $\mu$, restricted however to be larger than 1, such that the average return is well defined. (The distinction between the cases $\mu < 2$, for which the variance diverges, and $\mu > 2$ will be considered below.) The coefficient $A_i$ provides an order of magnitude for the extreme losses associated with the asset $i$ (cf. Eq. (3.14)).
As we have mentioned in Section 1.5.2, the power-law tails are interesting because they are stable upon addition: the tail amplitudes $A_i^{\mu}$ (that generalize the variance) simply add to describe the far-tail of the distribution of the sum. Using the results of Appendix C, one can show that if the asset $X_i$ is characterized by a tail amplitude $A_i^{\mu}$, the quantity $p_i X_i$ has a tail amplitude equal to $p_i^{\mu} A_i^{\mu}$. The tail amplitude of the global portfolio $p$ is thus given by:
$$A_p^{\mu} = \sum_{i=1}^{M} p_i^{\mu} A_i^{\mu},$$
and the probability that the loss exceeds a certain level $\Lambda$ is given by $\mathcal{P} = A_p^{\mu}/\Lambda^{\mu}$. Hence, independently of the chosen loss level $\Lambda$, the minimization of the loss probability $\mathcal{P}$ requires the minimization of the tail amplitude $A_p^{\mu}$; the optimal portfolio is therefore independent of $\Lambda$. (This is not true in general: see Section 3.2.4.) The minimization of $A_p^{\mu}$ for a fixed average return $m_p$ leads to the following equations (valid if $\mu > 1$):
$$\mu\, p_i^{*\,\mu-1} A_i^{\mu} = \zeta (m_i - m_0),$$
with an equation to fix $\zeta$:
$$\left(\frac{\zeta}{\mu}\right)^{\frac{1}{\mu-1}} \sum_{i=1}^{M} \frac{(m_i - m_0)^{\frac{\mu}{\mu-1}}}{A_i^{\frac{\mu}{\mu-1}}} = m_p - m_0.$$
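For power-law assets at fixed average return, the stationarity condition $\mu\,p_i^{\mu-1}A_i^{\mu} = \zeta(m_i - m_0)$ can be solved explicitly. The following sketch assumes positive excess returns and $\mu > 1$, and uses the Lagrangian convention $A_p^{\mu} - \zeta m_p$; the function names are illustrative.

```python
def power_law_weights(excess, amplitudes, mu, target_excess):
    """Weights minimizing the tail amplitude A_p^mu at fixed excess return.

    excess        : excess returns m_i - m0 (assumed positive)
    amplitudes    : tail amplitudes A_i of the risky assets
    mu            : common tail exponent (mu > 1)
    target_excess : imposed m_p - m0
    """
    k = 1.0 / (mu - 1.0)
    # p_i = (zeta * e_i / (mu * A_i^mu))^k; factor out zeta^k as 'scale'.
    base = [(e / (mu * a**mu)) ** k for e, a in zip(excess, amplitudes)]
    scale = target_excess / sum(b * e for b, e in zip(base, excess))
    return [scale * b for b in base]


def tail_amplitude(weights, amplitudes, mu):
    """Tail amplitude A_p^mu = sum_i p_i^mu A_i^mu of the portfolio."""
    return sum((p * a) ** mu for p, a in zip(weights, amplitudes))


if __name__ == "__main__":
    w = power_law_weights([0.08, 0.18], [0.2, 0.4], mu=1.5, target_excess=0.08)
    print(w, tail_amplitude(w, [0.2, 0.4], 1.5))
```

For $\mu = 2$ the weights reduce to the Gaussian result with $D_i$ replaced by $A_i^2$, and doubling the imposed excess return multiplies the tail amplitude by $2^{\mu}$.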
The optimal loss probability is then given by:
$$\mathcal{P}^* = \frac{1}{\Lambda^{\mu}}\sum_{i=1}^{M} p_i^{*\,\mu} A_i^{\mu}.$$
Therefore, the concept of 'efficient border' is still valid in this case: in the plane return/probability of loss, it is similar to the dotted line of Figure 3.6. Eliminating $\zeta$ from the above two equations, one finds that the shape of this line is given by $\mathcal{P}^* \propto (m_p - m_0)^{\mu}$. The parabola is recovered in the limit $\mu = 2$.
In the case where the risk-free asset cannot be included in the portfolio, the optimal portfolio which minimizes extreme risks with no constraint on the average return is given by:
$$p_i^* = \frac{1}{Z}\, A_i^{-\frac{\mu}{\mu-1}}, \qquad Z = \sum_{j=1}^{M} A_j^{-\frac{\mu}{\mu-1}},$$
and the corresponding loss probability is equal to:
$$\mathcal{P}^* = \frac{Z^{1-\mu}}{\Lambda^{\mu}}.$$
… is a factor $M^{\mu-1}$ smaller than the individual probability of loss.
Note again that this result is only valid if $\mu > 1$. If $\mu < 1$, one finds that the risk increases with the number of assets $M$. In this case, when the number of assets is increased, the probability of an unfavourable event also increases: indeed, for $\mu < 1$ this largest event is so large that it dominates over all the others. The minimization of risk in this case leads to $p_{i_{\min}} = 1$, where $i_{\min}$ is the least risky asset, in the sense that $A_{i_{\min}}^{\mu} = \min_j \{A_j^{\mu}\}$.
One should now distinguish the cases $\mu < 2$ and $\mu > 2$. Despite the fact that the asymptotic power-law behaviour is stable under addition for all values of $\mu$, the tail is progressively 'eaten up' by the centre of the distribution for $\mu > 2$, since the CLT applies. Only when $\mu < 2$ does this tail remain untouched. We thus again recover the arguments of Section 1.6.4, already encountered when we discussed the time dependence of the VaR. One should therefore distinguish two cases: if $D_p$ is the variance of the portfolio $p$ (which is finite if $\mu > 2$), the distribution of the changes $\delta S$ of the value of the portfolio $p$ is approximately Gaussian if $|\delta S| \leq \sqrt{D_p T \log(M)}$, and becomes a power-law with a tail amplitude given by $A_p^{\mu}$ beyond this point. The question is thus whether the loss level $\Lambda$ that one wishes to control is smaller or larger than this crossover value:
* If $\Lambda \ll \sqrt{D_p T \log(M)}$, the minimization of the VaR becomes equivalent to the minimization of the variance, and one recovers the Markowitz procedure explained in the previous paragraph in the case of Gaussian assets.
* If on the contrary $\Lambda \gg \sqrt{D_p T \log(M)}$, then the formulae established in the present section are valid even when $\mu > 2$.
Note that the growth of $\sqrt{\log(M)}$ with $M$ is so slow that the Gaussian CLT is not of great help in the present case. The optimal portfolios in terms of the VaR are not those with the minimal variance, and vice versa.

3.2.3 'Exponential' assets

Suppose now that the distribution of price variations is a symmetric exponential around a zero mean value ($m_i = 0$):
$$P(\delta x_i) = \frac{\alpha_i}{2}\exp[-\alpha_i |\delta x_i|], \qquad (3.54)$$
where $\alpha_i^{-1}$ gives the order of magnitude of the fluctuations of $X_i$ (more precisely, $\sqrt{2}/\alpha_i$ is the RMS of the fluctuations). The variations of the full portfolio $p$, defined as $\delta S = \sum_{i=1}^{M} p_i \delta x_i$, are distributed according to:
$$P(\delta S) = \int \frac{dz}{2\pi}\,\frac{\exp(iz\,\delta S)}{\prod_{i=1}^{M}\left[1 + (z p_i \alpha_i^{-1})^2\right]},$$
where we have used Eq. (1.50) and the fact that the Fourier transform of the exponential distribution Eq. (3.54) is given by:
$$\hat{P}(z) = \frac{1}{1 + (z\alpha^{-1})^2}.$$
Now, using the method of residues, it is simple to establish the following expression for $P(\delta S)$ (valid whenever the $\alpha_i/p_i$ are all different):
$$P(\delta S) = \frac{1}{2}\sum_{i=1}^{M} \frac{\alpha_i}{p_i}\left[\prod_{j \neq i}\frac{1}{1 - \left(\frac{p_j \alpha_i}{p_i \alpha_j}\right)^2}\right]\exp\left[-\frac{\alpha_i}{p_i}|\delta S|\right].$$
The probability for extreme losses is thus equal to:
$$\mathcal{P}(\delta S < -\Lambda) \simeq \frac{1}{2}\left[\prod_{j \neq i^*}\frac{1}{1 - \left(\frac{p_j \alpha^*}{\alpha_j}\right)^2}\right]\exp[-\alpha^* \Lambda],$$
where $\alpha^*$ is equal to the smallest of all ratios $\alpha_i/p_i$, and $i^*$ the corresponding value of $i$. The order of magnitude of the extreme losses is therefore given by $1/\alpha^*$. This is then the quantity to be minimized in a value-at-risk context. This amounts to choosing the $p_i$'s such that $\min_i \{\alpha_i/p_i\}$ is as large as possible.
This minimization problem can be solved using the following trick. One can
write formally that:
$$\frac{1}{\alpha^*} = \max_i\left\{\frac{p_i}{\alpha_i}\right\} = \lim_{\mu \to \infty}\left(\sum_{i=1}^{M}\frac{p_i^{\mu}}{\alpha_i^{\mu}}\right)^{1/\mu}.$$
This equality is true because in the limit $\mu \to \infty$, the largest term of the sum dominates over all the others. (The choice of the notation $\mu$ is on purpose: see below.) For $\mu$ large but fixed, one can perform the minimization with respect to the $p_i$'s, using a Lagrange multiplier to enforce normalization. One finds that:
$$p_i^{*\,\mu-1} \propto \alpha_i^{\mu}.$$
In the limit $\mu \to \infty$, and imposing $\sum_{i=1}^{M} p_i = 1$, one finally obtains:
$$p_i^* = \frac{\alpha_i}{\sum_{j=1}^{M}\alpha_j},$$
which leads to $\alpha^* = \sum_{i=1}^{M}\alpha_i$. In this case, however, all $\alpha_i/p_i^*$ are equal to $\alpha^*$ and the result Eq. (3.57) must be slightly altered. However, the asymptotic exponential fall-off, governed by $\alpha^*$, is still true (up to polynomial corrections: cf. Section 1.6.4). One again finds that if all the $\alpha_i$'s are comparable, the potential losses, measured through $1/\alpha^*$, are divided by a factor $M$.
Note that the optimal weights are such that $p_i^* \propto \alpha_i$ if one tries to minimize the probability of extreme losses, whereas one would have found $p_i^* \propto \alpha_i^2$ if the goal was to minimize the variance of the portfolio, corresponding to $\mu = 2$ (cf. Eq. (3.42)). In other words, this is an explicit example where one can see that minimizing the variance actually increases the value-at-risk.
Formally, as we have noticed in Section 1.3.4, the exponential distribution corresponds to the limit $\mu \to \infty$ of a power-law distribution: an exponential decay is indeed more rapid than any power-law. Technically, we have indeed established in this section that the minimization of extreme risks in the exponential case is identical to the one obtained in the previous section in the limit $\mu \to \infty$ (see Eq. (3.52)).

3.2.4 General case: optimal portfolio and VaR (*)

In all of the cases treated above, the optimal portfolio is found to be independent of the chosen loss level $\Lambda$. For example, in the case of assets with power-law tails, the minimization of the loss probability amounts to minimizing the tail amplitude $A_p^{\mu}$, independently of $\Lambda$. This property is however not true in general, and the optimal portfolio does indeed depend on the risk level $\Lambda$, or, equivalently, on the temporal horizon over which risk must be 'tamed'. Let us for example consider the case where all assets are power-law distributed, but with a tail index $\mu_i$ that depends on the asset $X_i$. The probability that the portfolio $p$ experiences a loss greater than $\Lambda$ is given, for large values of $\Lambda$, by:13
$$\mathcal{P}(\delta S < -\Lambda) = \sum_{i=1}^{M} p_i^{\mu_i}\frac{A_i^{\mu_i}}{\Lambda^{\mu_i}}.$$
Looking for the set of $p_i$'s which minimizes the above expression (without constraint on the average return) then leads to:
$$p_i^* = \left(\frac{\zeta' \Lambda^{\mu_i}}{\mu_i A_i^{\mu_i}}\right)^{\frac{1}{\mu_i - 1}},$$
with $\zeta'$ fixed by the normalization $\sum_{i=1}^{M} p_i^* = 1$.
This example shows that in the general case, the weights $p_i^*$ explicitly depend on the risk level $\Lambda$. If all the $\mu_i$'s are equal, $\Lambda^{\mu}$ factors out and disappears from the $p_i$'s.
Another interesting case is that of weakly non-Gaussian assets, such that the first correction to the Gaussian distribution (proportional to the kurtosis $\kappa_i$ of the asset $X_i$) is enough to describe faithfully the non-Gaussian effects. The variance of the full portfolio is given by $D_p = \sum_{i=1}^{M} p_i^2 D_i$ while the kurtosis is equal to: $\kappa_p = \sum_{i=1}^{M} p_i^4 D_i^2 \kappa_i / D_p^2$. The probability that the portfolio plummets by an amount larger than $\Lambda$ is therefore given by:

where $\mathcal{P}_{G>}$ is related to the error function (cf. Section 1.6.3) and

To first order in kurtosis, one thus finds that the optimal weights $p_i^*$ (without fixing the average return) are given by:

where $\tilde{h}$ is another function, positive for large arguments, and $\zeta'$ is fixed by the condition $\sum_{i=1}^{M} p_i = 1$. Hence, the optimal weights do depend on the risk level $\Lambda$, via the kurtosis of the distribution. Furthermore, as could be expected, the minimization of extreme risks leads to a reduction of the weights of the assets with a large kurtosis $\kappa_i$.

13 The following expression is valid only when the subleading corrections to Eq. (3.47) can safely be neglected.

3.3 Portfolios of correlated assets

The aim of the previous section was to introduce, in a somewhat simplified context, the most important ideas underlying portfolio optimization, in a Gaussian world (where the variance is minimized) or in a non-Gaussian world (where
the quantity of interest is the value-at-risk). In reality, the fluctuations of the different assets are often strongly correlated (or anti-correlated). For example, an increase of short-term interest rates often leads to a drop in share prices. All the stocks of the New York Stock Exchange behave, to a certain extent, similarly. These correlations of course modify completely the composition of the optimal portfolios, and actually make diversification more difficult. In a sense, the number of effectively independent assets is decreased from the true number of assets $M$.

3.3.1 Correlated Gaussian fluctuations

Let us first consider the case where all the fluctuations $\delta x_i$ of the assets $X_i$ are Gaussian, but with arbitrary correlations. These correlations are described in terms of a (symmetric) correlation matrix $C_{ij}$, defined as:
$$C_{ij} = \langle \delta x_i \delta x_j\rangle - m_i m_j.$$
This means that the joint distribution of all the fluctuations $\delta x_1, \delta x_2, \ldots, \delta x_M$ is given by:
$$P(\delta x_1, \delta x_2, \ldots, \delta x_M) \propto \exp\left[-\frac{1}{2}\sum_{ij}(\delta x_i - m_i)(C^{-1})_{ij}(\delta x_j - m_j)\right], \qquad (3.68)$$
… that the $D_a$'s are the eigenvalues of the matrix $C_{ij}$, whereas $O$ is the orthogonal matrix allowing one to go from the set of assets $i$'s to the set of explicative factors $e_a$.
The fluctuations $\delta S$ of the global portfolio $p$ are then also Gaussian (since $\delta S$ is a weighted sum of the Gaussian variables $e_a$), of mean $m_p = \sum_{i=1}^{M} p_i (m_i - m_0) + m_0$ and variance:
$$D_p = \sum_{i,j=1}^{M} p_i p_j C_{ij}.$$
The minimization of $D_p$ for a fixed value of the average return $m_p$ (and with the possibility of including the risk-free asset $X_0$) leads to an equation generalizing Eq. (3.38):
$$2\sum_{j=1}^{M} C_{ij}\, p_j^* = \zeta (m_i - m_0),$$
which can be inverted as:
$$p_i^* = \frac{\zeta}{2}\sum_{j=1}^{M} C^{-1}_{ij}(m_j - m_0).$$
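The inverted equation $p_i^* = (\zeta/2)\sum_j C^{-1}_{ij}(m_j - m_0)$ can be evaluated without forming $C^{-1}$ explicitly, by solving the linear system $C y = m - m_0$. The sketch below does this in plain Python with a small Gaussian-elimination helper; both function names are illustrative, and $\zeta$ is fixed by the imposed return as in the uncorrelated case.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def markowitz_correlated(m, cov, m0, target):
    """Weights p* = (zeta/2) C^{-1} (m - m0) reaching the target return.

    Returns (p0, weights, zeta), with p0 the weight of the risk-free asset.
    """
    excess = [mi - m0 for mi in m]
    y = solve_linear(cov, excess)            # y = C^{-1} (m - m0)
    s = sum(e * yi for e, yi in zip(excess, y))
    zeta = 2.0 * (target - m0) / s           # from m_p - m0 = (zeta/2) s
    weights = [0.5 * zeta * yi for yi in y]
    return 1.0 - sum(weights), weights, zeta


if __name__ == "__main__":
    cov = [[0.04, 0.02], [0.02, 0.16]]
    print(markowitz_correlated([0.10, 0.20], cov, m0=0.02, target=0.10))
```

For a diagonal matrix $C_{ij} = D_i\delta_{ij}$ the result reduces to the uncorrelated weights $p_i^* = \zeta(m_i - m_0)/(2D_i)$.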
The minimization of large risks also requires, as in the Gaussian case detailed above, the knowledge of the correlations between large events, and therefore an adapted measure of these correlations. Now, in the extreme case $\mu < 2$, the covariance (as the variance) is infinite. A suitable generalization of the covariance is therefore necessary. Even in the case $\mu > 2$, where this covariance is a priori finite, the value of this covariance is a mix of the correlations between large negative moves, large positive moves, and all the 'central' (i.e. not so large) events. Again, the definition of a 'tail covariance', directly sensitive to the large negative events, is needed. The aim of the present section is to define such a quantity, which is a natural generalization of the covariance for power-law distributions, much as the 'tail amplitude' is a generalization of the variance. In a second part, the minimization of the value-at-risk of a portfolio will be discussed.

'Tail covariance'

Let us again assume that the $\delta x_i$'s are distributed according to:
$$P(\delta x_i) \simeq \frac{\mu A_i^{\mu}}{|\delta x_i|^{1+\mu}} \quad (\delta x_i \to \pm\infty).$$
A natural way to describe the correlations between large events is to generalize the decomposition in independent factors used in the Gaussian case, Eq. (3.69), and to write:
$$\delta x_i = m_i + \sum_{a=1}^{M} O_{ia}\, e_a, \qquad (3.80)$$
where the $e_a$ are independent power-law random variables, the distribution of which is:
$$P(e_a) \simeq \frac{\mu A_a^{\mu}}{|e_a|^{1+\mu}} \quad (e_a \to \pm\infty).$$
Since power-law variables are (asymptotically) stable under addition, the decomposition Eq. (3.80) indeed leads for all $\mu$ to correlated power-law variables $\delta x_i$.
The usual definition of the covariance is related to the average value of $\delta x_i \delta x_j$, which can in some cases be divergent (i.e. when $\mu < 2$). The idea is then to study directly the characteristic function of the product variable $\pi_{ij} = \delta x_i \delta x_j$.14 The …

… On the contrary, the quantity $\pi_{aa}$ is distributed as a power-law with an exponent $\mu/2$ (cf. Appendix C):

Hence, the variable $\pi_{ij}$ gets both 'non-diagonal' contributions $\pi_{ab}$ and 'diagonal' ones $\pi_{aa}$. For $\pi_{ij} \to \infty$, however, only the latter survive. Therefore $\pi_{ij}$ is a power-law variable of exponent $\mu/2$, with a tail amplitude that we will note $A_{ij}^{\mu/2}$, and an asymmetry coefficient $\beta_{ij}$ (see Section 1.3.3). Using the additivity of the tail amplitudes, and the results of Appendix C, one finds:
$$A_{ij}^{\mu/2} = \sum_{a=1}^{M} |O_{ia} O_{ja}|^{\mu/2} A_a^{\mu},$$
and
$$C_{ij}^{\mu/2} \equiv \beta_{ij} A_{ij}^{\mu/2} = \sum_{a=1}^{M} \mathrm{sign}(O_{ia} O_{ja})\,|O_{ia} O_{ja}|^{\mu/2} A_a^{\mu}.$$
In the limit $\mu = 2$, one sees that the quantity $C_{ij}^{\mu/2}$ reduces to the standard covariance, provided one identifies $A_a^2$ with the variance of the explicative factors $D_a$. This suggests that $C_{ij}^{\mu/2}$ is the suitable generalization of the covariance for power-law variables, which is constructed using extreme events only. The expression Eq. (3.85) furthermore shows that the matrix $\tilde{O}_{ia} = \mathrm{sign}(O_{ia})|O_{ia}|^{\mu/2}$ allows one to diagonalize the 'tail covariance matrix' $C_{ij}^{\mu/2}$;15 its eigenvalues are given by the $A_a^{\mu}$'s.
In summary, the tail covariance matrix $C_{ij}^{\mu/2}$ is obtained by studying the asymptotic behaviour of the product variable $\delta x_i \delta x_j$, which is a power-law variable of exponent $\mu/2$. The product of its tail amplitude and of its asymmetry coefficient is our definition of the tail covariance $C_{ij}^{\mu/2}$.

14 Other generalizations of the covariance have been proposed in the context of Lévy processes, such as the 'covariation' [Samorodnitsky and Taqqu].
15 Note that $\tilde{O} = O$ for $\mu = 2$.
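The tail covariance can be assembled directly from the factor decomposition: $C_{ij}^{\mu/2} = \sum_a \tilde{O}_{ia}\tilde{O}_{ja}A_a^{\mu}$ with $\tilde{O}_{ia} = \mathrm{sign}(O_{ia})|O_{ia}|^{\mu/2}$. The sketch below is a minimal illustration, assuming the matrix `o` of factor loadings and the factor tail amplitudes are given (in practice they would have to be estimated); the function name is hypothetical.

```python
import math


def tail_covariance(o, tail_amplitudes, mu):
    """Tail covariance matrix built from the factor decomposition.

    o[i][a]            : loading of asset i on factor a
    tail_amplitudes[a] : tail amplitude A_a of factor a
    mu                 : tail exponent of the factors
    Each entry is sum_a sign(O_ia O_ja) |O_ia O_ja|^(mu/2) A_a^mu.
    """
    n = len(o)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for a, amp in enumerate(tail_amplitudes):
                prod = o[i][a] * o[j][a]
                # copysign restores the sign removed by abs().
                c[i][j] += math.copysign(abs(prod) ** (mu / 2.0), prod) * amp**mu
    return c


if __name__ == "__main__":
    theta = 0.3
    rotation = [[math.cos(theta), -math.sin(theta)],
                [math.sin(theta), math.cos(theta)]]
    print(tail_covariance(rotation, [1.0, 2.0], mu=1.5))
```

Setting `mu=2` returns the ordinary covariance $\sum_a O_{ia}O_{ja}A_a^2$, which is the limit mentioned in the text.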
Optimal portfolio

It is now possible to find the optimal portfolio that minimizes the loss probability. The fluctuations of the portfolio $p$ can indeed be written as:
$$\delta S \equiv \sum_{i=1}^{M} p_i\,\delta x_i = \sum_{a=1}^{M}\left(\sum_{i=1}^{M} p_i O_{ia}\right) e_a.$$
Due to our assumption that the $e_a$ are symmetric power-law variables, $\delta S$ is also a symmetric power-law variable with a tail amplitude given by:
$$A_p^{\mu} = \sum_{a=1}^{M}\left|\sum_{i=1}^{M} p_i O_{ia}\right|^{\mu} A_a^{\mu}.$$
In the simple case where one tries to minimize the tail amplitude $A_p^{\mu}$ without any constraint on the average return, one then finds:16
$$\sum_{a=1}^{M} O_{ia} A_a^{\mu} V_a^* = \frac{\zeta'}{\mu},$$
where the vector $V_a^* = \mathrm{sign}\left(\sum_{j} O_{ja} p_j^*\right)\left|\sum_{j} O_{ja} p_j^*\right|^{\mu-1}$. Once the tail covariance matrix $C^{\mu/2}$ is known, the optimal weights $p_i^*$ are thus determined as follows:
(i) The diagonalization of $C^{\mu/2}$ gives the rotation matrix $\tilde{O}$, and therefore one can construct the matrix $O = \mathrm{sign}(\tilde{O})|\tilde{O}|^{2/\mu}$ (understood for each element of the matrix), and the diagonal matrix $A$.
(ii) The matrix $OA$ is then inverted, and applied to the vector $(\zeta'/\mu)\vec{1}$. This gives the vector $\vec{V}^* = (\zeta'/\mu)(OA)^{-1}\vec{1}$.
(iii) From the vector $\vec{V}^*$ one can then obtain the weights $p_i^*$ by applying the matrix inverse of $O^{\dagger}$ to the vector $\mathrm{sign}(\vec{V}^*)|\vec{V}^*|^{1/(\mu-1)}$.
(iv) The Lagrange multiplier $\zeta'$ is then determined such as $\sum_{i=1}^{M} p_i^* = 1$.
Rather symbolically, one thus has:
$$\vec{p}^{\,*} = (O^{\dagger})^{-1}\left[\frac{\zeta'}{\mu}(OA)^{-1}\vec{1}\right]^{\frac{1}{\mu-1}}.$$
In the case $\mu = 2$, one recovers the previous prescription, since in that case: $OA = OD = CO$, $(OA)^{-1} = O^{-1}C^{-1}$, and $O^{\dagger} = O^{-1}$, from which one gets:
$$\vec{p}^{\,*} = \frac{\zeta'}{2}\,O O^{-1} C^{-1}\vec{1} = \frac{\zeta'}{2}\,C^{-1}\vec{1},$$
which coincides with Eq. (3.74).
The problem of the minimization of the value-at-risk in a strongly non-Gaussian world, where distributions behave asymptotically as power-laws, and in the presence of correlations between tail events, can thus be solved using a procedure rather similar to the one proposed by Markowitz for Gaussian assets.

16 If one wants to impose a non-zero average return, one should replace $\zeta'/\mu$ by $\zeta'/\mu + \zeta m_i/\mu$, where $\zeta$ is determined such as to give to $m_p$ the required value.

3.4 Optimized trading (*)

In this section, we will discuss a slightly different problem of portfolio optimization: how to dynamically optimize, as a function of time, the number of shares of a given stock that one holds in order to minimize the risk for a fixed level of return. We shall actually encounter a similar problem in the next chapter on options when the question of the optimal hedging strategy will be addressed. In fact, much of the notations and techniques of the present section are borrowed from Chapter 4. The optimized strategy found below shows that in order to minimize the variance, the time-dependent part of the optimal strategy consists in selling when the price goes up and buying when it goes down. However, this strategy increases the probability of very large losses!
We will suppose that the trader holds a certain number of shares $\phi_n(x_n)$, where $\phi$ depends both on the (discrete) time $t_n = n\tau$ and on the price of the stock $x_n$. For the time being, we assume that the interest rates are negligible and that the change of wealth is given by:
$$\Delta W_X = \sum_{k=0}^{N-1}\phi_k(x_k)\,\delta x_k, \qquad (3.91)$$
where $\delta x_k = x_{k+1} - x_k$. (See Sections 4.1 and 4.2 for a more detailed discussion of Eq. (3.91).) Let us define the gain $G = \langle\Delta W_X\rangle$ as the average final wealth, and the risk $R^2$ as the variance of the final wealth:
$$R^2 = \langle\Delta W_X^2\rangle - G^2.$$
The question is then to find the optimal trading strategy $\phi_k^*(x_k)$, such that the risk $R$ is minimized, for a given value of $G$. Introducing a Lagrange multiplier $\zeta$, one thus looks for the (functional) solution of the following equation:17
$$\left.\frac{\delta\left[R^2 - \zeta G\right]}{\delta\phi_k(x)}\right|_{\phi_k = \phi_k^*} = 0.$$
Now, we further assume that the price increments $\delta x_n$ are independent random variables of mean $m\tau$ and variance $D\tau$. Introducing the notation $P(x, k|x_0, 0)$ for the price to be equal to $x$ at time $k\tau$, knowing that it is equal to $x_0$ at time $t = 0$,

17 The following equation results from a functional minimization of the risk. See Section 4.4.3 for further details.
one has: To first order, the equation on 4: reads.
( X ,e ) x  X I
where the factorization of the average values holds because of the absence of
correlations between the value of x k (and thus that of &), and the value of the
increment 6xk.The calculation of (AW;) is somewhat more involved, in particular
2

40
/ P(.T'. ~ ' I X ~ , O ) Pklx',
P ( x , kIx0, 01 ke
dx'
X X'
+ nn ~ ~ / / " P ( ~ ~ . Y I X ~ , O ) P ( ~ , ~ ~ ~ k
k=O e=o
2dx
, ~. ?) @ ~dx'( ~ ~ ~ $ ~ ( ~ )
I Therefore, one finally finds:
SlGoT Cho
Taking the derivative of P'  (9' with respect to Qk(x), one finds: 4kI (.x) =  (I  xo) .
D 20
I
This equation shows that in order to minimize the risk as measured by the variance,
a trader should sell stocks when their price increases, and buy more stocks when
+ mr P b . klxo. 0)
N 1
+k+l
/" P ( 2 . Y lx. k)&(xl)
x'  X
ek
dx '
1
their price decreases, proportionally to m(x  x o ) The value of
that:
is fixed such
Setting this expression to zero gives an implicit equation for the optimal strategy ! which leads to C1 = q (plus order 11: corrections). However. it can be shown that
4:. A solution to this equation can be found, for m small, as a power series in 177. this strategy increases the VaR. For example, if the increments are Gaussian, the
Looking for a reasonable return means that G should be of order nz. Therefore we left tail of the distribution of A Wx (corresponding to large losses), using the above
+< strategy, is found to be exponential, and therefore much broader than the Gaussian
set: G = GomT, with T = N s , and expand 4" and 1 as: I expected for a tuneindependent strategy:
and { = 50+  +Cl
rn2 nr
Inserting these expressions in Eq. (3.96) leads, to zeroth order, to a time
independent strategy:
1 3.5 Conclusion of the chapter
i1 In a Gaussian world, all measures of risk are equivalent. Minimizing the vaxiance or
the probability of large losses (or the valueatrisk) lead to the same result, which
The Lagrange multiplier Co is then fixed by the equation: '
is the family of optimal portfolios obtained by Markowitz. In the general case,
D however, the VaR minimization leads to portfolios which are different from those of
G = Gonr T = mTdO leading to Co =  and Cho = GO. (3.99) ''
Markowitz. These portfolios depend on the level of risk A (or on the time horizon
T i
4 Futures and options: fundamental concepts

Les personnes non averties sont sujettes à se laisser induire en erreur.1
(Lord Raglan, 'Le tabou de l'inceste', quoted by Boris Vian in L'automne à Pékin.)

4.1 Introduction

4.1.1 Aim of the chapter

The aim of this chapter is to introduce the general theory of derivative pricing in a simple and intuitive, but rather unconventional, way. The usual presentation, which can be found in all the available books on the subject,2 relies on particular models where it is possible to construct riskless hedging strategies, which replicate exactly the corresponding derivative product.3 Since the risk is strictly zero, there is no ambiguity in the price of the derivative: it is equal to the cost of the hedging strategy. In the general case, however, these 'perfect' strategies do not exist. Not surprisingly for the layman, zero risk is the exception rather than the rule. Correspondingly, a suitable theory must include risk as an essential feature, which one would like to minimize. The present chapter thus aims at developing simple methods to obtain optimal strategies, residual risks, and prices of derivative products, which take into account in an adequate way the peculiar statistical nature of financial markets, as described in Chapter 2.

1 Unwarned people may easily be fooled.
2 See e.g. [Hull, Wilmott, Baxter].
3 A hedging strategy is a trading strategy allowing one to reduce, and sometimes eliminate, the risk.

4.1.2 Trading strategies and efficient markets

In the previous chapters, we have insisted on the fact that if the detailed prediction of future market moves is probably impossible, its statistical description is a reasonable and useful idea, at least as a first approximation. This approach only relies on a certain degree of stability (in time) in the way markets behave and the prices evolve.4 Let us thus assume that one can determine (using a statistical analysis of past time series) the probability density $P(x, t|x_0, t_0)$, which gives the probability that the price of the asset X is equal to $x$ (to within $dx$) at time $t$, knowing that at a previous time $t_0$, the price was equal to $x_0$. As in previous chapters, we shall denote as $\langle O \rangle$ the average (over the 'historical' probability distribution) of a certain observable $O$:

$$\langle O(x, t) \rangle \equiv \int dx\, P(x, t|x_0, t_0)\, O(x, t).$$

As we have shown in Chapter 2, the price fluctuations are somewhat correlated for small time intervals (a few minutes), but become rapidly uncorrelated (but not necessarily independent!) on longer time scales. In the following, we shall choose as our elementary time scale $\tau$ an interval a few times larger than the correlation time, say $\tau = 30$ min on liquid markets. We shall thus assume that the correlations of price increments on two different intervals of size $\tau$ are negligible.5 When correlations are small, the information on future movements based on the study of past fluctuations is weak. In this case, no systematic trading strategy can be more profitable (in the long run) than holding the market index; of course, one can temporarily 'beat' the market through sheer luck. This property corresponds to the efficient market hypothesis.6

4 A weaker hypothesis is that the statistical 'texture' of the markets (i.e. the shape of the probability distributions) is stable over time, but that the parameters which fix the amplitude of the fluctuations can be time dependent, see Sections 1.7, 2.4 and 4.3.4.
5 The presence of small correlations on larger time scales is however difficult to exclude, in particular on the scale of several years, which might reflect economic cycles for example. Note also that the 'volatility' fluctuations exhibit long-range correlations, cf. Section 2.4.
6 The existence of successful systematic hedge funds, which have consistently produced returns higher than the average for several years, suggests that some sort of 'hidden' correlations do exist, at least on certain markets. But if they exist these correlations must be small and therefore are not relevant for our main concerns: risk control and option pricing.

It is interesting to translate this property into more formal terms. Let us suppose that at time $t_n = n\tau$, an investor has a portfolio containing, in particular, a quantity $\phi_n(x_n)$ of the asset X, a quantity which can depend on the price of the asset $x_n = x(t_n)$ at time $t_n$ (this strategy could actually depend on the price of the asset for all previous times: $\phi_n(x_n, x_{n-1}, x_{n-2}, \ldots)$). Between $t_n$ and $t_{n+1}$, the price of X varies by $\delta x_n$. This generates a profit (or a loss) for the investor equal to $\phi_n(x_n)\,\delta x_n$. Note that the change of wealth is not equal to $\delta(\phi x) = \phi\,\delta x + x\,\delta\phi$, since the second term only corresponds to converting some stocks into cash, or vice versa, but not to a real change of wealth. The wealth difference between time $t = 0$ and time $t_N = T = N\tau$, due to the trading of asset X is, for zero interest rates, equal to:

$$\Delta W_X = \sum_{k=0}^{N-1} \phi_k(x_k)\,\delta x_k, \qquad \delta x_k = x_{k+1} - x_k.$$
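As a small numerical illustration of this wealth balance, the sketch below computes the trading profit $\sum_k \phi_k\,\delta x_k$ for a short price path and compares it with the naive change in the value of the position, $\phi x$, which also counts conversions between stock and cash. The function names, the price path and the positions are hypothetical and only serve the illustration.

```python
def trading_pnl(prices, positions):
    """Profit sum_k phi_k * (x_{k+1} - x_k), with zero interest rates.

    prices: x_0 .. x_N, positions: phi_0 .. phi_{N-1}.
    """
    if len(prices) != len(positions) + 1:
        raise ValueError("need exactly one more price than positions")
    return sum(phi * (x_next - x)
               for phi, x, x_next in zip(positions, prices[:-1], prices[1:]))


def position_value_change(prices, positions):
    """Naive quantity phi_{N-1} * x_N - phi_0 * x_0 (not a true change of wealth)."""
    return positions[-1] * prices[-1] - positions[0] * prices[0]


# Hypothetical example: three time steps.
prices = [100.0, 102.0, 101.0, 105.0]
positions = [1.0, 2.0, 0.5]

print(trading_pnl(prices, positions))            # 1*2 + 2*(-1) + 0.5*4 = 2.0
print(position_value_change(prices, positions))  # 0.5*105 - 1*100 = -47.5
```

The two numbers differ because buying or selling shares at the current price moves money between stock and cash without changing the wealth.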
Since the trading strategy $\phi_n(x_n)$ can only be chosen before the price actually changes, the absence of correlations means that the average value of $\Delta W_X$ (with respect to the historical probability distribution) reads:

$$\langle \Delta W_X \rangle = \sum_{n=0}^{N-1} \langle \phi_n(x_n)\,\delta x_n \rangle = \tilde m \tau \sum_{n=0}^{N-1} \langle x_n\,\phi_n(x_n) \rangle, \qquad (4.3)$$

where we have introduced the average return $\tilde m$ of the asset X, defined as:

$$\delta x_n \equiv \eta_n x_n, \qquad \tilde m \tau \equiv \langle \eta_n \rangle.$$

The above equation (4.3) thus means that the average value of the profit is fixed by the average return of the asset, weighted by the level of investment across the considered period. We shall often use, in the following, an additive model (more adapted on short time scales, cf. Section 2.2.1) where $\delta x_n$ is rather written as $\delta x_n = \eta_n x_0$. Correspondingly, the average return over the time interval $\tau$ reads: $m_1 = \langle \delta x \rangle = \tilde m \tau x_0$. This approximation is usually justified as long as the total time interval $T$ corresponds to a small (average) relative increase of price: $\tilde m T \ll 1$. We will often denote as $m$ the average return per unit time: $m = m_1/\tau$.

Trading in the presence of temporal correlations

It is interesting to investigate the case where correlations are not zero. For simplicity, we shall assume that the fluctuations $\delta x_n$ are stationary Gaussian variables of zero mean ($m_1 = 0$). The correlation function is then given by $\langle \delta x_n \delta x_k \rangle \equiv C_{nk}$. The $C^{-1}_{nk}$'s are the elements of the matrix inverse of $C$. If one knows the sequence of past increments $\delta x_0, \ldots, \delta x_{n-1}$, the distribution of the next $\delta x_n$ conditioned to such an observation is simply given by:

$$P(\delta x_n) = \mathcal{N} \exp\left[-\frac{C^{-1}_{nn}}{2}\,(\delta x_n - m_n)^2\right], \qquad m_n \equiv -\frac{\sum_{i=0}^{n-1} C^{-1}_{in}\,\delta x_i}{C^{-1}_{nn}}, \qquad (4.5)$$

where $\mathcal{N}$ is a normalization factor and $m_n$ is the mean of $\delta x_n$ conditioned to the past. A simple strategy which exploits these correlations is to choose $\phi_n = \mathrm{sign}(m_n)$, which means that one buys (resp. sells) one stock if the expected next increment is positive (resp. negative). The average profit is then obviously given by $\langle |m_n| \rangle > 0$. We further assume that the correlations are short ranged, such that only $C^{-1}_{nn}$ and $C^{-1}_{n,n-1}$ are nonzero. The unit time interval is then the above correlation time $\tau$. If this strategy is used during the time $T = N\tau$, the average profit is given by:

$$\langle \Delta W_X \rangle = N a \langle |\delta x| \rangle \quad \text{where} \quad a \equiv \left| \frac{C^{-1}_{n,n-1}}{C^{-1}_{nn}} \right| \qquad (4.8)$$

($a$ does not depend on $n$ for a stationary process). With typical values, $\tau = 30$ min, $\langle |\delta x| \rangle \simeq 10^{-3} x_0$, and $a$ of about 0.1 (cf. Section 2.2.2), the average profit would reach 50% annually! Hence, the presence of correlations (even rather weak) allows in principle one to make rather substantial gains. However, one should remember that some transaction costs are always present in some form or other (for example the bid-ask spread is a form of transaction cost). Since our assumption is that the correlation time is equal to $\tau$, it means that the frequency with which our trader has to 'flip' his strategy (i.e. $\phi \to -\phi$) is $\tau^{-1}$. One should thus subtract from Eq. (4.8) a term on the order of $-N\nu x_0$, where $\nu$ represents the fractional cost of a transaction. A very small value of the order of $10^{-4}$ is thus enough to cancel completely the profit resulting from the observed correlations on the markets. Interestingly enough, the 'basis point' ($10^{-4}$) is precisely the order of magnitude of the friction faced by traders with the most direct access to the markets. More generally, transaction costs allow the existence of correlations, and thus of 'theoretical' inefficiencies of the markets. The strength of the 'allowed' correlations on a certain time $T$ is on the order of the transaction costs $\nu$ divided by the volatility of the asset on the scale of $T$.

At this stage, it would appear that devising a 'theory' for trading is useless: in the absence of correlations, speculating on stock markets is tantamount to playing the roulette (the zero playing the role of transaction costs!). In fact, as will be discussed in the present chapter, the existence of riskless assets such as bonds, which yield a known return, and the emergence of more complex financial products such as futures contracts or options, demand an adapted theory for pricing and hedging. This theory turns out to be predictive, even in the complete absence of temporal correlations.
[...] if the price of X is $x_0$ at time $t = 0$. This price, that we shall call the 'Bachelier' price, is not satisfactory since the seller takes the risk that the price at expiry $x(T)$ ends up far above $\langle x(T) \rangle$, which could prompt him to increase his price above $\mathcal{F}_B$.

Actually, the Bachelier price $\mathcal{F}_B$ is not related to the market price for a simple reason: the seller of the contract can suppress his risk completely if he buys now the underlying asset X at price $x_0$ and waits for the expiry date. However, this strategy is costly: the amount of cash frozen during that period does not yield the riskless interest rate. The cost of the strategy is thus $x_0 e^{rT}$, where $r$ is the interest rate per unit time. From the view point of the buyer, it would certainly be absurd to pay more than $x_0 e^{rT}$, which is the cost of borrowing the cash needed to pay the asset right away. The only viable price for the forward contract is thus $\mathcal{F} = x_0 e^{rT} \neq \mathcal{F}_B$, and is, in this particular case, completely unrelated to the potential moves of the asset X!

An elementary argument thus allows one to know the price of the forward contract and to follow a perfect hedging strategy: buy a quantity $\phi = 1$ of the underlying asset during the whole life of the contract. The aim of the next paragraph is to establish this rather trivial result, which has not required any maths, in a much more sophisticated way. The importance of this procedure is that one needs to [...]

In practice 'futures' contracts are more common than forwards. While forwards are over-the-counter contracts, futures are traded on organized markets. For forwards there are typically no payments from either side before the expiration date whereas futures are marked-to-market and compensated every day, meaning that payments are made daily by one side or the other to bring the value of the contract back to zero.

Note that if the contract was to be paid now, its price would be exactly equal to that of the underlying asset (barring the risk of delivery default).

The time evolution of $W_n$ is due both to the change of price of the asset X, and to the fact that the bond yields a known return through the interest rate $\rho$:

$$W_{n+1} - W_n = \phi_n (x_{n+1} - x_n) + \rho B_n.$$

On the other hand, the amount of capital in bonds evolves both mechanically, through the effect of the interest rate ($+B_n\rho$), but also because the quantity of stock does change in time ($\phi_n \to \phi_{n+1}$), and causes a flow of money from bonds to stock or vice versa. Hence:

$$B_{n+1} - B_n = \rho B_n - x_{n+1} (\phi_{n+1} - \phi_n).$$

Note that Eq. (4.11), $W_n \equiv \phi_n x_n + B_n$, is obviously compatible with the two preceding equations. The solution of the second equation reads:

$$B_n = (1+\rho)^n B_0 - \sum_{k=1}^{n} x_k (\phi_k - \phi_{k-1}) (1+\rho)^{n-k}.$$

Plugging this result in Eq. (4.11), and renaming $k - 1 \to k$ in the second part of the sum, one finally ends up with the following central result for $W_n$:

$$W_n = W_0 (1+\rho)^n + \sum_{k=0}^{n-1} \psi_k^n\, (x_{k+1} - x_k - \rho x_k), \qquad (4.15)$$

with $\psi_k^n \equiv \phi_k (1+\rho)^{n-k-1}$. This last expression has an intuitive meaning: the gain or loss incurred between time $k$ and $k + 1$ must include the cost of the hedging strategy $-\rho x_k$; furthermore, this gain or loss must be forwarded up to time $n$ through interest rate effects, hence the extra factor $(1+\rho)^{n-k-1}$. Another useful way to write and understand Eq. (4.15) is to introduce the discounted prices $\tilde x_k \equiv x_k (1+\rho)^{-k}$. One then has:

$$W_n = (1+\rho)^n \left[ W_0 + \sum_{k=0}^{n-1} \phi_k\, (\tilde x_{k+1} - \tilde x_k) \right].$$

The effect of interest rates can be thought of as an erosion of the value of the money itself. The discounted prices $\tilde x_k$ are therefore the 'true' prices, and the difference $\tilde x_{k+1} - \tilde x_k$ the 'true' change of wealth. The overall factor $(1+\rho)^n$ then converts this true wealth into the current value of the money.

The global balance associated with the forward contract contains two further terms: the price of the forward $\mathcal{F}$ that one wishes to determine, and the price of the underlying asset at the delivery date. Hence, finally:

$$W_N = \mathcal{F} - x_N + (1+\rho)^N \left[ W_0 + \sum_{k=0}^{N-1} \phi_k\, (\tilde x_{k+1} - \tilde x_k) \right]. \qquad (4.17)$$

Since by identity $\tilde x_N = \sum_{k=0}^{N-1} (\tilde x_{k+1} - \tilde x_k) + x_0$, this last expression can also be written as:

$$W_N = \mathcal{F} + (1+\rho)^N (W_0 - x_0) + (1+\rho)^N \sum_{k=0}^{N-1} (\phi_k - 1)\, (\tilde x_{k+1} - \tilde x_k).$$

4.2.3 Riskless hedge

In this last formula, all the randomness, the uncertainty on the future evolution of the prices, only appears in the last term. But if one chooses $\phi_k$ to be identically equal to one, then the global balance is completely independent of the evolution of the stock price. The final result is not random, and reads:

$$W_N = \mathcal{F} + (1+\rho)^N (W_0 - x_0). \qquad (4.19)$$

Now, the wealth of the writer of the forward contract at time $T = N\tau$ cannot be, with certitude, greater (or less) than the one he would have had if the contract had not been signed, i.e. $W_0 (1+\rho)^N$. If this was the case, one of the two parties would be losing money in a totally predictable fashion (since Eq. (4.19) does not contain any unknown term). Since any reasonable participant is reluctant to give away his money with no hope of return, one concludes that the forward price is indeed given by:

$$\mathcal{F} = x_0 (1+\rho)^N \simeq x_0 e^{rT} \neq \mathcal{F}_B, \qquad (4.20)$$

which does not rely on any statistical information on the price of X!

In the case of a stock that pays a constant dividend rate $\delta = d\tau$ per interval of time $\tau$, the global wealth balance is obviously changed into:

$$W_N = \mathcal{F} - x_N + W_0 (1+\rho)^N + \sum_{k=0}^{N-1} \psi_k^N\, (x_{k+1} - x_k + (\delta - \rho) x_k).$$

It is easy to redo the above calculations in this case. One finds that the riskless strategy is now to hold:

$$\phi_k \equiv \frac{(1+\rho-\delta)^{N-k-1}}{(1+\rho)^{N-k-1}}$$

stocks. The wealth balance breaks even if the price of the forward is set to:

$$\mathcal{F} = x_0 (1+\rho-\delta)^N \simeq x_0 e^{(r-d)T},$$

which again could have been obtained using a simple no arbitrage argument of the type presented below.

Variable interest rates

In reality, the interest rate is not constant in time but rather also varies randomly. More precisely, as explained in Section 2.6, at any instant of time the whole interest rate curve for different maturities is known, but evolves with time. The generalization of the global balance, as given by formula, Eq. (4.17), depends on the maturity of the bonds included in the portfolio, and thus on the whole interest rate curve. Assuming that only short-term bonds are included, yielding the (time-dependent) 'spot' rate $\rho_k$, one has:

$$W_N = \mathcal{F} - x_N + W_0 \prod_{k=0}^{N-1} (1+\rho_k) + \sum_{k=0}^{N-1} \psi_k^N\, (x_{k+1} - x_k - \rho_k x_k),$$

with: $\psi_k^N \equiv \phi_k \prod_{\ell=k+1}^{N-1} (1+\rho_\ell)$. It is again quite obvious that holding a quantity $\phi_k \equiv 1$ of the underlying asset leads to zero risk in the sense that the fluctuations of X disappear. However, the above strategy is not immune to interest rate risk. The interesting complexity of interest rate problems (such as the pricing and hedging of interest rate derivatives) comes from the fact that one may choose bonds of arbitrary maturity to construct the hedging strategy. In the present case, one may take a bond of maturity equal to that of the forward. In this case, risk disappears entirely, and the price of the forward reads:

$$\mathcal{F} = \frac{x_0}{B(0, N)},$$

where $B(0, N)$ stands for the value, at time 0, of the bond maturing at time $N$.
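The riskless hedge of Section 4.2.3 can be checked with a short simulation. The sketch below is only an illustration under stated assumptions: it keeps explicit accounts of bonds and stock for the writer of the forward, with a constant rate $\rho$ per time step, and draws hypothetical Gaussian random-walk price paths (the function names and parameter values are not from the text). With $\phi_k \equiv 1$ and $\mathcal{F} = x_0(1+\rho)^N$, the excess wealth $W_N - W_0(1+\rho)^N$ should vanish for every path, up to rounding.

```python
import random


def writer_final_wealth(prices, phi, rho, w0, forward_price):
    """Final wealth of the writer of a forward, by step-by-step accounting.

    prices: x_0 .. x_N, phi: phi_0 .. phi_{N-1}, rho: interest rate per step.
    """
    n_steps = len(phi)
    bonds = w0 - phi[0] * prices[0]          # buy phi_0 shares at t = 0
    for k in range(n_steps):
        bonds *= 1.0 + rho                   # interest earned over step k
        if k + 1 < n_steps:                  # rebalance at the new price x_{k+1}
            bonds -= prices[k + 1] * (phi[k + 1] - phi[k])
    stock_value = phi[-1] * prices[-1]
    # At delivery the writer receives F and hands over one unit of the asset.
    return bonds + stock_value + forward_price - prices[-1]


def random_path(x0, n_steps, rng):
    """Hypothetical additive random walk with unit-variance Gaussian increments."""
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path


rng = random.Random(0)
x0, n_steps, rho, w0 = 100.0, 50, 0.0002, 1000.0
fair_price = x0 * (1.0 + rho) ** n_steps

for _ in range(3):
    path = random_path(x0, n_steps, rng)
    hedged = writer_final_wealth(path, [1.0] * n_steps, rho, w0, fair_price)
    unhedged = writer_final_wealth(path, [0.0] * n_steps, rho, w0, fair_price)
    reference = w0 * (1.0 + rho) ** n_steps
    print(f"hedged excess: {hedged - reference:+.2e}   "
          f"unhedged excess: {unhedged - reference:+.4f}")
```

Without the hedge ($\phi_k \equiv 0$) the excess wealth equals $\mathcal{F} - x_N$ and fluctuates from path to path, which is the risk that the strategy $\phi_k \equiv 1$ removes.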
4.2.4 Conclusion: global balance and arbitrage

From the simple example of forward contracts, one should bear in mind the following points, which are the key concepts underlying the derivative pricing theory as presented in this book. After writing down the complete financial balance, taking into account the trading of all assets used to cover the risk, it is quite natural (at least from the view point of the writer of the contract) to determine the trading strategy for all the hedging assets so as to minimize the risk associated to the contract. After doing so, a reference price is obtained by demanding that the global balance is zero on average, corresponding to a fair price from the point of view of both parties. In certain cases (such as the forward contracts described above), the minimum risk is zero and the true market price cannot differ from the fair price, or else arbitrage would be possible. On the example of forward contracts, the price, Eq. (4.20), indeed corresponds to the absence of arbitrage opportunities (AAO), that is, of riskless profit. Suppose for example that the price of the forward is below $\mathcal{F} = x_0(1+\rho)^N$. One can then sell the underlying asset now at price $x_0$ and simultaneously buy the forward at a price $\mathcal{F}' < \mathcal{F}$, that must be paid for on the delivery date. The cash $x_0$ is used to buy bonds with a yield rate $\rho$. On the expiry date, the forward contract is used to buy back the stock and close the position. The wealth of the trader is then $x_0(1+\rho)^N - \mathcal{F}'$, which is positive under our assumption and furthermore fully determined at time zero: there is profit, but no risk. Similarly, a price $\mathcal{F}' > \mathcal{F}$ would also lead to an opportunity of arbitrage. More generally, if the hedging strategy is perfect, this means that the final wealth is known in advance. Thus, increasing the price as compared to the fair price leads to a riskless profit for the seller of the contract, and vice versa. This AAO principle is at the heart of most derivative pricing theories currently used. Unfortunately, this principle cannot be used in the general case, where the minimal risk is nonzero, or when transaction costs are present (and absorb the potential profit, see the discussion in Section 4.1.2 above). When the risk is nonzero, there exists a fundamental ambiguity in the price, since one should expect that a risk premium is added to the fair price (for example, as a bid-ask spread). This risk premium depends both on the risk-averseness of the market maker and on the liquidity of the derivative market: if the price asked by one market maker is too high, less greedy market makers will make the deal. This mechanism does however not operate for 'over-the-counter' operations (OTC, that is between two individual parties, as opposed to through an organized market). We shall come back to this important discussion in Section 4.6 below.

Let us emphasize that the proper accounting of all financial elements in the wealth balance is crucial to obtain the correct fair price. For example, we have seen above that if one forgets the term corresponding to the trading of the underlying stock, one ends up with the intuitive, but wrong, Bachelier price, Eq. (4.10).

4.3 Options: definition and valuation

4.3.1 Setting the stage

A buy option (or 'call' option) is nothing but an insurance policy, protecting the owner against the potential increase of the price of a given asset, which he will need to buy in the future. The call option provides to its owner the certainty of not paying the asset more than a certain price. Symmetrically, a 'put' option protects against drawdowns, by insuring to the owner a minimum value for his stock.

More precisely, in the case of a so-called 'European' option,11 the contract is such that at a given date in the future (the 'expiry date' or 'maturity') $t = T$, the owner of the option will not pay the asset more than $x_s$ (the 'exercise price', or 'strike' price): the possible difference between the market price at time $T$, $x(T)$, and $x_s$ is taken care of by the writer of the option. Knowing that the price of the underlying asset is $x_0$ now (i.e. at $t = 0$), what is the price (or 'premium') $\mathcal{C}$ of the call? What is the optimal hedging strategy that the writer of the option should follow in order to minimize his risk?

11 Many other types of options exist: 'American', 'Asian', 'Lookback', 'Digitals', etc., see [Wilmott]. We will discuss some of them in Chapter 5.
12 In practice, however, the influence of the trading strategy on the price (but not on the risk!) is quite small, see below.

The very first scientific theory of option pricing dates back to Bachelier in 1900. His proposal was, following a fair game argument, that the option price should equal the average value of the payoff of the contract at expiry. Bachelier calculated this average by assuming the price increments $\delta x_n$ to be independent Gaussian random variables, which leads to the formula, Eq. (4.43) below. However, Bachelier did not discuss the possibility of hedging, and therefore did not include in his wealth balance the term corresponding to the trading strategy that we have discussed in the previous section. As we have seen, this is precisely the term responsible for the difference between the forward 'Bachelier price' $\mathcal{F}_B$ (cf. Eq. (4.10)) and the true price, Eq. (4.20). The problem of the optimal trading strategy must thus, in principle, be solved before one can fix the price of the option.12 This is the problem that was solved by Black and Scholes in 1973, when they showed that for a continuous-time Gaussian process, there exists a perfect strategy, in the sense that the risk associated to writing an option is strictly zero, as is the case for forward contracts. The determination of this perfect hedging strategy allows one to fix completely the price of the option using an AAO argument. Unfortunately, as repeatedly discussed below, this ideal strategy only exists in
a continuous-time, Gaussian world,13 that is, if the market correlation time was infinitely short, and if no discontinuities in the market price were allowed, both assumptions rather remote from reality. The hedging strategy cannot, in general, be perfect. However, an optimal hedging strategy can always be found, for which the risk is minimal (cf. Section 4.4). This optimal strategy thus allows one to calculate the fair price of the option and the associated residual risk. One should nevertheless bear in mind that, as emphasized in Sections 4.2.4 and 4.6, there is no such thing as a unique option price whenever the risk is nonzero.

Let us now discuss these ideas more precisely. Following the method and notations introduced in the previous section, we write the global wealth balance of the writer of the option as:

$$W_N = [W_0 + \mathcal{C}](1+\rho)^N - \max(x_N - x_s, 0) + \sum_{k=0}^{N-1} \psi_k^N\, (x_{k+1} - x_k - \rho x_k). \qquad (4.27)$$

A crucial difference with forward contracts comes from the nonlinear nature of the payoff, which is equal, in the case of a European option, to $\mathcal{Y}(x_N) \equiv \max(x_N - x_s, 0)$. This must be contrasted with the forward payoff, which is linear (and equal to $x_N$). It is ultimately the nonlinearity of the payoff which, combined with the non-Gaussian nature of the fluctuations, leads to a nonzero residual risk.

An equation for the call price $\mathcal{C}$ is obtained by requiring that the excess return due to writing the option, $\Delta W = W_N - W_0(1+\rho)^N$, is zero on average:

$$(1+\rho)^N \mathcal{C} = \langle \max(x_N - x_s, 0) \rangle - \sum_{k=0}^{N-1} \langle \psi_k^N\, (x_{k+1} - x_k - \rho x_k) \rangle.$$

This price therefore depends, in principle, on the strategy $\psi_k^N = \phi_k (1+\rho)^{N-k-1}$. This price corresponds to the fair price, to which a risk premium will in general be added (for example in the form of a bid-ask spread).

In the rather common case where the underlying asset is not a stock but a forward on the stock, the hedging strategy is less costly since only a small fraction $f$ of the value of the stock is required as a deposit. In the case where $f = 0$, the wealth balance appears to take a simpler form, since the interest rate is not lost while trading the underlying forward contract:

However, one can check that if one expresses the price of the forward in terms of the underlying stock, Eqs (4.27) and (4.28) are actually identical. (Note in particular that $\mathcal{F}_N = x_N$.)

[...] Taking $T = 100$ days, a daily volatility of $\sigma = 1\%$, an annual return of $m = 5\%$, and a stock such that $x_0 = 100$ points, one gets:

$$\sqrt{\frac{DT}{2\pi}} \simeq 4 \ \text{points}, \qquad \frac{mT}{2} \simeq 0.67 \ \text{points}.$$

13 A property also shared by a discrete binomial evolution of prices, where at each time step, the price increment $\delta x$ can only take two values, see Appendix E. However, the risk is nonzero as soon as the number of possible price changes exceeds two.
14 We again neglect the difference between Gaussian and log-normal statistics in the following order of magnitude estimate. See Sections 1.3.2, 2.2.1, Eq. (4.43) and Figure 4.1 below.
15 An option is called 'at-the-money' if the strike price is equal to the current price of the underlying stock ($x_s = x_0$), 'out-of-the-money' if $x_s > x_0$ and 'in-the-money' if $x_s < x_0$.

In fact, the effect of a nonzero average return of the stock is much less than the above estimation would suggest. The reason is that one must also take into account the last term of the balance equation, Eq. (4.27), which comes from the hedging strategy. This term corrects the price by an amount $-\langle \phi \rangle mT$. Now, as we shall see later, the optimal strategy for an at-the-money option15 is to hold, on average, $\langle \phi \rangle = \frac{1}{2}$ stocks per option. Hence, this term precisely compensates the increase (equal to $mT/2$) of the average payoff, $\langle \max(x(T) - x_s, 0) \rangle$. This strange compensation is actually exact in the Black-Scholes model (where the price of
the option turns out to be completely independent of the value of $m$) but only approximate in more general cases. However, it is a very good first approximation to neglect the dependence of the option price in the average return of the stock: we shall come back to this point in detail in Section 4.5.

The interest rate appears in two different places in the balance equation, Eq. (4.27): in front of the call premium $\mathcal{C}$, and in the cost of the trading strategy. For $\rho = r\tau$ corresponding to 5% per year, the factor $(1+\rho)^{-N}$ corrects the option price by a very small amount, which, for $T = 100$ days, is equal to 0.06 points. However, the cost of the trading strategy is not negligible. Its order of magnitude is $\sim \langle \phi \rangle x_0 rT$: in the above numerical example, it corresponds to a price increase of $\frac{2}{3}$ of a point (16% of the option price), see Section 5.1.

[...] In this case, the hedging strategy disappears from the price of the option, which is given by the following 'Bachelier'-like formula, generalized to the case where the increments are not necessarily Gaussian:

$$\mathcal{C} = (1+\rho)^{-N} \langle \max(x_N - x_s, 0) \rangle \equiv (1+\rho)^{-N} \int_{x_s}^{\infty} dx\, (x - x_s)\, P(x, N|x_0, 0). \qquad (4.33)$$

[...] After changing variables to $x \to x_0 (1+\rho)^N e^y$, the formula, Eq. (4.33), is transformed into:

$$\mathcal{C} = x_0 \int_{y_s}^{\infty} dy\, (e^y - e^{y_s})\, P_N(y),$$

where $y_s \equiv \log(x_s / x_0[1+\rho]^N)$ and $P_N(y) \equiv P(y, N|0, 0)$. Note that $y_s$ involves the ratio of the strike price to the forward price of the underlying asset, $x_0[1+\rho]^N$.

Setting $x_k = x_0 (1+\rho)^k e^{y_k}$, the evolution of the $y_k$'s is given by:

$$y_{k+1} - y_k \simeq \frac{\eta_k}{1+\rho} - \frac{\eta_k^2}{2},$$

where third-order terms ($\eta^3$, $\eta^2\rho$, ...) have been neglected. The distribution of the [...]

The Black-Scholes model also corresponds to the limit where the elementary time interval $\tau$ goes to zero, in which case one of course has $N = T/\tau \to \infty$. As [...]

$$\mathcal{C}_{BS}(x_0, x_s, T) = x_0\, P_{G>}\!\left(\frac{y_-}{\sigma\sqrt{T}}\right) - x_s e^{-rT}\, P_{G>}\!\left(\frac{y_+}{\sigma\sqrt{T}}\right), \qquad (4.39)$$

where $y_\pm = \log(x_s/x_0) - rT \pm \sigma^2 T/2$ and $P_{G>}(u)$ is the cumulative normal distribution defined by Eq. (1.68). The way $\mathcal{C}_{BS}$ varies with the four parameters $x_0$, $x_s$, $T$ and $\sigma$ is quite intuitive, and discussed at length in all the books which deal with options. The derivatives of the price with respect to these parameters are now called the 'Greeks', because they are denoted by Greek letters. For example, the larger the price of the stock $x_0$, the more probable it is to reach the strike price at expiry, and the more expensive the option. Hence the so-called 'Delta' of the option ($\Delta = \partial\mathcal{C}/\partial x_0$) is positive. As we shall show below, this quantity is actually directly related to the optimal hedging strategy. The variation of $\Delta$ with $x_0$ is noted $\Gamma$ (Gamma) and is defined as: $\Gamma = \partial\Delta/\partial x_0$. Similarly, the dependence in maturity $T$ and volatility $\sigma$ (which in fact only appear through the combination $\sigma\sqrt{T}$ if $rT$ is small) leads to the definition of two further 'Greeks': $\Theta = -\partial\mathcal{C}/\partial T = \partial\mathcal{C}/\partial t < 0$ and 'Vega' $\mathcal{V} = \partial\mathcal{C}/\partial\sigma > 0$; the higher $\sigma\sqrt{T}$, the higher the call premium.

Bachelier's Gaussian limit

Suppose now that the price process is additive rather than multiplicative. This means that the increments $\delta x_k$ themselves, rather than the relative increments, should be considered as independent random variables. Using the change of variable $x_k = (1+\rho)^k \tilde x_k$ in Eq. (4.31), one finds that $x_N$ can be written as:

$$x_N = x_0 (1+\rho)^N + \sum_{k=0}^{N-1} \delta x_k\, (1+\rho)^{N-k-1}. \qquad (4.40)$$

When $N$ is large, the difference $x_N - x_0(1+\rho)^N$ becomes, according to the CLT, a Gaussian variable of mean zero (if $m = 0$) and of variance equal to:

$$c_2(T) = D\tau \sum_{\ell=0}^{N-1} (1+\rho)^{2\ell},$$

where $D\tau$ is the variance of the individual increments, related to the volatility through: $D \equiv \sigma^2 x_0^2$. The price, Eq. (4.33), then takes the following form:

$$\mathcal{C}_G(x_0, x_s, T) = e^{-rT} \int_{x_s}^{\infty} \frac{dx}{\sqrt{2\pi c_2(T)}}\, (x - x_s) \exp\left[-\frac{(x - x_0 e^{rT})^2}{2 c_2(T)}\right]. \qquad (4.42)$$

The price formula written down by Bachelier corresponds to the limit of short maturity options, where all interest rate effects can be neglected. In this limit, the price reduces to:

$$\mathcal{C}_G(x_0, x_s, T) \simeq \int_{x_s}^{\infty} \frac{dx}{\sqrt{2\pi DT}}\, (x - x_s) \exp\left[-\frac{(x - x_0)^2}{2DT}\right]. \qquad (4.43)$$

This equation can also be derived directly from the Black-Scholes price, Eq. (4.39), in the small maturity limit, where relative price variations are small: $x_N/x_0 - 1 \ll 1$, allowing one to write $y = \log(x/x_0) \simeq (x - x_0)/x_0$. As emphasized in Section 1.3.2, this is the limit where the Gaussian and log-normal distributions become very similar.

The dependence of $\mathcal{C}_G(x_0, x_s, T)$ as a function of $x_s$ is shown in Figure 4.1, where the numerical value of the relative difference between the Black-Scholes and Bachelier price is also plotted.

In a more general additive model, when $N$ is finite, one can reconstruct the full distribution $P(x, N|x_0, 0)$ from the elementary distribution $P_1(\delta x)$ using the convolution rule (slightly modified to take into account the extra factors $(1+\rho)^{N-k-1}$ in Eq. (4.40)). When $N$ is large, the Gaussian approximation $P(x, N|x_0, 0) = P_G(x, N|x_0(1+\rho)^N, 0)$ becomes accurate, according to the CLT discussed in Chapter 1.

As far as interest rate effects are concerned, it is interesting to note that both formulae, Eqs (4.39) and (4.42), can be written as $e^{-rT}$ times a function of $x_0 e^{rT}$, that is of the forward price $\mathcal{F}_0$. This is quite natural, since an option on a stock and on a forward must have the same price, since they have the same payoff at expiry. As will be discussed in Chapter 5, this is a rather general property, not related to any particular model for price fluctuations.

Dynamic equation for the option price

It is easy to show directly that the Gaussian distribution:

$$P_G(x, T|x_0, 0) = \frac{1}{\sqrt{2\pi DT}} \exp\left[-\frac{(x - x_0)^2}{2DT}\right] \qquad (4.44)$$

obeys the diffusion equation (or heat equation):

$$\frac{\partial P_G(x, T|x_0, 0)}{\partial T} = \frac{D}{2} \frac{\partial^2 P_G(x, T|x_0, 0)}{\partial x^2}, \qquad (4.45)$$

with the initial condition:

$$P_G(x, 0|x_0, 0) = \delta(x - x_0). \qquad (4.46)$$

On the other hand, since $P_G(x, T|x_0, 0)$ only depends on the difference $x - x_0$, one has:

$$\frac{\partial^2 P_G(x, T|x_0, 0)}{\partial x^2} = \frac{\partial^2 P_G(x, T|x_0, 0)}{\partial x_0^2}. \qquad (4.47)$$
4.3.4 Real option prices, volatility smile a~id'implied' kurtosis
Srationary di.~tribritionand rhe smile czme
We shall now come back to the case where the distribution of the price increments
8xk is arbitrary, for example a TLD. For simplicity, we set the interest rate p to
zero. (A nonzero interest rate can readily be taken into account by discounting
the price of the call on the forward contract.) The price difference is thus the
sum of N = T I T iid random variables, to which the discussion of the CLT
presented in Chapter I applies directly. For large N,the terminal price distribution
P ( x , N ( x o ,0 ) becomes Gaussian, while for N finite, 'fat tail'cffects are important
and the deviations from the CLT are noticeable. In particular, short maturity options
or outofthemoney options, or options on very 'jagged' assets, are not adequately
priced by the BlackScholes formula.
In practice, the market corrects for this difference empirically, by introducing
in the BlackScholes formula an ad hoe 'implied' volatility 27, different from
the 'true', historical volatility of the underlying asset. Furthermore, the value of
the implied volatility needed to price options of different strike prices x, and/or
maturities T properly is not constant, but rather depends both on T and x,. This
defines a "volatility surface' C(x,, T ) . It is usually observed that the Iarger the
difference between x, and xo, the larger the implied volatility: this is the socalled
'smile effect' (Fig.4.2). On the other hand, the longer the maturity T , the better
the Gaussian approximation; the smile thus tends to flatten out with maturity.
It is important to understand how traders on option markets use the Black
Fig. 4.1. Price of a call of maturity T = 100 days as a function of the strike price x,. in Scholes formula. Rather than viewing it as a predictive theory for option prices
the Bachelier model. The underlying stock is wort11 100, and the daily volatility is 1%. given an observed (historical) volatility, the BlackScholes formula allows the
The interest rate is taken to be zero. Inset: relative difference between the rhe lognormal, trader to translate back and forth between market prices (driven by supply and
BlackScholes price Css and the Gaussian. Bachelier price (CG),for the same values of the
parameters. demand) and an abstiact parameter called the implied volatility. In itself this
transformation (between prices and volatilities) does not add any new information.
 . *  ,
However, this approach is useful in practice, since the implied volatility varies
Taki~xgthe derivufive of Ey. ( 4 . 4 3 ) with respect to the niuturify T , one$finds much less than option prices as a function of maturity and moneyness. The
BlackScholes formula is thus viewed as a zeroth order approximation that
takes into account the gross features of option prices, the traders then correcting
for other effects by adjusting the implied volatility. This view is quite common
wit/? boundary condiiions, .for a zero nzuturit)~option: in experimental sciences where an a priori constant parameter of an approxi
mate theory is made into a varying effective parameter to describe more subtle
CG(xO.xi, 0) = maxIsu  x\.0). effects.
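The back-and-forth translation between option prices and implied volatilities described above can be illustrated with a short sketch. It assumes the standard zero-interest-rate Black-Scholes call formula and a plain bisection search; the function names and the numerical bounds are illustrative choices, not taken from the text. The parameters are those quoted in the caption of Fig. 4.1 (stock at 100, daily volatility 1%, maturity 100 days).

```python
import math


def norm_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def black_scholes_call(x0, xs, sigma, maturity):
    """Zero-interest-rate Black-Scholes call price (standard textbook form).

    sigma is the volatility per unit time and maturity is measured in the
    same time unit (days here).
    """
    total_vol = sigma * math.sqrt(maturity)
    if total_vol <= 0.0:
        return max(x0 - xs, 0.0)
    d1 = (math.log(x0 / xs) + 0.5 * total_vol ** 2) / total_vol
    d2 = d1 - total_vol
    return x0 * norm_cdf(d1) - xs * norm_cdf(d2)


def implied_volatility(price, x0, xs, maturity, lo=1e-9, hi=1.0, n_iter=200):
    """Invert the call price by bisection; the price increases with sigma.

    The bracket [lo, hi] is an assumption and must contain the answer.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if black_scholes_call(x0, xs, mid, maturity) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    x0, sigma, maturity = 100.0, 0.01, 100.0
    for xs in (90.0, 100.0, 110.0):
        price = black_scholes_call(x0, xs, sigma, maturity)
        vol = implied_volatility(price, x0, xs, maturity)
        print(f"strike {xs:6.1f}  price {price:8.4f}  implied vol {vol:.6f}")
```

Feeding a market price instead of a model price into `implied_volatility` gives the quantity that traders quote; by construction the round trip adds no information, it only changes the variable in which the price is expressed.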
where $D \equiv \sigma^2 x_0^2$ and $\delta C_\kappa = C_\kappa - C_{\kappa=0}$.

One can indeed transform the formula

After changing variables $x' \to x' - x_0$, and using Eq. (1.69) of Section 1.6.3, one gets:

and

the integrations over $u$ are readily performed, yielding:

A small change $\delta\sigma$ of the volatility changes the Gaussian price by:

$$\delta C_{\kappa=0}(x_0, x_s, T) = \delta\sigma\, x_0 \sqrt{\frac{T}{2\pi}}\, \exp\left[-\frac{(x_s - x_0)^2}{2DT}\right].$$

The effect of a nonzero kurtosis $\kappa_1$ can thus be reproduced (to first order) by a Gaussian pricing formula, but at the expense of using an effective volatility $\Sigma(x_s, T) = \sigma + \delta\sigma$ given by:

$$\Sigma(x_s, T) = \sigma\left[1 + \frac{\kappa(T)}{24}\left(\frac{(x_s - x_0)^2}{DT} - 1\right)\right], \qquad (4.58)$$

with $\kappa(T) = \kappa_1/N$. This very simple formula, represented in Figure 4.2, allows one to understand intuitively the shape and amplitude of the smile. For example, for a daily kurtosis of $\kappa_1 = 10$, the volatility correction is on the order of $\delta\sigma/\sigma \simeq 17\%$ for out-of-the-money options such that $x_s - x_0 = 3\sqrt{DT}$, and for $T = 20$ days. Note that the effect of the kurtosis is to reduce the implied at-the-money volatility in comparison with its true value.

17 We also assume here that the distribution is symmetrical ($P_1(\delta x) = P_1(-\delta x)$), which is usually justified on short time scales. If this assumption is not adequate, one must include the skewness $\lambda_3$, leading to an asymmetrical smile, cf. Eq. (4.56).

Fig. 4.2. 'Implied volatility' $\Sigma$ as used by the market to account for non-Gaussian effects. The larger the difference between $x_s$ and $x_0$, the larger the value of $\Sigma$: the curve has the shape of a smile. The function plotted here is a simple parabola, which is obtained theoretically by including the effect of a nonzero kurtosis $\kappa_1$ of the elementary price increments. Note that for at-the-money options ($x_s = x_0$), the implied volatility is smaller than the true volatility.

Figure 4.3 shows some 'experimental' data, concerning options on the Bund (futures) contract, for which a weakly non-Gaussian model is satisfactory (cf. Section 2.3). The associated option market is, furthermore, very liquid; this tends to reduce the width of the bid-ask spread, and thus, probably, to decrease the difference between the market price and a theoretical 'fair' price. This is not the case for OTC options, when the overhead charged by the financial institution writing the option can in some cases be very large, cf. Section 4.4.1 below. In Figure 4.3, each point corresponds to an option of a certain maturity and strike price. The coordinates of these points are the theoretical price, Eq. (4.34), along the $x$ axis (calculated using the historical distribution of the Bund), and the observed market price along the $y$ axis. If the theory is good, one should observe a cloud of points concentrated around the line $y = x$. Figure 4.3 includes points from the first half of 1995, which was rather 'calm', in the sense that the volatility remained roughly constant during that period (see Fig. 4.4). Correspondingly, the assumption that the process is stationary is reasonable.

The agreement between theoretical and observed prices is however much less convincing if one uses data from 1994, when the volatility of interest rate markets was high. The following subsection aims at describing these discrepancies in greater detail.

Fig. 4.3. Comparison between theoretical and observed option prices. Each point corresponds to an option on the Bund (traded on the LIFFE in London), with different strike prices and maturities. The $x$ coordinate of each point is the theoretical price of the option (determined using the empirical terminal price distribution), while the $y$ coordinate is the market price of the option. If the theory is good, one should observe a cloud of points concentrated around the line $y = x$. The dotted line corresponds to a linear regression, and gives $y = 0.998x + 0.02$ (in basis points units).

where $P_{10}$ is a distribution independent of $k$, and of width (for example measured by the MAD) normalized to one. The absolute volatility of the asset is thus proportional to $\gamma_k$. Note that if $P_{10}$ is Gaussian and time is continuous, this model is known as the stochastic volatility Brownian motion.19 However, this assumption is not needed, and one can keep $P_{10}$ arbitrary.

Figure 4.4 shows the evolution of the scale $\gamma$ (filtered over 5 days in the past) as a function of time and compares it to the implied volatility $\Sigma(x_s = x_0)$ extracted from at-the-money options during the same period. One thus observes that the option prices are rather well tracked by adjusting the factor $\gamma_k$ through a short-term estimate of the volatility of the underlying asset. This means that the option prices primarily reflect the quasi-instantaneous volatility of the underlying asset itself, rather than an 'anticipated' average volatility over the lifetime of the option.

It is interesting to notice that the mere existence of volatility fluctuations leads to a nonzero kurtosis $\kappa_N$ of the asset fluctuations (see Section 2.4). This kurtosis has an anomalous time dependence (cf. Eq. (2.17)), in agreement with the direct observation of the historical kurtosis (Fig. 4.6). On the other hand, an 'implied kurtosis' can be extracted from the market price of options, using Eq. (4.58) as a fit to the empirical smile, see Figure 4.5. Remarkably enough, this implied kurtosis is in close correspondence with the historical kurtosis; note that Figure 4.6 does not require any further adjustable parameter.

As a conclusion of this section, it seems that market operators have empirically corrected the Black-Scholes formula to account for two distinct, but related effects:

• The presence of market jumps, implying fat-tailed distributions ($\kappa > 0$) of short-term price increments. This effect is responsible for the volatility smile (and also, as we shall discuss next, for the fact that options are risky).

• The fact that the volatility is not constant, but fluctuates in time, leading to an anomalously slow decay of the kurtosis (slower than $1/N$) and, correspondingly, to a non-trivial deformation of the smile with maturity.

18 This hypothesis can be justified by assuming that the amplitude of the market moves is subordinated to the volume of transactions, which itself is obviously time dependent.

19 In this context, the fact that the volatility is time dependent is called 'heteroskedasticity'. ARCH models (Auto Regressive Conditional Heteroskedasticity) and their relatives have been invented as a framework to model such effects. See Section 2.9.
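The orders of magnitude quoted for the kurtosis-induced smile can be checked with a few lines of code. The sketch below assumes that the parabolic smile of Eq. (4.58) has the form $\Sigma = \sigma[1 + (\kappa(T)/24)((x_s - x_0)^2/DT - 1)]$ with $\kappa(T) = \kappa_1/N$ and $D = \sigma^2 x_0^2$. This form is inferred from the numerical example in the text (a 17% correction at three standard deviations for $\kappa_1 = 10$ and $T = 20$ days) and from the caption of Fig. 4.2; the function and argument names are hypothetical.

```python
import math


def smile_implied_volatility(xs, x0, sigma, kappa1, n_steps):
    """Parabolic smile induced by a small kurtosis (assumed form of Eq. (4.58)).

    sigma   : true volatility per elementary time step (e.g. per day)
    kappa1  : kurtosis of the elementary price increments
    n_steps : maturity N = T / tau, in elementary time steps
    """
    kappa_t = kappa1 / n_steps                 # kurtosis at maturity, iid increments
    variance = (sigma * x0) ** 2 * n_steps     # D T, with D = sigma^2 x0^2 and tau = 1
    return sigma * (1.0 + kappa_t / 24.0 * ((xs - x0) ** 2 / variance - 1.0))


def relative_correction(moneyness_in_std, kappa1, n_steps, x0=100.0, sigma=0.01):
    """Relative shift of the implied volatility for xs - x0 = m * sqrt(D T)."""
    xs = x0 + moneyness_in_std * sigma * x0 * math.sqrt(n_steps)
    return smile_implied_volatility(xs, x0, sigma, kappa1, n_steps) / sigma - 1.0


if __name__ == "__main__":
    for m in (0.0, 1.0, 2.0, 3.0):
        shift = relative_correction(m, kappa1=10.0, n_steps=20)
        print(f"moneyness {m:.0f} sigma*sqrt(T): relative shift {shift:+.3f}")
```

With these inputs the at-the-money implied volatility is lowered by about 2%, the correction vanishes at one standard deviation, and it reaches about +17% at three standard deviations, which is the figure quoted above.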
Fig. 4.4. Time dependence of the scale parameter $\gamma$, obtained from the analysis of the intraday fluctuations of the underlying asset (here the Bund), or from the implied volatility of at-the-money options. More precisely, the historical determination of $\gamma$ comes from the daily average of the absolute value of the 5-min price increments. The two curves are then smoothed over 5 days. These two determinations are strongly correlated, showing that the option price mostly reflects the instantaneous volatility of the underlying asset itself.

Fig. 4.5. Implied Black-Scholes volatility and fitted parabolic smile. The circles correspond to all quoted prices on 26th April, 1995, on options of 1-month maturity. The error bars correspond to an error on the price of ±1 basis point. The curvature of the parabola allows one to extract the implied kurtosis $\kappa(T)$ using Eq. (4.58).

It is interesting to note that, through trial and error, the market as a whole has evolved to allow for such non-trivial statistical features, at least on most actively traded markets. This might be called 'market efficiency'; but contrarily to stock markets, where it is difficult to judge whether the stock price is or is not the 'true' price (which might be an empty concept), option markets offer a remarkable testing ground for this idea. It is also a nice example of adaptation of a population (the traders) to a complex and hostile environment, which has taken place in a few decades!

In summary, part of the information contained in the implied volatility surface $\Sigma(x_s, T)$ used by market participants can be explained by an adequate statistical model of the underlying asset fluctuations. In particular, in weakly non-Gaussian markets, an important parameter is the time-dependent kurtosis, see Eq. (4.58). The anomalous maturity dependence of this kurtosis encodes the fact that the volatility is itself time dependent.

4.4 Optimal strategy and residual risk

4.4.1 Introduction

In the above discussion, we have chosen a model of price increments such that the cost of the hedging strategy (i.e. the term $\langle \phi_k \delta x_k \rangle$) could be neglected, which is justified if the excess return $m$ is zero, or else for short maturity options. Besides the fact that the correction to the option price induced by a nonzero return is important to assess (this will be done in the next section), the determination of an 'optimal' hedging strategy and the corresponding minimal residual risk is crucial for the following reason. Suppose that an adequate measure of the risk taken by the writer of the option is given by the variance of the global wealth balance associated to the operation, i.e.:20

$$\mathcal{R} = \sqrt{\langle \Delta W^2[\phi] \rangle}. \qquad (4.60)$$

20 The case where a better measure of risk is the loss probability or the value-at-risk will be discussed below.

As we shall find below, there is a special strategy $\phi^*$ such that the above quantity reaches a minimum value. Within the hypotheses of the Black-Scholes model, this minimum risk is even, rather surprisingly, strictly zero. Under less restrictive and more realistic hypotheses, however, the residual risk $\mathcal{R}^*$ actually amounts to a substantial fraction of the option price itself. It is thus rather natural for the writer of the option to try to reduce the risk further by overpricing the option, adding to the 'fair price' a risk premium proportional to $\mathcal{R}^*$, in such a way that the probability of eventually losing money is reduced. Stated otherwise, option writing being an essentially risky operation, it should also be, on average, profitable.

Therefore, a market maker on option markets will offer to buy and to sell options at slightly different prices (the 'bid-ask' spread), centred around the fair price $C$. The amplitude of the spread is presumably governed by the residual risk, and is
Another way of expressing this idea is to use $\mathcal{R}^*$ as a scale to measure the difference between the market price of an option $C_M$ and its theoretical price:

$$\lambda = \frac{C_M - C}{\mathcal{R}^*}.$$

Fig. 4.8. Histogram of the global wealth balance $\Delta W$ associated to the writing of the same option, with a price still fixed such that $\langle \Delta W \rangle = 0$ (vertical line), with the same horizontal scale as in the previous figure. This figure shows the effect of adopting an optimal hedge ($\phi = \phi^*$), recalculated every half-hour, and in the absence of transaction costs. The RMS $\mathcal{R}^*$ is clearly smaller (= 0.28), but nonzero. Note that the distribution is skewed towards $\Delta W < 0$. Increasing the price of the option by $\lambda\mathcal{R}^*$ amounts to diminishing the probability of losing money (for the writer of the option), $P(\Delta W < 0)$.

4.4.2 A simple case

Let us now discuss how an optimal hedging strategy can be constructed, by focusing first on the simple case where the amount of the underlying asset held in the portfolio is fixed once and for all when the option is written, i.e. at $t = 0$. This extreme case would correspond to very high transaction costs, so that changing one's position on the market is very costly. We shall furthermore assume, for simplicity, that interest rate effects are negligible, i.e. $\rho = 0$. The global wealth balance, Eq. (4.28), then reads:

$$\Delta W = C - \max(x_N - x_s, 0) + \phi \sum_{k=0}^{N-1} \delta x_k. \qquad (4.62)$$

In the case where the average return is zero ($\langle \delta x_k \rangle = 0$) and where the increments are uncorrelated (i.e. $\langle \delta x_k \delta x_l \rangle = D\tau\,\delta_{k,l}$), the variance of the final wealth, $\mathcal{R}^2 = \langle \Delta W^2 \rangle - \langle \Delta W \rangle^2$, reads:

$$\mathcal{R}^2 = N D\tau\,\phi^2 - 2\phi\,\langle (x_N - x_0)\max(x_N - x_s, 0) \rangle + \mathcal{R}_0^2, \qquad (4.63)$$

where $\mathcal{R}_0$ is the intrinsic risk, associated to the 'bare', unhedged option ($\phi = 0$):

$$\mathcal{R}_0^2 = \langle \max(x_N - x_s, 0)^2 \rangle - \langle \max(x_N - x_s, 0) \rangle^2.$$

The dependence of the risk on $\phi$ is shown in Figure 4.9, and indeed reveals the existence of an optimal value $\phi = \phi^*$ for which $\mathcal{R}$ takes a minimum value. Taking the derivative of Eq. (4.63) with respect to $\phi$, one finds:

$$\phi^* = \frac{1}{D\tau N} \int_{x_s}^{\infty} (x - x_s)(x - x_0)\, P(x, N|x_0, 0)\, dx,$$

thereby fixing the optimal hedging strategy within this simplified framework. Interestingly, if $P(x, N|x_0, 0)$ is Gaussian (which becomes a better approximation as $N = T/\tau$ increases), one can write:

$$\frac{1}{DT} \int_{x_s}^{\infty} \frac{dx}{\sqrt{2\pi DT}}\, (x - x_s)(x - x_0) \exp\left[-\frac{(x - x_0)^2}{2DT}\right] = \int_{x_s}^{\infty} P_G(x, N|x_0, 0)\, dx,$$

where the right-hand side follows from an integration by parts, using $(x - x_0) P_G = -DT\, \partial P_G/\partial x$. In the Gaussian case, the optimal static hedge is therefore the probability that the option is exercised at expiry.
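The static hedge of this simple case lends itself to a direct numerical check. The sketch below assumes a Gaussian terminal distribution of variance $DT$ centred on $x_0$ and evaluates $\phi^* = (1/DT)\int_{x_s}^{\infty}(x - x_s)(x - x_0)P(x, N|x_0, 0)\,dx$ with a trapezoidal rule, then compares it with the probability that the option ends in the money. The function names, grid size and integration cutoff are illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def static_hedge_ratio(x0, xs, variance, n_grid=20000, width=10.0):
    """phi* = (1/DT) * integral over x > xs of (x - xs)(x - x0) P(x), trapezoidal rule.

    variance is D*T; the integral is cut off `width` standard deviations
    above max(x0, xs), where the Gaussian integrand is negligible.
    """
    upper = max(x0, xs) + width * math.sqrt(variance)
    step = (upper - xs) / n_grid
    total = 0.0
    for i in range(n_grid + 1):
        x = xs + i * step
        weight = 0.5 if i in (0, n_grid) else 1.0
        total += weight * (x - xs) * (x - x0) * gaussian_pdf(x, x0, variance)
    return total * step / variance


def exercise_probability(x0, xs, variance):
    """Probability that a Gaussian terminal price exceeds the strike."""
    return 0.5 * math.erfc((xs - x0) / math.sqrt(2.0 * variance))


if __name__ == "__main__":
    x0, variance = 100.0, 100.0   # D*T = 100, i.e. 1% daily volatility over 100 days
    for xs in (95.0, 100.0, 105.0):
        print(xs, static_hedge_ratio(x0, xs, variance), exercise_probability(x0, xs, variance))
```

For a non-Gaussian $P$, only `gaussian_pdf` needs to be replaced; the two columns then no longer coincide, which is one way to see that the identification of the hedge with an exercise probability is specific to the Gaussian case.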
Hence, buying a certain fraction of the underlying allows one to reduce the risk associated to the option. As we have formulated it, risk minimization appears as a variational problem. The result therefore depends on the family of 'trial' strategies $\phi$. A natural idea is thus to generalize the above procedure to the case where the strategy $\phi$ is allowed to vary both with time, and with the price of the underlying asset. In other words, one can certainly do better than holding a certain fixed quantity of the underlying asset, by adequately readjusting this quantity in the course of time.

Fig. 4.9. Dependence of the residual risk $\mathcal{R}$ as a function of the strategy $\phi$, in the simple case where this strategy does not vary with time. Note that $\mathcal{R}$ is minimum for a well-defined value of $\phi$.

4.4.3 General case: '$\Delta$' hedging

If one writes again the complete wealth balance, Eq. (4.26), as:

the calculation of $\langle \Delta W^2 \rangle$ involves, as in the simple case treated above, three types of terms: quadratic in $\phi$, linear in $\phi$, and independent of $\phi$. The last family of terms can thus be forgotten in the minimization procedure. The first two terms of $\mathcal{R}^2$ are:

21 The case of correlated Gaussian increments is considered in [Bouchaud and Sornette].

We introduce again the distribution $P(x, k|x_0, 0)$ for arbitrary intermediate times $k$. The strategy $\phi_k^N$ depends now on the value of the price $x_k$. One can express the terms appearing in Eq. (4.69) in the following form:

and

where the notation $\langle \delta x_k \rangle_{(x,k)\to(x',N)}$ means that the average of $\delta x_k$ is restricted to those trajectories which start from point $x$ at time $k$ and end at point $x'$ at time $N$. Without this restriction, the average would of course be (within the present hypothesis) zero.

The functions $\phi_k^N(x)$, for $k = 1, 2, \ldots, N$, must be chosen so that the risk is as small as possible. Technically, one must 'functionally' minimize Eq. (4.69). We shall not try to justify this procedure mathematically; one can simply imagine that the integrals over $x$ are discrete sums (which is actually true, since prices are expressed in cents and therefore are not continuous variables). The function $\phi_k^N(x)$ is thus determined by the discrete set of values of $\phi_k^N(i)$. One can then take usual derivatives of Eq. (4.69) with respect to these $\phi_k^N(i)$. The continuous limit, where the points $i$ become infinitely close to one another, allows one to define the functional derivative $\partial/\partial\phi_k^N(x)$, but it is useful to keep in mind its discrete interpretation. Using Eqs (4.71) and (4.73), one thus finds the following fundamental equation:

Setting this functional derivative to zero then provides a general and rather explicit expression for the optimal strategy $\phi_k^{N*}(x)$, where the only assumption is that the
increments $\delta x_k$ are uncorrelated (cf. Eq. (4.70)):22

This formula can be simplified in the case where the increments $\delta x_k$ are identically distributed (of variance $D\tau$) and when the interest rate is zero. As shown in Appendix D, one then has:

$$\langle \delta x_k \rangle_{(x,k)\to(x',N)} = \frac{x' - x}{N - k}.$$

The intuitive interpretation of this result is that all the $N - k$ increments $\delta x_l$ along the trajectory $(x, k) \to (x', N)$ contribute on average equally to the overall price change $x' - x$. The optimal strategy then finally reads:

We leave to the reader to show that the above expression varies monotonically from $\phi_k^{N*}(-\infty) = 0$ to $\phi_k^{N*}(+\infty) = 1$; this result holds without any further assumption on $P(x', N|x, k)$. If $P(x', N|x, k)$ is well approximated by a Gaussian, the following identity then holds true:

The comparison between Eqs (4.76) and (4.33) then leads to the so-called 'Delta' hedging of Black and Scholes:

$$\phi_k^{N*}(x = x_k) = \left.\frac{\partial C[x, x_s, N - k]}{\partial x}\right|_{x = x_k} \equiv \Delta(x_k, N - k). \qquad (4.78)$$

One can actually show that this result is true even if the interest rate is nonzero (see Appendix D), as well as if the relative returns (rather than the absolute returns) are independent Gaussian variables, as assumed in the Black-Scholes model.

Equation (4.78) has a very simple (perhaps sometimes misleading) interpretation. If between time $k$ and $k + 1$ the price of the underlying asset varies by a small amount $dx_k$, then to first order in $dx_k$, the variation of the option price is given by $\Delta[x, x_s, N - k]\,dx_k$ (using the very definition of $\Delta$). This variation is therefore exactly compensated by the gain or loss of the strategy, i.e. $\phi_k^{N*}(x = x_k)\,dx_k$. In other words, $\phi_k^{N*} = \Delta(x_k, N - k)$ appears to be a perfect hedge, which ensures that the portfolio of the writer of the option does not vary at all with time (cf. below, when we will discuss the differential approach of Black and Scholes). However, as we discuss now, the relation between the optimal hedge and $\Delta$ is not general, and does not hold for non-Gaussian statistics.

22 This is true within the variational space considered here, where $\phi$ only depends on $x$ and $t$. If some volatility correlations are present, one could in principle consider strategies that depend on the past values of the volatility.

Cumulant corrections to $\Delta$ hedging

More generally, using Fourier transforms, one can express Eq. (4.76) as a cumulant expansion. Using the definition of the cumulants of $P$, one has:

where $\log \hat{P}_N(z) = \sum_{n=2}^{\infty} (iz)^n c_{n,N}/n!$. Assuming that the increments are independent, one can use the additivity property of the cumulants discussed in Chapter 1, i.e.: $c_{n,N} \equiv N c_{n,1}$, where the $c_{n,1}$ are the cumulants at the elementary time scale $\tau$. One finds the following general formula for the optimal strategy:23

where $c_{2,1} \equiv D\tau$, $c_{4,1} \equiv \kappa_1 [c_{2,1}]^2$, etc. In the case of a Gaussian distribution, $c_{n,1} \equiv 0$ for all $n \geq 3$, and one thus recovers the previous 'Delta' hedging. If the distribution of the elementary increments $\delta x_k$ is symmetrical, $c_{3,1} = 0$. The first correction is then of order $c_{4,1}/c_{2,1} \times (1/\sqrt{D\tau N})^2 \equiv \kappa_1/N$, where we have used the fact that $C[x, x_s, N]$ typically varies on the scale $x \simeq \sqrt{D\tau N}$, which allows us to estimate the order of magnitude of its derivatives with respect to $x$. Equation (4.80) shows that in general, the optimal strategy is not simply given by the derivative of the price of the option with respect to the price of the underlying asset.

It is interesting to compute the difference between $\phi^*$ and the Black-Scholes strategy as used by the traders, $\phi^*_M$, which takes into account the implied volatility of the option $\Sigma(x_s, T = N\tau)$:24

If one chooses the implied volatility $\Sigma$ in such a way that the 'correct' price (cf. Eq. (4.58) above) is reproduced, one can show that:

$$\phi^*(x) = \phi^*_M(x, \Sigma) + \frac{\kappa_1 \tau}{12 T}\, \frac{(x_s - x_0)}{\sqrt{2\pi DT}}\, \exp\left[-\frac{(x_s - x_0)^2}{2DT}\right], \qquad (4.82)$$

where higher-order cumulants are neglected. The Black-Scholes strategy, even calculated

23 This formula can actually be generalized to the case where the volatility is time dependent, see Appendix D and Section 5.2.3.

24 Note that the market does not compute the total derivative of the option price with respect to the underlying price, which would also include the derivative of the implied volatility.
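The size of the kurtosis correction to the hedge in Eq. (4.82) can be estimated with a short calculation. The sketch below assumes the correction term has the form $(\kappa_1\tau/12T)\,(x_s - x_0)/\sqrt{2\pi DT}\,\exp[-(x_s - x_0)^2/2DT]$, as reconstructed from the garbled equation, and expresses moneyness in units of $\sqrt{DT}$. The function names and the grid scan are illustrative.

```python
import math


def hedge_correction(moneyness_in_std, kappa1, n_steps):
    """Kurtosis correction to the hedge (assumed form of the last term of Eq. (4.82)).

    moneyness_in_std : u = (xs - x0) / sqrt(D T)
    kappa1           : kurtosis of the elementary increments
    n_steps          : N = T / tau
    """
    u = moneyness_in_std
    return kappa1 / (12.0 * n_steps) * u * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def largest_correction(kappa1, n_steps, n_grid=4000, u_max=4.0):
    """Scan moneyness on a grid; return (u at the maximum, maximum correction)."""
    best_u, best_value = 0.0, 0.0
    for i in range(n_grid + 1):
        u = u_max * i / n_grid
        value = hedge_correction(u, kappa1, n_steps)
        if value > best_value:
            best_u, best_value = u, value
    return best_u, best_value


if __name__ == "__main__":
    u_star, delta_phi = largest_correction(kappa1=10.0, n_steps=20)
    print(f"largest correction {delta_phi:.4f} at (xs - x0) = {u_star:.2f} sqrt(DT)")
```

For a daily kurtosis of 10 and a 20-day maturity the correction peaks at roughly 0.01, one standard deviation away from the money, which is small compared with a hedge ratio of order one.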
For a daily kurtosis of $\kappa_1 = 10$ and for $N = 20$ days, one finds that the first correction to the optimal hedge is on the order of 0.01, see also Figure 4.10. As will be discussed below, the use of such an approximate hedging induces a rather small increase of the residual risk.

to the option (calculated as the variance of the final wealth balance) is minimized is equivalent to the minimization of an 'instantaneous' hedging error, at least when the increments are uncorrelated. The latter consists in minimizing the difference of value of a position which is short one option and $\phi$ long in the underlying stock, between two consecutive times $k\tau$ and $(k + 1)\tau$. This is closer to the concern of many option traders, since the value of their option books is calculated every day: the risk is estimated in a 'marked to market' way. If the price increments are uncorrelated, we show below that the optimal strategy is indeed identical to the 'global' one discussed above.

The change of wealth between times $k$ and $k + 1$ is, in the absence of interest rates, given by:

$$\delta W_k = C_k - C_{k+1} + \phi_k(x_k)\,\delta x_k. \qquad (4.84)$$

The global wealth balance is simply given by the sum over $k$ of all these $\delta W_k$. Let us now calculate the part of $\langle \delta W_k^2 \rangle$ which depends on $\phi_k$. Using the fact that the increments are uncorrelated, one finds:

Taking the derivative of this expression with respect to $\phi_k$ finally leads to the optimal strategy, Eq. (4.76), above.

4.4.5 Residual risk: the Black-Scholes miracle

We shall now compute the residual risk $\mathcal{R}^*$ obtained by substituting $\phi$ by $\phi^*$ in Eq. (4.69). One finds:

$$\mathcal{R}^{*2} = \mathcal{R}_0^2 - D\tau \sum_{k=0}^{N-1} \int P(x, k|x_0, 0)\,[\phi_k^{N*}(x)]^2\, dx, \qquad (4.89)$$

where $\mathcal{R}_0$ is the unhedged risk. The Black-Scholes 'miracle' is the following: in the case where $P$ is Gaussian, the two terms in the above equation exactly compensate in the limit where $\tau \to 0$, with $N\tau = T$ fixed.25

25 The 'zero-risk' property is true both within an 'additive' Gaussian model and the Black-Scholes multiplicative model, or for more general continuous-time Brownian motions. This property is more easily obtained using Ito's stochastic differential calculus, see below. The latter formalism is however only valid in the continuous-time limit ($\tau = 0$).
7'iii.r 'nrinicrrlous' corn[)c.r~rrifinrr
i due t o rr very peculiai propi,rf~of rile Gurr.c.siciir increments are not Gaussian, then R"can never reach zero. This is actually rather
d;.~~ril~~~iiorz,
,for ~'/?icfz:
intuitive, the presence of unpredictable price 'jumps' jeopardizes the diEerentia1
P G ( X ITiso,
. O ) & ( . ~ I X ? )  P G ( x ~Tlxo,
. O)Pc(xz. T/,ro,0 ) = stmtegy of BlackScholes. The miraculous compensation of the two terms in
D l T 1 FG(r,~Ix(). 0 )
~ P G (  T I T, I X , t ) ~ P G ( X T
ax
X,I)
ax
Z I,
dx dr . (1.90)
Eq. (4.89) no longer takes place. Figure 4.1 1 gives the residual risk as a function
of z for an option on the Bund contract, calculated using the optimal strategy
in~egratirrgthis equation rvith respect to xr and xz a$er muiriplicurion by the correspo,zd +*, and assuming independent increments, As expected, R" increases with z,
ing payofljitncrions, the lefthand side gives Ri.As for the righthand side, one recognizes but does rzot tend to zero when r decreases: the theoretical quality ratio saturates
the lirnit of C ~ S1 ' P ( x , k l s o , 0)[4:*(x)l2 dx when .E. = 0, where the sum becortles an around Q = 0.17. In fact, the real risk is even larger than the theoretical estimate
ii~tegml. since the model is imperfect. In particular, it neglects the volatility fluctuations,
Therefore, in the continuoustime Gaussian case, the Ahedge of Black and i.e. that of the scale factor yk. (The importance of these volatility fluctuations
Scholes allows one to eliminate completely the risk associated with an option, as for determining the correct price was discussed above in Section 4.3.4.) In other
was the case for the forward contract. However, contrarily to the forward contract words, the theoretical curve shown in Figure 4.11 neglects what is usually called
where the elimination of risk is possible whatever the statistical nature of the the 'volatility' risk. A MonteCarlo simulation of the profit and loss associated
fiuctuations, the result of Black and Scholes is only valid in a very special limiting to an atthemoney option, hedged using the optimal strategy +*determined
case. For example, as soon as the elementary time scale t is finite (which is above leads to the histogram shown in Figure 4.8. The empirical variance of
the case in reality), the residual risk R* is nonzero even if the increments are this histogram corresponds to Q r 0.28, substantially larger than the theoretical
Gaussian. The calculation is easily done in the limit where T is small compared estimate.
to T: the residual risk then comes from the difference between a continuous
The 'sroploss' strategy does nor work
integral D r f d t ' l Pc(x, t'jxg, O)+*'(X, t') dn (which is equal to Ri)and the
corresponding discrete sum appearing in Eq. (4.89). This difference is given by There is a very simple strategy that leads, at first sight, to a perfect hedge. This
the EulerMcLaurin formula and is equal to: strategy is to hold d, = 1 of the underlying as soon as the price x exceeds the strike
price x,,and to sell everything (4 = 0) as soon as the price falls below x,. For
zero interest rates, this 'stoploss' strategy would obviously lead to zero risk, since
either the option is exercised at time T, but the writer of the option has bought
where P is the probability (at t = 0) that the option is exercised at expiry (r = T ) . * ~ the underlying when its value was x, or the option is not exercised, but the writer
In the limit z .I0, one indeed recovers R* = 0, which also holds if P + does not possess any sfock. If this were true, the price of the option would actually
0 or I: since the outcome of the option then becomes certain. However, Eq. (4.91) be zero, since in the global wealth balance, the term related to the hedge perfectly
already shows that in reality, the residual risk is not small. Take for example an matches the option payoff!
i.
atthemoney option, for which P = The cornbarison between the residual risk In fact, this strategy does not work at all. A way to see this is to realize
and the price of the option allows one to define a 'quality' ratio Q for the hedging that when the strategy takes place in discrete time, the price of the underlying
strategy : is never exactly at xs,but slightly above, or slightly below. If the trading time
is r , the difference between x, and xk (where k is the time in discrete units)
,  is, for a random walk, of the order of m. The difference between the ideal
strategy, where the buy or sell order is always executed precisely at x,, and
with N = T / z . For an option of maturity I month, rehedged daily, Af . 25. the real strategy is thus of the order of N , m , where N, is the number of
Assuming Gaussian fluctuations, the quality ratio is then Q 2 0.2. In other words, times the price crosses the value x, during the lifetime T of the option. For an
the residual risk is one-fifth of the price of the option itself. Even if one rehedges every 30 min in a Gaussian world, the quality ratio is already $Q \simeq 0.05$. If the

26 The above formula is only correct in the additive limit, but can be generalized to any Gaussian model, in particular the log-normal Black-Scholes model.

Residual risk to first order in kurtosis
It is interesting to compute the first non-Gaussian correction to the residual risk, which is induced by a nonzero kurtosis $\kappa_1$ of the distribution at scale $\tau$. The expression of $\partial \mathcal{R}^*/\partial \kappa_1$ can be obtained from Eq. (4.89), where two types of term appear: those proportional to $\partial P/\partial \kappa_1$, and those proportional to $\partial \phi^*/\partial \kappa_1$. The latter terms are zero since by definition the derivative of $\mathcal{R}^*$ with respect to $\phi^*$ is nil. We thus find:
which allows one to estimate the extra risk numerically, using Eq. (4.93). The order of magnitude of all the above terms is given by $\partial \mathcal{R}^{*2}/\partial \kappa_1 \simeq D\tau/4!$. The relative correction to the Gaussian result, Eq. (4.91), is then simply given, very roughly, by the kurtosis $\kappa_1$. Therefore, tail effects, as measured by the kurtosis, constitute the major source of the residual risk when the trading time $\tau$ is small.

If the observed kurtosis is entirely due to this stochastic volatility effect, one has $\delta \mathcal{R}^*/\mathcal{R}^* \propto \kappa_1$. One thus recovers the result of the previous section. Again, this volatility risk can represent an appreciable fraction of the real risk, especially for frequently hedged options. Figure 4.11 actually suggests that the fraction of the risk due to the fluctuating volatility is comparable to that induced by the intrinsic kurtosis of the distribution.

at-the-money option, this is equal to the number of times a random walk returns to its starting point in a time $T$. The result is well known, and is $N_\times \propto \sqrt{T/\tau}$ for $T \gg \tau$. Therefore, the uncertainty in the final result due to the accumulation of the above small errors is found to be of order $\sqrt{DT}$, independently of $\tau$, and therefore does not vanish in the continuous-time limit $\tau \to 0$. This uncertainty is of the order of the option price itself, even for the case of a Gaussian process. Hence, the 'stop-loss' strategy is not the optimal strategy, and was therefore not found as a solution of the functional minimization of the risk presented above.
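The square-root growth of the number of returns to the starting point, quoted in the discussion of the 'stop-loss' strategy, can be checked with a small simulation. This is only a sketch under a simplifying assumption that is not in the text: the price is replaced by a symmetric +/-1 random walk, and the number of steps plays the role of $T/\tau$. The function name and the sample sizes are illustrative choices.

```python
import random

def mean_returns_to_origin(n_steps, n_walks, rng):
    """Average number of returns to the starting point over n_walks
    symmetric +/-1 random walks of n_steps steps each."""
    total = 0
    for _ in range(n_walks):
        position = 0
        for _ in range(n_steps):
            position += 1 if rng.random() < 0.5 else -1
            if position == 0:
                total += 1
    return total / n_walks

rng = random.Random(0)  # fixed seed, for reproducibility only
short = mean_returns_to_origin(256, 1500, rng)
long_ = mean_returns_to_origin(1024, 1500, rng)

# If the count grows like the square root of the number of steps,
# multiplying the number of steps by 4 should roughly double it.
ratio = long_ / short
print(short, long_, ratio)
```

With four times as many steps the average count should come out close to twice as large, which is the $\sqrt{T/\tau}$ behaviour used in the argument.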
4.4.6 Other measures of risk - hedging and VaR (*)
It is conceptually illuminating to consider the model where the price increments $\delta x_i$ are Lévy variables of index $\mu < 2$, for which the variance is infinite. Such a model is also useful to describe extremely volatile markets, such as emergent countries (like Russia!), or very speculative assets (cf. Chapter 2). In this case, the variance of the global wealth balance $\Delta W$ is meaningless and cannot be used as a reliable measure of the risk associated to the option. Indeed, one expects that the distribution of $\Delta W$ behaves, for large negative $\Delta W$, as a power-law of exponent $\mu$.
In order to simplify the discussion, let us come back to the case where the hedging strategy $\phi$ is independent of time, and suppose interest rate effects negligible. The wealth balance, Eq. (4.28), then clearly shows that the catastrophic losses occur in two complementary cases:
• Either the price of the underlying soars rapidly, far beyond both the strike price $x_s$ and $x_0$. The option is exercised, and the loss incurred by the writer is then:
The hedging strategy $\phi$, in this case, limits the loss, and the writer of the option would have been better off holding $\phi = 1$ stock per written option.
• Or, on the contrary, the price plummets far below $x_0$. The option is then not exercised, but the strategy leads to important losses since:
In this case, the writer of the option should not have held any stock at all ($\phi = 0$).
However, both dangers are a priori possible. Which strategy should one follow? Thanks to the above argument, it is easy to obtain the tail of the distribution of $\Delta W$ when $\Delta W \to -\infty$ (large losses). Since we have assumed that the distribution of $x_N - x_0$ decreases as a power-law for large arguments,
$P(x_N, N|x_0, 0) \simeq \frac{\mu A_\pm^\mu}{|x_N - x_0|^{1+\mu}}$ for $x_N - x_0 \to \pm\infty$, (4.98)
it is easy to show, using the results of Appendix C, that:
$P(\Delta W) \simeq \frac{\mu \Delta W_0^\mu}{|\Delta W|^{1+\mu}}$ for $\Delta W \to -\infty$, with $\Delta W_0^\mu \equiv A_+^\mu (1 - \phi)^\mu + A_-^\mu \phi^\mu$. (4.99)
The probability that the loss $|\Delta W|$ is larger than a certain value is then proportional to $\Delta W_0^\mu$ (cf. Chapter 3). The minimization of this probability with respect to $\phi$ then leads to an optimal 'value-at-risk' strategy:
$\phi^* = \frac{A_+^\zeta}{A_+^\zeta + A_-^\zeta}$, $\zeta \equiv \mu/(\mu - 1)$, (4.100)
for $1 < \mu \le 2$.27 For $\mu < 1$, $\phi^*$ is equal to 0 if $A_- > A_+$ or to 1 in the opposite case. Several points are worth emphasizing:
• The hedge ratio $\phi^*$ is independent of moneyness ($x_s - x_0$). Because we are interested in minimizing extreme risks, only the far tail of the wealth distribution matters. We have implicitly assumed that we are interested in moves of the stock price far greater than $|x_s - x_0|$, i.e. that moneyness only influences the centre of the distribution of $\Delta W$.
• It can be shown that within this value-at-risk perspective, the strategy $\phi^*$ is actually time independent, and also corresponds to the optimal instantaneous hedge, where the VaR between times $k$ and $k + 1$ is minimum.
• Even if the tail amplitude $\Delta W_0$ is minimum, the variance of the final wealth is still infinite for $\mu < 2$. However, $\Delta W_0^*$ sets the order of magnitude of probable losses, for example with 95% confidence. As in the example of the optimal portfolio discussed in Chapter 3, infinite variance does not necessarily mean that the risk cannot be diversified. The option price, fixed for $\mu > 1$ by Eq. (4.33), ought to be corrected by a risk premium proportional to $\Delta W_0^*$. Note also that with such violent fluctuations, the smile becomes a spike!
• Finally, it is instructive to note that the histogram of $\Delta W$ is completely asymmetrical, since extreme events only contribute to the loss side of the distribution. As for gains, they are limited, since the distribution decreases very fast for positive $\Delta W$.28 In this case, the analogy between an option and an insurance contract is most clear, and shows that buying or selling an option are not at all equivalent operations, as they appear to be in a Black-Scholes world. Note that the asymmetry of the histogram of $\Delta W$ is visible even in the case of weakly non-Gaussian assets (Fig. 4.8).

27 Note that the above strategy is still valid for $\mu > 2$, and corresponds to the optimal VaR hedge for power-law tailed assets, see below.
28 For at-the-money options, one can actually show that $\Delta W \le \mathcal{C}$.

As we have just discussed, the losses of the writer of an option can be very large in the case of a Lévy process. Even in the case of a 'truncated' Lévy process (in the sense defined in Chapters 1 and 2), the distribution of the wealth balance $\Delta W$ remains skewed towards the loss side. It can therefore be justified to consider other measures of risk, not based on the variance but rather on higher moments of the distribution, such as $\mathcal{R}_4 = \langle \Delta W^4 \rangle^{1/4}$, which are more sensitive to large losses. The minimization of $\mathcal{R}_4$ with respect to $\phi(x, t)$ can still be performed, but leads
to a more complex equation for $\phi^*$, which has to be solved numerically. One finds that this optimal strategy varies more slowly with the underlying price $x$ than that based on the minimization of the variance, which is interesting from the point of view of transaction costs.29

29 This has been shown in the PhD work of Farhat Selmi (2000), unpublished.

Another possibility is to measure the risk through the value-at-risk (or loss probability), as is natural to do in the case of the Lévy processes discussed above. If the underlying asset has power-law fluctuations with an exponent $\mu > 2$, the above computation remains valid as long as one is concerned with the extreme tail of the loss distribution (cf. Chapter 3). The optimal VaR strategy, minimizing the probability of extreme losses, is determined by Eq. (4.100). This strategy is furthermore time independent, and therefore is very interesting from the point of view of transaction costs.

4.4.7 Hedging errors
The formulation of the hedging problem as a variational problem has a rather interesting consequence, which is a certain amount of stability against hedging errors. Suppose indeed that instead of following the optimal strategy $\phi^*(x, t)$, one uses a suboptimal strategy close to $\phi^*$, such as the Black-Scholes hedging strategy with the value of the implied volatility, discussed in Section 4.4.3. Denoting the difference between the actual strategy and the optimal one by $\delta\phi(x, t)$, one can show that for small $\delta\phi$, the increase in residual risk is given by:
$\delta\mathcal{R}^2 = D\tau \sum_{k=0}^{N-1} \int [\delta\phi(x, t_k)]^2 P(x, k|x_0, 0)\, dx$, (4.101)
which is quadratic in the hedging error $\delta\phi$, and thus, in general, rather small. For example, we have estimated in Section 4.4.3 that within a first-order cumulant expansion, $\delta\phi$ is at most equal to $0.02\,\kappa_N$, where $\kappa_N$ is the kurtosis corresponding to the terminal distribution. (See also Fig. 4.10.) Therefore, one has:
For at-the-money options, this corresponds to a relative increase of the residual risk given by:
For a quality ratio $Q = 0.225$ and $\kappa_N = 1$, this represents a rather small relative increase equal to 2% at most. In reality, as numerical simulations show, the increase in risk induced by the use of the Black-Scholes $\Delta$-hedge rather than the optimal hedge is indeed only of a few per cent for 1 month maturity options. This difference, however, increases as the maturity of the options decreases.

4.4.8 Summary
In this part of the chapter, we have thus shown that one can find an optimal hedging strategy, in the sense that the risk (measured by the variance of the change of wealth) is minimum. This strategy can be obtained explicitly, with very few assumptions, and is given by Eqs (4.76), or (4.80). However, for a nonlinear payoff, the residual risk is in general nonzero, and actually represents an appreciable fraction of the price of the option itself. The exception is the Black-Scholes model where the risk, rather miraculously, disappears. The theory presented here generalizes that of Black and Scholes, and is formulated as a variational theory. Interestingly, this means that small errors in the hedging strategy increase the risk only in second order.

4.5 Does the price of an option depend on the mean return?
4.5.1 The case of nonzero excess return
We should now come back to the case where the excess return $m_1 \equiv \langle \delta x_k \rangle = m\tau$ is nonzero. This case is very important conceptually: indeed, one of the most striking results of Black and Scholes (besides the zero risk property) is that the price of the option and the hedging strategy are totally independent of the value of $m$. This may sound at first rather strange, since one could think that if $m$ is very large and positive, the price of the underlying asset on average increases fast, thereby increasing the average payoff of the option. On the contrary, if $m$ is large and negative, the option should be worthless.
This argument actually does not take into account the impact of the hedging strategy on the global wealth balance, which is proportional to $m$. In other words, the term $\max(x(N) - x_s, 0)$, averaged with the historical distribution $P_m(x, N|x_0, 0)$, such that:
is indeed strongly dependent on the value of $m$. However, this dependence is partly compensated when one includes the trading strategy, and even vanishes in the Black-Scholes model.
Let us first present a perturbative calculation, assuming that $m$ is small, or more precisely that $(mT)^2/DT \ll 1$. Typically, for $T = 100$ days, $mT = 5\% \times 100/365 = 0.014$ and $\sqrt{DT} \simeq 1\% \times \sqrt{100} \simeq 0.1$. The term of order $m^2$ that we neglect corresponds to a relative error of $(0.14)^2 \simeq 0.02$.
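The numbers quoted for the perturbative parameter can be reproduced in a few lines. The sketch below only restates the arithmetic of the text (5% annual return, 1% daily volatility, 100 days); the variable names are illustrative, and both quantities are taken relative to the initial price.

```python
import math

T_days = 100            # maturity quoted in the text
annual_return = 0.05    # 5% per year, as in the text
daily_vol = 0.01        # 1% per day, as in the text

mT = annual_return * T_days / 365        # relative drift accumulated over T
sqrt_DT = daily_vol * math.sqrt(T_days)  # relative volatility over T

ratio = mT / sqrt_DT
neglected = ratio ** 2                   # (mT)^2 / DT, the neglected order

print(round(mT, 3), round(sqrt_DT, 3), round(neglected, 3))
```

The neglected term comes out slightly below 0.02, in line with the estimate above.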
The average gain (or loss) induced by the hedge is equal to:30
To order $m$, one can consistently use the general result Eq. (4.80) for the optimal hedge $\phi^*$, established above for $m = 0$, and also replace $P_m$ by $P_{m=0}$:
where $P_0$ is the unbiased distribution ($m = 0$).
Now, using the Chapman-Kolmogorov equation for conditional probabilities:
one easily derives, after an integration by parts, and using the fact that $P_0(x', N|x_0, 0)$ only depends on $x' - x_0$, the following expression:
$= \int_{x_s - m_1 N}^{\infty} (x' + m_1 N - x_s)\, P_0(x', N|x_0, 0)\, dx'$,
where in the second integral, the shift $x_k \to x_k + m_1 k$ allowed us to remove the bias $m$, and therefore to substitute $P_m$ by $P_0$. To first order in $m$, one then finds:
$\langle \max(x(N) - x_s, 0) \rangle_m \simeq \langle \max(x(N) - x_s, 0) \rangle_0 + m_1 N \int_{x_s}^{\infty} P_0(x', N|x_0, 0)\, dx'$. (4.110)

30 In the following, we shall again stick to an additive model and discard interest rate effects, in order to focus on the main concepts.

Hence, grouping together Eqs (4.108) and (4.110), one finally obtains the price of the option in the presence of a nonzero average return as:
Quite remarkably, the correction terms are zero in the Gaussian case, since all the cumulants $c_{n,1}$ are zero for $n \ge 3$. In fact, in the Gaussian case, this property holds to all orders in $m$ (cf. below). However, for non-Gaussian fluctuations, one finds that a nonzero return should in principle affect the price of the options. Using again Eq. (4.79), one can rewrite Eq. (4.111) in a simpler form as:
$\mathcal{C}_m = \mathcal{C}_0 + mT\,[\mathcal{P} - \phi^*]$, (4.112)
where $\mathcal{P}$ is the probability that the option is exercised, and $\phi^*$ the optimal strategy, both calculated at $t = 0$. From Fig. 4.10 one sees that in the presence of 'fat tails', a positive average return makes out-of-the-money options less expensive ($\mathcal{P} < \phi^*$), whereas in-the-money options should be more expensive ($\mathcal{P} > \phi^*$). Again, the Gaussian model (for which $\mathcal{P} = \phi^*$) is misleading:31 the independence of the option price with respect to the market 'trend' only holds for Gaussian processes, and is no longer valid in the presence of 'jumps'. Note however that the correction is usually numerically quite small: for $x_0 = 100$, $m = 10\%$ per year, $T = 100$ days, and $|\mathcal{P} - \phi^*| \sim 0.02$, one finds that the price change is of the order of 0.05 points, while $\mathcal{C} \simeq 4$ points.

31 The fact that the optimal strategy is equal to the probability $\mathcal{P}$ of exercising the option also holds in the Black-Scholes model, up to small $\sigma^2$ correction terms.

which satisfies the following equations:
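The size of the correction in Eq. (4.112) for the numerical example given in the text ($x_0 = 100$, $m = 10\%$ per year, $T = 100$ days, $|\mathcal{P} - \phi^*| \sim 0.02$) can be verified directly. The sketch assumes, as in the additive model used here, that $m$ carries price units, so that $mT$ is 10% of $x_0$ prorated over 100 days; the variable names are illustrative.

```python
x0 = 100.0              # current price, in points
annual_return = 0.10    # m = 10% per year (relative to x0)
T_days = 100
gap = 0.02              # |P - phi*|, the value quoted in the text
option_price = 4.0      # C ~ 4 points, as quoted in the text

# Additive model: the drift over T, expressed in price points
mT = x0 * annual_return * T_days / 365

price_change = mT * gap                  # magnitude of mT [P - phi*]
relative_change = price_change / option_price

print(round(price_change, 3), round(relative_change, 3))
```

The price change is about 0.05 points, i.e. a correction of roughly one per cent of the option price.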
Note that the second equation means that the evolution of $x$ under the 'probability' $Q$ is unbiased, i.e. $\langle x \rangle|_Q = x_0$. This is the definition of a martingale process (see, e.g. [Baxter]). Equation (4.113), with the very same $Q$, actually holds for an arbitrary payoff function $Y(x_N)$, replacing $\max(x_N - x_s, 0)$ in the above equation. Using Eq. (4.79), Eq. (4.114) can also be written in a more compact way as:
In the limit $m_1 = 0$, the right-hand side vanishes and one recovers the optimal strategy determined above. For small $m$, Eq. (4.118) is a convenient starting point for a perturbative expansion in $m_1$.
Using Eq. (4.118), one can establish a simple relation for $\overline{\phi^*}$, which fixes the correction to the 'Bachelier price' coming from the hedge: $\mathcal{C} = \langle \max(x_N - x_s, 0) \rangle - m_1 \overline{\phi^*}$.
Note however that Eq. (4.113) is rather formal, since nothing ensures that $Q(x, N|x_0, 0)$ is
or else:
=
u a
J ~ ~ atD T
(14  u g  mt12
2DT I du d t ,
32
32 If $\phi^* = \Delta$, the result $Q = P_0$ is in fact correct (as we show in Appendix F) beyond the first order in $m$. However, the optimal strategy is not, in general, given by the option $\Delta$, except in the Gaussian case.
The price of the option in the presence of a nonzero return $m \ne 0$ is thus given, in the Gaussian case, by:
(cf. Eq. (4.43)). Hence, as announced above, the option price is indeed independent of $m$ in the Gaussian case. This is actually a consequence of the fact that the trading strategy is fixed by $\phi^* = \partial \mathcal{C}_m/\partial x$, which is indeed correct (in the Gaussian case) even when $m \ne 0$. These results, rather painfully obtained here, are immediate within the framework of stochastic differential calculus, as originally used by Black and Scholes. It is thus interesting to pause for a moment and describe how option pricing theory is usually introduced.

Ito calculus33
The idea behind Ito's stochastic calculus is the following. Suppose that one has to consider a certain function $f(x, t)$, where $x$ is a time-dependent variable. If $x$ was an 'ordinary' variable, the variation $\Delta f$ of the function $f$ between time $t$ and $t + \tau$ would be given, for small $\tau$, by:
with $\Delta x \equiv (dx/dt)\tau$. The order $\tau^2$ in the above expansion looks negligible in the limit $\tau \to 0$. However, if $x$ is a stochastic variable with independent increments, the order of magnitude of $x(t) - x(0)$ is fixed by the CLT and is thus given by $\sigma_1 \sqrt{t/\tau}\,\xi$, where $\xi$ is a Gaussian random variable of width equal to 1, and $\sigma_1$ is the RMS of $\Delta x$.
If the limit $\tau \to 0$ is to be well defined and non-trivial (i.e. such that the random variable $\xi$ still plays a role), one should thus require that $\sigma_1 \propto \sqrt{\tau}$. Since $\sigma_1$ is the RMS of $(dx/dt)\tau$, this means that the order of magnitude of $dx/dt$ is proportional to $1/\sqrt{\tau}$. Hence the order of magnitude of $\Delta x^2 = (dx/dt)^2\tau^2$ is not $\tau^2$ but $\tau$: one should therefore keep this term in the expansion of $\Delta f$ to order $\tau$.
The crucial point of Ito's differential calculus is that if the stochastic process is a continuous-time Gaussian process, then for any small but finite time scale $\tau$, $\Delta x$ is already the result of an infinite sum of elementary increments. Therefore, one can rewrite Eq. (4.125), choosing as a new elementary time step $\tau' \ll \tau$, and sum all these $\tau/\tau'$ equations to obtain $\Delta f$ on the scale $\tau$. Using the fact that for small $\tau$, $\partial f/\partial x$ and $\partial^2 f/\partial x^2$ do not vary much, one finds:
Using again the CLT between scales $\tau' \ll \tau$ and $\tau$, one finds that $\Delta x$ is a Gaussian variable of RMS $\propto \sqrt{\tau}$. On the other hand, since $\Delta x^2$ is the sum of positive variables, it is equal to its mean $\sigma_1'^2\,\tau/\tau'$ plus terms of order $\sqrt{\tau/\tau'}$, where $\sigma_1'^2$ is the variance of $\Delta x'$. For consistency, $\sigma_1'^2$ is of order $\tau'$; we will thus set $\sigma_1'^2 = D\tau'$.
Hence, in the limit $\tau' \to 0$, with $\tau$ fixed, $\Delta x^2$ in Eq. (4.125) becomes a non-random variable equal to $D\tau$, up to corrections of order $\sqrt{\tau'/\tau}$. Now, taking the limit $\tau \to 0$, one finally finds:
where $\lim_{\tau \to 0} \Delta x/\tau \equiv dx/dt$. Equation (4.127) means that in the double limit $\tau \to 0$, $\tau'/\tau \to 0$:
• The second-order derivative term does not fluctuate, and thus does not depend on the specific realization of the random process. This is obviously at the heart of the possibility of finding a riskless hedge in this case.34
• Higher-order derivatives are negligible in the limit $\tau = 0$.
• Equation (4.127) remains valid in the presence of a nonzero bias $m$.

33 The following section is obviously not intended to be rigorous.
34 It is precisely for the same reason that the risk is also zero for a binomial process, where $\delta x_k$ can only take two values, see Appendix E.

Let us now apply the formula, Eq. (4.127), to the difference $df/dt$ between $d\mathcal{C}/dt$ and $\phi\, dx/dt$. This represents the difference in the variation of the value of the portfolio of the buyer of an option, which is worth $\mathcal{C}$, and that of the writer of the option who holds $\phi(x, t)$ underlying assets. One finds:
$\frac{df}{dt} = \frac{\partial \mathcal{C}(x, x_s, T - t)}{\partial t} + \left[\frac{\partial \mathcal{C}(x, x_s, T - t)}{\partial x} - \phi\right] \frac{dx}{dt} + \frac{D}{2} \frac{\partial^2 \mathcal{C}(x, x_s, T - t)}{\partial x^2}$.
One thus immediately sees that if $\phi = \phi^* \equiv \partial \mathcal{C}(x, x_s, T - t)/\partial x$, the coefficient of the only random term in the above expression, namely $dx/dt$, vanishes. The evolution of the difference of value between the two portfolios is then known with certainty! In this case, no party would agree on the contract unless this difference remains fixed in time (we assume here, as above, that the interest rate is zero). In other words, $df/dt \equiv 0$, leading to a partial differential equation for the price $\mathcal{C}$:
with a 'final' boundary condition: $\mathcal{C}(x, x_s, 0) \equiv \max(x - x_s, 0)$, i.e. the value of the option at expiry, see Eq. (4.48). The solution of this equation is the above result, Eq. (4.43), obtained for $r = 0$, and the Black-Scholes strategy is obtained by taking the derivative of the price with respect to $x$, since this is the condition under which $dx/dt$ completely disappears from the game. Note also that the fact that the
average return $m$ is zero or nonzero does not appear in the above calculation. The price and the hedging strategy are therefore completely independent of the average return in this framework, a result obtained after rather cumbersome considerations above.

4.5.3 Conclusion. Is the price of an option unique?
Summarizing the above section, the Gaussian, continuous-time limit allows one to use very simple differential calculus rules, which only differ from the standard one through the appearance of a second-order non-fluctuating term, the so-called 'Ito correction'. The use of this calculus rule immediately leads to the two main results of Black and Scholes, namely: the existence of a riskless hedging strategy, and the fact that the value of the average trend disappears from the final expressions. These two results are however not valid as soon as the hypotheses underlying Ito's stochastic calculus are violated (continuous-time, Gaussian statistics). The approach based on the global wealth balance, presented in the above sections, is by far less elegant but more general. It allows one to understand the very peculiar nature of the limit considered by Black and Scholes.
As we have already discussed, the existence of a nonzero residual risk (and more precisely of a negatively skewed distribution of the optimized wealth balance) necessarily means that the bid and ask prices of an option will be different, because the market makers will try to compensate for part of this risk. On the other hand, if the average return $m$ is not zero, the fair price of the option explicitly depends on the optimal strategy $\phi^*$, and thus on the chosen measure of risk (as was the case for portfolios, the optimal strategy corresponding to a minimal variance of the final result is different from the one corresponding to a minimum value-at-risk). The price therefore depends on the operator, on his definition of risk and on his ability to hedge this risk. In the Black-Scholes model, the price is uniquely determined since all definitions of risk are equivalent (and are all zero!). This property is often presented as a major advantage of the Gaussian model. Nevertheless, it is clear that it is precisely the existence of an ambiguity on the price that justifies the very existence of option markets!35 A market can only exist if some uncertainty remains. In this respect, it is interesting to note that new markets continually open, where more and more sources of uncertainty become tradable. Option markets correspond to a risk transfer: buying or selling a call are not identical operations (recall the skew in the final wealth distribution), except in the Black-Scholes world where options would actually be useless, since they would be equivalent to holding a certain number of the underlying asset (given by the $\Delta$).

35 This ambiguity is related to the residual risk, which, as discussed above, comes both from the presence of price 'jumps', and from the very uncertainty on the parameters describing the distribution of price changes ('volatility risk').

4.6 Conclusion of the chapter: the pitfalls of zero-risk
The traditional approach to derivative pricing is to find an ideal hedging strategy, which perfectly duplicates the derivative contract. Its price is then, using an arbitrage argument, equal to that of the hedging strategy, and the residual risk is zero. This argument appears as such in nearly all the available books on derivatives and on the Black-Scholes model. For example, the last chapter of [Hull], called 'Review of Key Concepts', starts by the following sentence: 'The pricing of derivatives involves the construction of riskless hedges from traded securities.' Although there is a rather wide consensus on this point of view, we feel that it is unsatisfactory to base a whole theory on exceptional situations: as explained above, both the continuous-time Gaussian model and the binomial model are very special models indeed. We think that it is more appropriate to start from the ingredient which allows the derivative markets to exist in the first place, namely risk. In this respect, it is interesting to compare the above quote from Hull to the following disclaimer, found on most Chicago Board Options Exchange documents: 'Option trading involves risk!'
The idea that zero risk is the exception rather than the rule is important for a better pedagogy of financial risks in general; an adequate estimate of the residual risk, inherent to the trading of derivatives, has actually become one of the major concerns of risk management (see also Sections 5.2, 5.3). The idea that the risk is zero is inadequate because zero cannot be a good approximation of anything. It furthermore provides a feeling of apparent security which can prove disastrous on some occasions. For example, the Black-Scholes strategy allows one, in principle, to hold an insurance against the fall of one's portfolio without buying a true Put option, but rather by following the associated hedging strategy. This is called an 'insurance portfolio', and was much used in the 1980s, when faith in the Black-Scholes model was at its highest. The idea is simply to sell a certain fraction of the portfolio when the market goes down. This fraction is fixed by the Black-Scholes $\Delta$ of the virtual option, where the strike price is the security level below which the investor does not want to plummet. During the 1987 crash, this strategy was particularly inefficient: not only because crash situations are the most extremely non-Gaussian events that one can imagine (and thus the zero-risk idea is totally absurd), but also because this strategy feeds back onto the market to make it crash further (a drop of the market mechanically triggers further sell orders). According to the Brady commission, this mechanism has indeed significantly contributed to enhance the amplitude of the crash (see the discussion in [Hull]).
4.7 Appendix D: computation of the conditional mean
On many occasions in this chapter, we have needed to compute the mean value of the instantaneous increment $\delta x_k$, restricted on trajectories starting from $x_k$ and ending at $x_N$. We assume that the $\delta x_k$'s are identically distributed, up to a scale factor $\gamma_k$. In other words:
The quantity we wish to compute is then:
where the $\delta$ function insures that the sum of increments is indeed equal to $x_N - x_k$. Using the Fourier representation of the $\delta$ function, the right-hand side of this equation reads:
• In the case where all the $\gamma_k$'s are equal to $\gamma_0$, one recognizes:
which allows one to generalize the optimal strategy in the case where the volatility is time dependent. In the Gaussian case, all the $c_n$ for $n \ge 3$ are zero, and the last expression boils down to:
This expression can be used to show that the optimal strategy is indeed the Black-Scholes $\Delta$, for arbitrary Gaussian increments in the presence of a nonzero interest rate. Suppose that
where the $\delta x_k$ are identically distributed. One can then write $x_N$ as:
The above formula, Eq. (4.137), then reads:
with $\gamma_k = (1 + \rho)^{N-k-1}$.
with $Y$ representing the payoff of the option. This equation can be solved by making the following ansatz for $\mathcal{C}$:
where $\Omega$ is an unknown kernel which we try to determine. Using the fact that the option price only depends on the price difference between the strike price and the present price of the underlying, this gives:
Now, using the Chapman-Kolmogorov equation:
one obtains the following equation for $\Omega$ (after changing $k \to N - k$):

one sees that in the two possible cases ($\delta x_k = \delta x_{1,2}$), the value of the hedging portfolio is strictly equal to the option value. The option value at time $t_k$ is thus equal to $\mathcal{C}^k(x_k) = \phi_k x_k + B_k$, independently of the probability $p$ [resp. $1 - p$] that $\delta x_k = \delta x_1$ [resp. $\delta x_2$]. One then determines the option value by iterating this procedure from time $k + 1 = N$, where $\mathcal{C}^N$ is known, and equal to $\max(x_N - x_s, 0)$. It is however easy to see that as soon as $\delta x_k$ can take three or more values, it is impossible to find a perfect strategy.36
The Ito process can be obtained as the continuum limit of the binomial tree. But even in its discrete form, the binomial model shares some important properties with the Black-Scholes model. The independence of the premium on the transition probabilities is the analogue of the independence of the premium on the excess return $m$ in the Black-Scholes model. The magic of zero risk in the binomial model can therefore be understood as follows. Consider the quantity $s^2 = (\delta x_k - \langle \delta x_k \rangle)^2$; $s^2$ is in principle random, but since changing the probabilities does not modify the option price one can pick $p = \frac{1}{2}$, making $s^2$ non-fluctuating ($s^2 = (\delta x_1 - \delta x_2)^2/4$). The Ito process shares this property: in the continuum limit quadratic quantities do not fluctuate. For example, the quantity
$S^2 = \lim_{\tau \to 0} \sum_{k=0}^{T/\tau - 1} \left(x((k + 1)\tau) - x(k\tau) - m\tau\right)^2$
is equal to $DT$ with probability one when $x(t)$ follows an Ito process. In a sense, continuous-time Brownian motion represents a very weak form of randomness since quantities such as $S^2$ can be known with certainty. But it is precisely this property that allows for zero risk in the Black-Scholes world.
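Both statements of this appendix can be illustrated numerically. The sketch below is not taken from the text: it assumes an additive two-valued tree with zero interest rate, and a discretized Gaussian walk with drift; the function names and parameter values are illustrative. The first function prices a call by replication, and the probability $p$ never enters; the second computes the sum of squared, drift-corrected increments, which should approach $DT$ as the time step shrinks.

```python
import math
import random

def binomial_call(x0, strike, up, down, n_steps):
    """Call price by replication on an additive two-valued tree
    (zero interest rate). At each node, phi shares plus an amount B
    of cash reproduce the option value in both branches."""
    # Option values at expiry; j counts the number of 'up' moves.
    values = [max(x0 + j * up + (n_steps - j) * down - strike, 0.0)
              for j in range(n_steps + 1)]
    for k in range(n_steps - 1, -1, -1):
        new_values = []
        for j in range(k + 1):
            x = x0 + j * up + (k - j) * down
            c_up, c_down = values[j + 1], values[j]
            phi = (c_up - c_down) / (up - down)   # hedge ratio
            cash = c_up - phi * (x + up)          # B_k
            new_values.append(phi * x + cash)     # C^k = phi_k x_k + B_k
        values = new_values
    return values[0]

def quadratic_variation(D, m, T, n_steps, rng):
    """Sum of squared, drift-corrected increments of a Gaussian walk
    with variance D*tau and mean m*tau per step."""
    tau = T / n_steps
    sigma = math.sqrt(D * tau)
    total = 0.0
    for _ in range(n_steps):
        dx = m * tau + sigma * rng.gauss(0.0, 1.0)
        total += (dx - m * tau) ** 2
    return total

price = binomial_call(100.0, 100.0, 1.0, -1.0, 2)
print("two-step binomial call price:", price)

rng = random.Random(1)  # fixed seed, for reproducibility only
for n in (10, 1000, 100000):
    print(n, quadratic_variation(D=1.0, m=0.5, T=1.0, n_steps=n, rng=rng))
```

The scatter of the second quantity around $DT = 1$ shrinks as the number of steps grows, which is the sense in which quadratic quantities do not fluctuate in the continuum limit.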
The solution to this equation is $\Omega(x' - x_s, k) = Y(x' - x_s - m_1 k)$. Indeed, if this is the case, one has:
where the last equality holds in the small $\tau$, large $N$ limit, when the sum over $k$ can be approximated by an integral.37 Therefore, the price of the option is given by:

36 For a recent analysis along the lines of the present book, see E. Aurell, S. Simdyankin, Pricing risky options simply, International Journal of Theoretical and Applied Finance, 1, 1 (1998). See also M. Schweizer, Risky options simplified, International Journal of Theoretical and Applied Finance, 2, 59 (1999).
37 The resulting error is of the order of $m^2\tau/D$.
Optimal filters
N. Wiener, Extrapolation, Interpolation and Smoothing of Time Series, Wiley, New York, 1949.
A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd edn, McGraw-Hill, New York, 1991.
Options and futures
J. C. Hull, Futures, Options and Other Derivatives, Prentice Hall, Upper Saddle River, NJ, 1997.
P. Wilmott, J. N. Dewynne, S. D. Howison, Option Pricing: Mathematical Models and Computation, Oxford Financial Press, Oxford, 1993.
Writing xi, = + CtiA 6.v,, we find that ( A W,s)is the surn of';, term proportional M~dtiplicrrtii'rrrioiicl
to .YO and the rerr~ninder.The tint contribution therefore reads: Lec us finally assurne that xki1 xk = ( ~  S + I I ~) X I , where the r ~ ~are
' s indepcrideitt
random variables of zero mean. Therefore, the average cost of the strategy is zero.
Furthermore, if the elementary time step s is small. the terminal value . v ~can be
written as:
This term thus has exactly the same shape as in the case of a nonzero average
return treated above, Eq. (4.1051, with an effective mean return
6)xo. If we choose for ip* the suboptimal Ahedge (which we know does only
= (p 
induce minor errors), then, as shown above, this hedging cost can be reabsorbed in
log (z) =
NI
k=O
log(1 + p  6 + i l k ) 2 N ( p  6)+ ).:
thus X N = .xOe('d)T~Y.Introducing the distribution P ( y ) , the price of the option
Y =
N1
k=O
i l k . (5.12)
a change of the terminal distribution, where XN  xo becomes x~  xo  N ELeffhr, or can be written as:
else, to first order: X N  xo exp[(p  6)Nl. To order (p  6 ) N , this therefore again
corresponds to replacing the cursent price of the underlying by its forward price. C(xO,x,, T ; r, dj = e?" P (y) m a ~ ( ~ ~ e(xS,
~ 0) ~ , ' ~ ' ~(5.13)
~ dy
For the second type of term, we write, for j c k:
which is obviously of the general form, Q. (5.1).
We thus conclude that the simple rule, Eq. (5.1), above is, for many purposes,
sufficient to account for interest rates and (continuous) dividends effects. A more
accurate formula, including corrections of order $(r - d)T$, depends somewhat on the chosen model.

This type of term is not easy to calculate in the general case. Assuming that the process is Gaussian allows us to use the identity:

$$\frac{x_k - x_j}{k - j}\,P(x_k, k|x_j, j) = -D\tau\,\frac{\partial P(x_k, k|x_j, j)}{\partial x_k};$$

one can therefore integrate over $x_j$ to find a result independent of $j$:

or, using the expression of $\phi_k^*(x_k)$ in the Gaussian case:

The contribution to the cost of the strategy is then obtained by summing over $k$ and over $j < k$, finally leading to:

This extra cost is maximum for at-the-money options, and leads to a relative increase of the call price given, for Gaussian statistics, by:

$$\frac{\delta C}{C} = \frac{N}{2}(\rho - \delta) = \frac{T}{2}(r - d),$$

which is thus quite small for short maturities. (For $r = 5\%$ annual, $d = 0$, and $T = 100$ days, one finds $\delta C/C \leq 1\%$.)

5.1.2 Interest rates corrections to the hedging strategy

It is also important to estimate the interest rate corrections to the optimal hedging strategy. Here again, one could consider the above models, that is, the systematic drift model, the independent model or the multiplicative model. In the former case, the result for a general terminal price distribution is rather involved, mainly due to the appearance of the conditional average detailed in Appendix D, Eq. (4.140). In the case where this distribution is Gaussian, the result is simply Black-Scholes' $\Delta$-hedge, i.e. the optimal strategy is the derivative of the option price with respect to the price of the underlying contract (see Eq. (4.78)). As discussed above, this simple rule leads in general to a suboptimal strategy, but the relative increase of risk is rather small. The $\Delta$-hedge procedure, on the other hand, has the advantage that the price is independent of the average return (see Appendix F).

In the 'independent' model, where the price change is unrelated to the interest rate and/or dividend, the order $\rho$ correction to the optimal strategy is easier to estimate in general. From the general formula, Eq. (4.118), one finds:

where $\phi_k^{*0}$ is the optimal strategy for $r = d = 0$.
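As a quick numerical check of the Gaussian price correction $\delta C/C = (T/2)(r - d)$ quoted above, the following minimal sketch evaluates it for the example given in the text. The function name and the 365-day year used to convert the maturity into years are assumptions made for illustration only.

```python
def relative_call_correction(r, d, maturity_days, days_per_year=365.0):
    """Relative call price increase delta_C / C = (T / 2) * (r - d).

    r and d are annual rates; the day-count convention (365 days per
    year) is an assumption of this sketch, not taken from the text.
    """
    T = maturity_days / days_per_year  # maturity in years
    return 0.5 * T * (r - d)


# Example from the text: r = 5% annual, d = 0, T = 100 days.
correction = relative_call_correction(r=0.05, d=0.0, maturity_days=100)
print(f"delta_C / C = {correction:.4%}")
```

With these inputs the correction stays below the 1% bound stated in the text.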
More generally, for an arbitrary dividend $d_k$ (per share) at time $t_k$, the extra term in the wealth balance reads:

Very often, this dividend occurs once a year, at a given date $k_0$: $d_k = d_0\,\delta_{k,k_0}$. In this case, the corresponding share price decreases immediately by the same amount (since the value of the company is decreased by an amount equal to $d_0$ times the number of outstanding shares): $x \to x - d_0$. Therefore, the net balance $d_k + \delta x_k$ associated to the dividend is zero. For the same reason, the probability $P_{d_0}(x, N|x_0, 0)$ is given, for $N > k_0$, by $P_{d_0=0}(x + d_0, N|x_0, 0)$. The option price is then equal to: $C_{d_0}(x_0, x_s, N) = C(x_0, x_s + d_0, N)$. If the dividend $d_0$ is not known in advance with certainty, this last equation should be averaged over a distribution $P(d_0)$ describing as well as possible the probable values of $d_0$. A possibility is to choose the distribution of least information (maximum entropy) such that the average dividend is fixed to $\bar{d}$, which in this case reads:

$$P(d_0) = \frac{1}{\bar{d}}\exp\left(-\frac{d_0}{\bar{d}}\right).$$

5.1.4 Transaction costs

The problem of transaction costs is important: the rebalancing of the optimal hedge as time passes induces extra costs that must be included in the wealth balance as well. These 'costs' are of different nature: some are proportional to the number of traded shares, whereas another part is fixed, independent of the total amount of the operation. Let us examine the first situation, assuming that the rebalancing of the hedge takes place at every time step $\tau$. The corresponding cost is then equal to:

$$\delta W_{\mathrm{tr},k} = \nu\,|\phi_{k+1}^* - \phi_k^*|,$$

where $\nu$ is a measure of the transaction costs. In order to keep the discussion simple, we shall assume that:

• the most important part in the variation of $\phi_k^*$ is due to the change of price of the underlying, and not from the explicit time dependence of $\phi_k^*$ (this is justified in the limit where $\tau \ll T$);
• $|\delta x|$ is small enough such that a Taylor expansion of $\phi^*$ is acceptable;
• the kurtosis effects on $\phi^*$ are neglected, which means that the simple Black-Scholes $\Delta$-hedge is acceptable and does not lead to large hedging errors.

$$\delta W_{\mathrm{tr},k} \simeq \nu\,|\delta x_k|\,\frac{\partial \phi_k^*(x)}{\partial x},$$

since $\partial \phi_k^*(x)/\partial x$ is positive. The average total cost associated to rehedging is then, after a straightforward calculation:¹

¹ One should add to the following formula the cost associated with the initial value of the hedge $\phi_0^*$.

The order of magnitude of $\langle|\delta x|\rangle$ is given by $\sigma_1 x_0$: for an at-the-money option, $P(x_s, N|x_0, 0) \simeq (\sigma_1 x_0 \sqrt{N})^{-1}$; hence, finally, $\langle \Delta W_{\mathrm{tr}} \rangle \simeq \nu\sqrt{N}$. It is natural to compare $\langle \Delta W_{\mathrm{tr}} \rangle$ to the option price itself, which is of order $C \simeq \sigma_1 x_0 \sqrt{N}$:

$$\frac{\langle \Delta W_{\mathrm{tr}} \rangle}{C} \propto \frac{\nu}{\sigma_1 x_0}. \qquad (5.21)$$

This part of the transaction costs is in general proportional to $x_0$: taking for example $\nu = 10^{-4} x_0$, $\tau = 1$ day, and a daily volatility of $\sigma_1 = 1\%$, one finds that the transaction costs represent 1% of the option price. On the other hand, for higher transaction costs (say $\nu = 10^{-2} x_0$), a daily rehedging becomes absurd, since the ratio, Eq. (5.21), is of order 1. The rehedging frequency should then be lowered, such that the volatility on the scale of $\tau$ increases, to become larger than $\nu$.

The fixed part of the transaction costs is easier to discuss. If these costs are equal to $\nu'$ per transaction, and if the hedging strategy is rebalanced every $\tau$, the total cost incurred is simply given by:

$$\Delta W'_{\mathrm{tr}} = N\nu'.$$

Comparing the two types of costs leads to:

$$\frac{\Delta W'_{\mathrm{tr}}}{\langle \Delta W_{\mathrm{tr}} \rangle} \propto \frac{\nu'\sqrt{N}}{\nu},$$

showing that the latter costs can exceed the former when $N = T/\tau$ is large, i.e. when the hedging frequency is high.

In summary, the transaction costs are higher when the trading time is smaller. On the other hand, decreasing $\tau$ allows one to diminish the residual risk (Fig. 4.11). This analysis suggests that the optimal trading frequency should be chosen such that the transaction costs are comparable to the residual risk.
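The orders of magnitude discussed in this section can be reproduced with a few lines of arithmetic. The sketch below assumes the scalings stated in the text, namely a proportional cost of order $\nu\sqrt{N}$, an option price of order $\sigma_1 x_0 \sqrt{N}$ and a fixed cost $N\nu'$; the function names and the numerical value chosen for $\nu'$ are illustrative assumptions.

```python
import math


def proportional_cost_ratio(nu, sigma1, x0):
    """Order of magnitude of <Delta W_tr> / C, i.e. nu / (sigma1 * x0)."""
    return nu / (sigma1 * x0)


def fixed_to_proportional_ratio(nu_fixed, nu, n_steps):
    """Ratio (N * nu') / (nu * sqrt(N)) = (nu' / nu) * sqrt(N)."""
    return (nu_fixed / nu) * math.sqrt(n_steps)


x0 = 100.0
sigma1 = 0.01  # daily volatility of 1%

low_cost = proportional_cost_ratio(1e-4 * x0, sigma1, x0)   # about 1% of the price
high_cost = proportional_cost_ratio(1e-2 * x0, sigma1, x0)  # of order 1

# Hypothetical fixed cost nu' equal to nu: the fixed part dominates as N grows.
ratio_10 = fixed_to_proportional_ratio(1e-4 * x0, 1e-4 * x0, 10)
ratio_1000 = fixed_to_proportional_ratio(1e-4 * x0, 1e-4 * x0, 1000)
print(low_cost, high_cost, ratio_10, ratio_1000)
```

The square-root growth of the last ratio with $N$ is the reason why fixed costs penalize frequent rehedging more than proportional costs do.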
whereas the optimal strategy is still given by the general formula, Eq…
particular, for Gaussian fluctuations, one always finds the Black-Scholes…
5.2 Other types of options: 'Puts' and 'exotic options'

5.2.1 'Put-call' parity

A 'Put' contract is a sell option, defined by a payoff at maturity given by $\max(x_s - x, 0)$. A put protects its owner against the risk that the shares he owns drop below the strike price $x_s$. Puts on stock indices like the S&P 500 are very popular. The price of a European put will be noted $C^\dagger[x_0, x_s, T]$ (we reserve the notation $P$ for a probability). This price can be obtained using a very simple 'parity' (or no arbitrage) argument. Suppose that one simultaneously buys a call with strike price $x_s$, and sells a put with the same maturity and strike price. At expiry, this combination is therefore worth:

$$\max(x(T) - x_s, 0) - \max(x_s - x(T), 0) \equiv x(T) - x_s. \qquad (5.24)$$

Exactly the same payoff can be achieved by buying the underlying now, and selling a bond paying $x_s$ at maturity. Therefore, the above call-put combination must be worth:

$$C[x_0, x_s, T] - C^\dagger[x_0, x_s, T] = x_0 - x_s e^{-rT}, \qquad (5.25)$$

which allows one to express the price of a put knowing that of a call. If interest rate effects are small ($rT \ll 1$), this relation reads:

$$C^\dagger[x_0, x_s, T] \simeq C[x_0, x_s, T] + x_s - x_0. \qquad (5.26)$$

Note in particular that at-the-money ($x_s = x_0$), the two contracts have the same value, which is obvious by symmetry (again, in the absence of interest rate effects).

The case of a non-zero average return can be treated as above, in Section 4.5.1…
order $m$, the price of the option is given by…
which reveals, again, that in the Gaussian case, the average return disappears…
final formula. In the presence of non-zero kurtosis, however, a (small) systematic c…
to the fair price appears. Note that $C_{\mathrm{asi}}$ can be written as an average of $Y(\tilde{x})$…
effective, 'risk neutral' probability $Q(\tilde{x})$ introduced in Section 4.5.1. If the $\Delta$…
used, this risk neutral probability is simply $P_0$ (see Appendix F).

5.2.3 'Asian' options

The problem is slightly more complicated in the case of the so-called…
options. The payoff of these options is calculated not on the value…
underlying stock at maturity, but on a certain average of this value over…
number of days preceding maturity. This procedure is used to prevent an…
rise of the stock price precisely on the expiry date, a rise that could be…
by an operator having an important long position on the corresponding option…
contract is thus constructed on a fictitious asset, the price of which being…
as:

$$\tilde{x} = \sum_{i=0}^{N} w_i x_i,$$
The price of the option in this case can be obtained following the same lines as
however consider more complicated situations, for example an exponepjial 7;
) . wealth balance then contains the modified payofi: m:~
(uik a . Y ~  ~The
above. In particular, in the absence of bias (i.e. for nz = 0) the fair price is given s,,0). or more generally Y ( 2 ) The
. first problem therefore concerns thi
.\. As we shall see, this problem is very similar to the case encounter.ed i n Chapter Note that ii' the instant of time 'k' is outside the averaging period, one has y, = i
4 where the volatility is time dependent. Indeed. one has: (since z,,, w, = I ) , and the formula, Eq. (4.80). is recovered. If on the contrary
k gets closer to maturity, y, diminishes as does the correction term.
In order to fix the optimal strategy, one must however calculate the following
which can be rewritten as:
quantity:
conditioned to a certain terminal value for f (cf. Eq. (4.74)).The general calcula
tion is given in Appendix D. For a small kurtosis, the optimal strategy reads:
The last expression means that if the buyer exercises a fraction $f(x_1)$ of his option, he pockets immediately the difference $x_1 - x_s$, but loses de facto his option, which is worth $C[x_1, x_s, N_2 - N_1]$.

The optimal strategy, such that $\langle \mathcal{G} \rangle$ is maximum, therefore consists in choosing $f(x_1)$ equal to 0 or 1, according to the sign of $x_1 - x_s - C[x_1, x_s, N_2 - N_1]$. Now, this difference is always negative, whatever $x_1$ and $N_2 - N_1$. This is due to the put-call parity relation (cf. Eq. (5.26)):

$$C[x_1, x_s, N_2 - N_1] - C^\dagger[x_1, x_s, N_2 - N_1] = x_1 - x_s.$$

Since $C^\dagger \geq 0$, $C[x_1, x_s, N_2 - N_1] - (x_1 - x_s)$ is also greater or equal to zero. The optimal value of $f(x_1)$ is thus zero; said differently, the buyer should wait until maturity to exercise his option to maximize his average profit. This argument can be generalized to the case where the option can be exercised at any instant $N_1, N_2, \ldots, N_n$ with $n$ arbitrary.

Note however that choosing a non-zero $f$ increases the total probability of exercising the option, but reduces the average profit! More precisely, the total probability to reach $x_s$ before maturity is twice the probability to exercise the option at expiry (if the distribution of $\delta x$ is even, see Section 3.1.3). OTC American options are therefore favourable to the writer of the option, since some buyers might be tempted to exercise before expiry.

It is interesting to generalize the problem and consider the case where the two strike prices $x_{s1}$ and $x_{s2}$ are different at times $N_1$ and $N_2$, in particular in the case where $x_{s1} < x_{s2}$. The average profit, Eq. (5.43), is then equal to (for $r = 0$):

The equation

$$x^* - x_{s1} - C[x^*, x_{s2}, N_2 - N_1] = 0 \qquad (5.46)$$

then has a non-trivial solution, leading to $f(x_1) = 1$ for $x_1 > x^*$. The average profit of the buyer therefore increases, in this case, upon an early exercise of the option.

American puts

Naively, the case of the American puts looks rather similar to that of the calls, and these should therefore also be equivalent to European puts. This is not the case for the following reason.⁴ Using the same argument as above, one finds that the average profit associated to a 'two-shot' put option with exercise dates $N_1$, $N_2$ is given by:

Now, the difference $(x_s - x_1 - C^\dagger[x_1, x_s, N_2 - N_1])$ can be transformed, using the put-call parity, as:

$$x_s\left[1 - e^{-r\tau(N_2 - N_1)}\right] - C[x_1, x_s, N_2 - N_1]. \qquad (5.48)$$

This quantity may become positive if $C[x_1, x_s, N_2 - N_1]$ is very small, which corresponds to $x_s \gg x_1$ (puts deep in the money). The smaller the value of $r$, the larger should be the difference between $x_1$ and $x_s$, and the smaller the probability for this to happen. If $r = 0$, the problem of American puts is identical to that of the calls.

In the case where the quantity (5.48) becomes positive, an 'excess' average profit $\delta\mathcal{G}$ is generated, and represents the extra premium to be added to the price of the European put to account for the possibility of an early exercise. Let us finally note that the price of the American put $C^\dagger_{\mathrm{am}}$ is necessarily always larger or equal to $x_s - x$ (since this would be the immediate profit), and that the price of the 'two-shot' put is a lower bound to $C^\dagger_{\mathrm{am}}$.

The perturbative calculation of $\delta\mathcal{G}$ (and thus of the 'two-shot' option) in the limit of small interest rates is not very difficult. As a function of $N_1$, $\delta\mathcal{G}$ reaches a maximum between $N_2/2$ and $N_2$. For an at-the-money put such that $N_2 = 100$, $r = 5\%$ annual, $\sigma = 1\%$ per day and $x_0 = x_s = 100$, the maximum is reached for $N_1 \simeq 80$ and the corresponding $\delta\mathcal{G} \simeq 0.15$. This must be compared with the price of the European put, which is $C^\dagger \simeq 4$. The possibility of an early exercise leads in this case to a 5% increase of the price of the option.

More generally, when the increments are independent and of average zero, one can obtain a numerical value for the price of an American put $C^\dagger_{\mathrm{am}}$ by iterating backwards the following exact equation:

$$C^\dagger_{\mathrm{am}}[x, x_s, N + 1] = \max\left(x_s - x,\; e^{-\rho}\int P_1(\delta x)\, C^\dagger_{\mathrm{am}}[x + \delta x, x_s, N]\, d\delta x\right).$$

This equation means that the put is worth the average value of tomorrow's price if it is not exercised today ($C^\dagger_{\mathrm{am}} > x_s - x$), or $x_s - x$ if it is immediately exercised. Using this procedure, we have calculated the price of a European, American and 'two-shot' option of maturity 100 days (Fig. 5.1). For the 'two-shot' option, the optimal value of $N_1$ as a function of the strike is shown in the inset.

5.2.5 'Barrier' options

Let us now turn to another family of options, called 'barrier' options, which are such that if the price of the underlying $x_k$ reaches a certain 'barrier' value $x_b$ during the lifetime of the option, the option is lost. (Conversely, there are options that are only activated if the value $x_b$ is reached.) This clause leads to cheaper options, which can be more attractive to the investor. Also, if $x_b > x_s$, the writer of the option limits his possible losses to $x_b - x_s$. What is the probability $P_b(x, N|x_0, 0)$ for the final value of the underlying to be at $x$, conditioned to the fact that the price has not reached the barrier value $x_b$?

Footnote: The case of a multiplicative process is more involved: see, e.g. H. Geman, M. Yor, Bessel processes, Asian Options and Perpetuities, Mathematical Finance, 3, 349 (1993).
Footnote: Options that can be exercised at certain specific dates (more than one) are called 'Bermudan' options.
⁴ The case of American calls with non-zero dividends is similar to the case discussed here.
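The backward iteration for the American put described above can be sketched on a simple lattice. This is only an illustration under stated assumptions, not the procedure used for Fig. 5.1: the elementary distribution $P_1(\delta x)$ is taken to be a symmetric two-point law ($\delta x = \pm$ one tick with probability 1/2), a discount factor $e^{-\rho}$ per step is applied, and the function name `put_prices` as well as the 365-day year are hypothetical choices.

```python
import math


def put_prices(x0, xs, tick, n_steps, rho):
    """Return (european, american) put prices by backward iteration.

    Prices live on the grid x0 + j * tick; each step moves the price by
    +tick or -tick with probability 1/2 (zero-mean increments, an
    assumption of this sketch). rho is the interest rate per time step.
    """
    disc = math.exp(-rho)
    grid = [x0 + j * tick for j in range(-n_steps, n_steps + 1)]
    # Terminal payoff max(xs - x, 0) at maturity.
    euro = [max(xs - x, 0.0) for x in grid]
    amer = euro[:]
    for step in range(n_steps):
        lo, hi = step + 1, 2 * n_steps - step - 1  # nodes with both neighbours known
        new_euro, new_amer = euro[:], amer[:]
        for i in range(lo, hi + 1):
            # European: discounted average of tomorrow's price.
            new_euro[i] = disc * 0.5 * (euro[i - 1] + euro[i + 1])
            # American: the larger of immediate exercise and continuation.
            continuation = disc * 0.5 * (amer[i - 1] + amer[i + 1])
            new_amer[i] = max(xs - grid[i], continuation)
        euro, amer = new_euro, new_amer
    return euro[n_steps], amer[n_steps]


# Parameters of the example in the text: 100 days, sigma = 1% per day,
# x0 = xs = 100, r = 5% annual.
european, american = put_prices(x0=100.0, xs=100.0, tick=1.0,
                                n_steps=100, rho=0.05 / 365.0)
print(european, american)
```

On this lattice the European value comes out close to the Gaussian estimate $C^\dagger \simeq 4$ quoted in the text, and the American value lies above it, the difference being the early-exercise premium.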
Fig. 5.1. Price of a European, American and 'two-shot' put option as a function of the strike, for a 100-days maturity and a daily volatility of 1% and $r = 1\%$. The top curve is the American price, while the bottom curve is the European price. In the inset is shown the optimal exercise time $N_1$ as a function of the strike for the 'two-shot' option.

Fig. 5.2. Illustration of the method of images. A trajectory, starting from the point $x_0 = -5$ and reaching the point $x_{20} = -1$, can either touch or avoid the 'barrier' located at $x_b = 0$. For each trajectory touching the barrier, as the one shown in the figure (squares), there exists one single trajectory (circles) starting from the point $x_0 = +5$ and reaching the same final point; only the last section of the trajectory (after the last crossing point) is common to both trajectories. In the absence of bias, these two trajectories have exactly the same statistical weight. The probability of reaching the final point without crossing $x_b = 0$ can thus be obtained by subtracting the weight of the image trajectories. Note that the whole argument is wrong if jump sizes are not constant (for example when $\delta x = \pm 1$ or $\pm 2$).
In some cases, it is possible to give an exact answer to this question, using the so-called method of images. Let us suppose that for each time step, the price $x$ can only change by a discrete amount, $\pm 1$ tick. The method of images is explained graphically in Figure 5.2: one can notice that each trajectory going through $x_b$ between $k = 0$ and $k = N$ has a 'mirror' trajectory with a statistical weight precisely equal (for $m = 0$) to the one of the trajectory one wishes to exclude. It is clear that the conditional probability we are looking for is obtained by subtracting the weight of these image trajectories:

$$P_b(x, N|x_0, 0) = P(x, N|x_0, 0) - P(x, N|2x_b - x_0, 0).$$

In the general case where the variations of $x$ are not limited to $0, \pm 1$, the previous argument fails, as one can easily be convinced by considering the case where $\delta x$ takes the values $\pm 1$ and $\pm 2$. However, if the possible variations of the price during the time $\tau$ are small, the error coming from the uncertainty about the exact crossing time is small, and leads to an error on the price $C_b$ of the barrier option on the order of $\langle|\delta x|\rangle$ times the total probability of ever touching the barrier. Discarding this correction, the price of barrier options reads:

($x_b > x_s$); the option is worthless whenever $x_0 < x_b < x_s$.

One can also find 'double barrier' options, such that the price is constrained to remain within a certain channel $x_b^- < x < x_b^+$, or else the option vanishes. One can generalize the method of images to this case. The images are now successive reflections of the starting point $x_0$ in the two parallel 'mirrors' $x_b^-$, $x_b^+$.

Other types of option

One can find many other types of option, which we shall not discuss further. Some options, for example, are calculated on the maximum value of the price of the underlying reached during a certain period. It is clear that in this case, a Gaussian or log-normal model is particularly inadequate, since the price of the option is governed by extreme events. Only an adequate treatment of the tails of the distribution can allow us to price this type of option correctly.
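The method of images for a $\pm 1$ tick walk can be verified by direct enumeration. The sketch below, whose function names are hypothetical, counts the unbiased paths that go from $x_0$ to $x$ in $N$ steps without ever touching the barrier, once by subtracting the image paths starting from $2x_b - x_0$ and once by brute-force propagation. It assumes that $x_0$ and $x$ lie on the same side of the barrier.

```python
from math import comb


def n_paths(displacement, n_steps):
    """Number of +/-1 paths with a given net displacement after n_steps."""
    if abs(displacement) > n_steps or (displacement + n_steps) % 2:
        return 0
    return comb(n_steps, (displacement + n_steps) // 2)


def surviving_paths_images(x0, x, xb, n_steps):
    """Paths from x0 to x that never touch xb, by the method of images."""
    image_start = 2 * xb - x0  # mirror image of the starting point
    return n_paths(x - x0, n_steps) - n_paths(x - image_start, n_steps)


def surviving_paths_brute(x0, x, xb, n_steps):
    """Same count, obtained by propagating the walk and killing it at xb."""
    counts = {x0: 1}
    for _ in range(n_steps):
        nxt = {}
        for pos, c in counts.items():
            for step in (-1, 1):
                new = pos + step
                if new != xb:  # paths reaching the barrier are lost
                    nxt[new] = nxt.get(new, 0) + c
        counts = nxt
    return counts.get(x, 0)


# Setting of Fig. 5.2: start at -5, barrier at 0, 20 time steps.
for x in range(-25, 0):
    assert surviving_paths_images(-5, x, 0, 20) == surviving_paths_brute(-5, x, 0, 20)
print(surviving_paths_images(-5, -1, 0, 20))
```

Dividing these counts by $2^N$ gives the conditional probability $P_b$ for the unbiased walk; the equality breaks down as soon as jumps of different sizes are allowed, as noted in the text.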
5.3 The 'Greeks' and risk control

The 'Greeks', which is the traditional name given by professionals to the derivative of the price of an option with respect to the price of the underlying, the volatility, etc., are often used for local risk control purposes. Indeed, if one assumes that the underlying asset does not vary too much between two instants of time $t$ and $t + \tau$, one may expand the variation of the option price in Taylor series:

$$\delta C = \Delta\,\delta x + \frac{1}{2}\Gamma(\delta x)^2 + \mathcal{V}\,\delta\sigma + \Theta\tau, \qquad (5.53)$$

where $\delta x$ is the change of price of the underlying. If the option is hedged by simultaneously selling a proportion $\phi$ of the underlying asset, one finds that the change of the portfolio value is, to this order:

$$\delta W = (\Delta - \phi)\,\delta x + \frac{1}{2}\Gamma(\delta x)^2 + \mathcal{V}\,\delta\sigma + \Theta\tau. \qquad (5.54)$$

Note that the Black-Scholes (or rather, Bachelier) equation is recovered by setting $\phi^* = \Delta$, $\delta\sigma \equiv 0$, and by recalling that for a continuous-time Gaussian process, $(\delta x)^2 \equiv D\tau$ (see Section 4.5.2). In this case, the portfolio does not change with time ($\delta W = 0$), provided that $\Theta = -D\Gamma/2$, which is precisely Eq. (4.51) in the limit $\tau \to 0$.

In reality, due to the non-Gaussian nature of $\delta x$, the large risk corresponds to cases where $\Gamma(\delta x)^2 \gg |\Theta\tau|$. Assuming that one chooses to follow the $\Delta$-hedge procedure (which is in general suboptimal, see Section 4.4.3 above), one finds that the fluctuations of the price of the underlying lead to an increase in the value of the portfolio of the buyer of the option (since $\Gamma > 0$). Losses can only occur if the implied volatility of the underlying decreases. If $\delta x$ and $\delta\sigma$ are uncorrelated (which is in general not true), one finds that the 'instantaneous' variance of the portfolio is given by:

where $\kappa_1$ is the kurtosis of $\delta x$. For an at-the-money option of maturity $T$, one has:

Typical values are, on the scale of $\tau$ = one day, $\kappa_1 = 3$ and $\delta\sigma \sim \sigma$. The $\Gamma$ contribution to risk is therefore on the order of $\sigma x_0 \tau/\sqrt{T}$. This is equal to the typical fluctuations of the underlying contract multiplied by $\sqrt{\tau/T}$, or else the price of the option reduced by a factor $N = T/\tau$. The Vega contribution is much larger for long maturities, since it is of order of the price of the option itself.

5.4 Value-at-risk for general non-linear portfolios (*)

A very important issue for the control of risk of complex portfolios, which involve many non-linear assets, is to be able to estimate their value-at-risk reliably. This is a difficult problem, since both the non-Gaussian nature of the fluctuations of the underlying assets and the non-linearities of the price of the derivatives must be dealt with. A solution, which is very costly in terms of computation time and not very precise, is the use of Monte-Carlo simulations. We shall show in this section that in the case where the fluctuations of the 'explicative variables' are strong (a more precise statement will be made below), an approximate formula can be obtained for the value-at-risk of a general non-linear portfolio.

Let us assume that the variations of the value of the portfolio can be written as a function $\delta f(e_1, e_2, \ldots, e_M)$ of a set of $M$ independent random variables $e_a$, $a = 1, \ldots, M$, such that $\langle e_a \rangle = 0$ and $\langle e_a e_b \rangle = \delta_{a,b}\,\sigma_a^2$. The sensitivity of the portfolio to these 'explicative variables' can be measured as the derivatives of the value of the portfolio with respect to the $e_a$. We shall therefore introduce the $\Delta$'s and $\Gamma$'s as:

$$\Delta_a = \frac{\partial f}{\partial e_a}, \qquad \Gamma_{a,b} = \frac{\partial^2 f}{\partial e_a\,\partial e_b}.$$

We are interested in the probability for a large fluctuation $\delta f^*$ of the portfolio. We will surmise that this is due to a particularly large fluctuation of one explicative factor, say $a = 1$, that we will call the dominant factor. This is not always true, and depends on the statistics of the fluctuations of the $e_a$. A condition for this assumption to be true will be discussed below, and requires in particular that the tail of the dominant factor should not decrease faster than an exponential. Fortunately, this is a good assumption in financial markets.

The aim is to compute the value-at-risk of a certain portfolio, i.e. the value $\delta f^*$ such that the probability that the variation of $f$ exceeds $\delta f^*$ is equal to a certain probability $p$: $P_>(\delta f^*) = p$. Our assumption about the existence of a dominant factor means that these events correspond to a market configuration where the fluctuation $\delta e_1$ is large, whereas all other factors are relatively small. Therefore, the large variations of the portfolio can be approximated as:

$$\delta f(e_1, e_2, \ldots, e_M) = \delta f(e_1) + \sum_{a=2}^{M}\Delta_a e_a + \frac{1}{2}\sum_{a,b=2}^{M}\Gamma_{a,b}\,e_a e_b,$$

where $\delta f(e_1)$ is a shorthand notation for $\delta f(e_1, 0, \ldots, 0)$. Now, we use the fact that:

$$P_>(\delta f^*) = \int \prod_{a=1}^{M} de_a\, P(e_1, e_2, \ldots, e_M)\,\Theta\big(\delta f(e_1, e_2, \ldots, e_M) - \delta f^*\big), \qquad (5.59)$$
where $\Theta(x > 0) = 1$ and $\Theta(x < 0) = 0$. Expanding the $\Theta$ function to second order leads to:

where $\delta'$ is the derivative of the $\delta$-function with respect to $\delta f$. In order to proceed with the integration over the variables $e_a$ in Eq. (5.59), one should furthermore note the following identity:

where $e_1^*$ is such that $\delta f(e_1^*) = \delta f^*$, and $\Delta_1^*$ is computed for $e_1 = e_1^*$, $e_{a>1} = 0$. Inserting the above expansion of the $\Theta$ function into Eq. (5.59) and performing the integration over the $e_a$ then leads to:

(5.62)

where $P(e_1)$ is the probability distribution of the first factor, defined as:

$$P(e_1) = \int P(e_1, e_2, \ldots, e_M)\prod_{a=2}^{M} de_a. \qquad (5.63)$$

In order to find the value-at-risk $\delta f^*$, one should thus solve Eq. (5.62) for $e_1^*$ with $P_>(\delta f^*) = p$, and then compute $\delta f(e_1^*, 0, \ldots, 0)$. Note that the equation is not trivial since the Greeks must be estimated at the solution point $e_1^*$.

Let us discuss the general result, Eq. (5.62), in the simple case of a linear portfolio of assets, such that no convexity is present: the $\Delta_a$'s are constant and the $\Gamma_{a,b}$'s are all zero. The equation then takes the following simpler form:

$$P_>(e_1^*) - \sum_{a=2}^{M}\frac{\Delta_a^2\sigma_a^2}{2\Delta_1^2}\,P'(e_1^*) = p.$$

Naively, one could have thought that in the dominant factor approximation, the value of $e_1^*$ would be the value-at-risk value of $e_1$ for the probability $p$, defined as:

$$P_>(e_{1,\mathrm{VaR}}) = p.$$

However, the above equation shows that there is a correction term proportional to $P'(e_1^*)$. Since the latter quantity is negative, one sees that $e_1^*$ is actually larger than $e_{1,\mathrm{VaR}}$, and therefore $\delta f^* > \delta f(e_{1,\mathrm{VaR}})$. This reflects the effect of all other factors, which tend to increase the value-at-risk of the portfolio.

The result obtained above relies on a second-order expansion; when are higher-order corrections negligible? It is easy to see that higher-order terms involve higher-order derivatives of $P(e_1)$. A condition for these terms to be negligible in the limit $p \to 0$, or $e_1^* \to \infty$, is that the successive derivatives of $P(e_1)$ become smaller and smaller. This is true provided that $P(e_1)$ decays more slowly than exponentially, for example as a power-law. On the contrary, when $P(e_1)$ decays faster than exponentially (for example in the Gaussian case), then the expansion proposed above completely loses its meaning, since higher and higher corrections become dominant when $p \to 0$. This is expected: in a Gaussian world, a large event results from the accidental superposition of many small events, whereas in a power-law world, large events are associated to one single large fluctuation which dominates over all the others. The case where $P(e_1)$ decays as an exponential is interesting, since it is often a good approximation for the tail of the fluctuations of financial assets. Taking $P(e_1) \propto \alpha_1 \exp(-\alpha_1 e_1)$, one finds that $e_1^*$ is the solution of:

$$e^{-\alpha_1 e_1^*}\left[1 + \sum_{a=2}^{M}\frac{\Delta_a^2\,\alpha_1^2\sigma_a^2}{2\Delta_1^2}\right] = p.$$

Since one has $\sigma_1^2 \propto \alpha_1^{-2}$, the correction term is small provided that the variance of the portfolio generated by the dominant factor is much larger than the sum of the variance of all other factors.

Coming back to Eq. (5.62), one expects that if the dominant factor is correctly identified, and if the distribution is such that the above expansion makes sense, an approximate solution is given by $e_1^* = e_{1,\mathrm{VaR}} + \epsilon$, with:

where now all the Greeks are estimated at $e_{1,\mathrm{VaR}}$.

In some cases, it appears that a 'one-factor' approximation is not enough to reproduce the correct VaR value. This can be traced back to the fact that there are actually other different dangerous market configurations which contribute to the VaR. The above formalism can however easily be adapted to the case where two (or more) dangerous configurations need to be considered. The general equations read:

where $a = 1, \ldots, K$ are the $K$ different dangerous factors. The $e_a^*$, and therefore $\delta f^*$, are determined by the following $K$ conditions:

and is equal to $Y(\tilde{x}) = \max(\tilde{x} - x_s, 0)$. The usual case of an option on the asset $X^1$ thus corresponds to $f_i = \delta_{i,1}$. A spread option on the difference $X^1 - X^2$ corresponds to $f_i = \delta_{i,1} - \delta_{i,2}$, etc. The hedging portfolio at time $k$ is made of all the different assets $X^i$, with weight $\phi_k^i$. The question is to determine the optimal composition of the portfolio, $\phi_k^{i*}$.

Following the general method explained in Section 4.3.3, one finds that the part of the risk which depends on the strategy contains both a linear and a quadratic term in the $\phi$'s. Using the fact that the $E_a$ are independent random variables, one can compute the functional derivative of the risk with respect to all the $\phi_k^i(\{x\})$. Setting this functional derivative to zero leads to:

This correction is not, in general, proportional to $f_i$, and therefore suggests that, in some cases, an exogenous hedge can be useful. However, one should note that this correction is small for at-the-money options ($\tilde{x} = x_s$), since $\partial P(\tilde{x}, x_s, N - k)/\partial x_s = 0$.

Footnote: In the following, i denotes the unit imaginary number, except when it appears as a subscript, in which case it is an asset label.
Footnote: The case of Lévy fluctuations is also such that an exogenous hedge is useless.
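The dominant-factor argument of Section 5.4 can be tested numerically in the simplest linear case. The sketch below is an illustration under explicit assumptions, none of which come from the text: the portfolio is $\delta f = e_1 + Y$ (so $\Delta_1 = 1$), the dominant factor $e_1$ follows a symmetric exponential law with decay rate `alpha`, and the sum $Y$ of all other factors is Gaussian with standard deviation `s`. The second-order relation $P_>(e_1^*) - (s^2/2)P'(e_1^*) = p$ used here is a reading of the expansion for a linear portfolio, and all function names are hypothetical.

```python
import math


def laplace_tail(z, alpha):
    """P_>(z) for a symmetric exponential factor with decay rate alpha."""
    if z >= 0:
        return 0.5 * math.exp(-alpha * z)
    return 1.0 - 0.5 * math.exp(alpha * z)


def exact_tail(x, alpha, s, n=4001, width=10.0):
    """P(e1 + Y > x) by trapezoidal integration over the Gaussian Y."""
    h = 2.0 * width * s / (n - 1)
    total = 0.0
    for i in range(n):
        y = -width * s + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        gauss = math.exp(-0.5 * (y / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        total += weight * gauss * laplace_tail(x - y, alpha)
    return total * h


def solve_exact_var(p, alpha, s, lo=0.0, hi=50.0):
    """Bisection for the level x such that exact_tail(x) = p."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if exact_tail(mid, alpha, s) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, s, p = 1.0, 0.3, 0.01
naive = -math.log(2.0 * p) / alpha  # VaR of the dominant factor alone
# Second-order correction: 0.5 * exp(-alpha * e) * (1 + (alpha * s)**2 / 2) = p
corrected = naive + math.log(1.0 + 0.5 * (alpha * s) ** 2) / alpha
exact = solve_exact_var(p, alpha, s)
print(naive, corrected, exact)
```

The corrected level lies above the naive one, as argued in the text, and much closer to the numerically exact value as long as the variance of the other factors is small compared with that of the dominant factor.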
Since the risk associated with a single option is in general non-zero, the global risk of a portfolio of options ('book') is also non-zero. Suppose that the book contains $p_i$ calls of 'type' $i$ ($i$ therefore contains the information of the strike $x_{si}$ and maturity $T_i$). The first problem to solve is that of the hedging strategy. In the absence of volatility risk, it is not difficult to show that the optimal hedge for the book is the linear superposition of the optimal strategies for each individual option:

where $C_i$ is the price of the option $i$. If the constraint on the $p_i$'s is of the form $\sum_i p_i = 1$, the optimum portfolio is given by:

5.6 References

More on options, exotic options

J. C. Hull, Futures, Options and Other Derivatives, Prentice Hall, Upper Saddle River, NJ, 1997.
P. W. Wilmott, J. N. Dewynne, S. D. Howison, Option Pricing: Mathematical Models and Computation, Cambridge University Press, Cambridge, 1997.
N. Taleb, Dynamical Hedging: Managing Vanilla and Exotic Options, Wiley, New York, 1998.
I. Nelken (ed.), The Handbook of Exotic Options, Irwin, Chicago, IL, 1996.
Arbitrage A trading strategy that generates profit without risk, from a zero initial investment.

Basis point Elementary price increment, equal to $10^{-4}$ in relative value.

Bid-ask spread Difference between the ask price (at which one can buy an asset) and the bid price (at which one can sell the same asset).

Bond (Zero coupon) Financial contract which pays a fixed value at a given date in the future.

Delta Derivative of the price of an option with respect to the current price of the underlying contract. This is equal to the optimal hedging strategy in the Black-Scholes world.

Drawdown Period of time during which the price of an asset is below its last historical peak.

Forward Financial contract under which the owner agrees to buy for a fixed price some asset, at a fixed date in the future.

Futures Same as forward contract, but on an organized market. In this case, the contract is marked-to-market, and the owner pays (or receives) the marginal price change on a daily basis.

Gamma Second derivative of the price of an option with respect to the current price of the underlying contract. This is equal to the derivative of the optimal hedging strategy in the Black-Scholes world.

Hedging strategy A trading strategy allowing one to reduce, or sometimes to eliminate completely, the risk of a position.

Moneyness Describes the difference between the spot price and the strike price of an option. For a call, if this difference is positive [resp. negative], the option is said to be in-the-money [resp. out-of-the-money]. If the difference is zero, the option is at-the-money.

Option Financial contract allowing the owner to buy [or sell] at a fixed maximum [minimum] price (the strike price) some underlying asset in the future. This contract protects its owner against a possible rise or fall in price of the underlying asset.

Over-the-counter This is said of a financial contract traded off market, say between two financial companies or banks. The price is then usually not publicly disclosed, at variance with organized markets.

Spot price The current price of an asset for immediate delivery, in contrast with, for example, its forward price.

Spot rate The value of the short-term interest rate.

Spread Difference in price between two assets, or between two different prices of the same asset, for example the bid-ask spread.

Strike price Price at which an option can be exercised, see Option.

Vega Derivative of the price of an option with respect to the volatility of the underlying contract.

Value at Risk (VaR) Measure of the potential losses of a given portfolio, associated to a certain confidence level. For example, a 95% VaR corresponds to the loss level that has a 5% probability to be exceeded.

Volatility Standard deviation of an asset's relative price changes.

Index of symbols

$A$ tail amplitude of power-laws: $P(x) \sim \mu A^\mu / x^{1+\mu}$
$A_i$ tail amplitude of asset $i$
$A_p$ tail amplitude of portfolio $p$
$B$ amount invested in the risk-free asset
$B(t, \theta)$ price, at time $t$, of a bond that pays 1 at time $t + \theta$
c. cumulant of order rz of a distribution . . . . . . . . . . . . . . . . . .
c cumulant of order n of an elementary distribution PI(x)
c.. N cumulant of order n of a distribution at the scale N.
P (x.N ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
C covariance matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Cij element of the covariance matrix . . . . . . . . . . . . . . . . . . . . . .
~!l'' I"tail covariance' matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
C price of a European call option . . . . . . . . . . . . . . . . . . . . . . .
CT price of a European put option . . . . . . . . . . . . . . . . . . . . . . . .
CG price of ;European call in the Gaussiau Bachelier theory
CBS price of a European call in the BlackScholes theory ....
CM market price of a European call . . . . . . . . . . . . . . . . . . . . . . .
C. price of a European call for a nonzero kurtosis K . . . . . . .
C, price of a European dall for a nonzero excess return nz . .
Cd price of a European call with dividends . . . . . . . . . . . . . . . .
CaSi price of an Asian call option . . . . . . . . . . . . . . . . . . . . . . . . . .
Cam price of an American call option . . . . . . . . . . . . . . . . . . . . . .
Cb price of a bamer call option . . . . . . . . . . . . . . . . . . . . . . . . . .
C(B) yield curve spread correlation function . . . . . . . . . . . . . . . .
Ds variance of the fluctuations in a time step s ~nthe
additive approximation: D = 0.x: . . . . . . . . . . . . . . . . . . . .
Di D coefficient for asset i . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
D, s risk associated with portfolio p . . . . . . . . . . . . . . . . . . . . . . .
$e_a$  explicative factor (or principal component)
$E_{abs}$  mean absolute deviation
$f(t, \theta)$  forward value at time $t$ of the rate at time $t + \theta$
$\mathcal{F}$  forward price
$g(\ell)$  autocorrelation function of $\gamma_k^2$
$\mathcal{G}_p$  probable gain
$H$  Hurst exponent
$\mathcal{H}$  Hurst function
$\mathcal{I}$  missing information
$k$  time index ($t = k\tau$)
$K_n$  modified Bessel function of the second kind of order $n$
$K_{ijkl}$  generalized kurtosis
$L_\mu$  Lévy distribution of order $\mu$
$L_\mu^{(t)}$  truncated Lévy distribution of order $\mu$
$m$  average return per unit time
$m(t, t')$  interest rate trend at time $t'$ as anticipated at time $t$
$m_1$  average return on a unit time scale $\tau$: $m_1 = m\tau$
$m_i$  return of asset $i$
$m_n$  moment of order $n$ of a distribution
$m_p$  return of portfolio $p$
$M$  number of assets in a portfolio
$M_{eff}$  effective number of assets in a portfolio
$N$  number of elementary time steps until maturity: $N = T/\tau$
$N^*$  number of elementary time steps under which tail effects are important, after the CLT applies progressively
$O$  coordinate change matrix
$p_i$  weight of asset $i$ in portfolio $p$
$p$  portfolio constructed with the weights $\{p_i\}$
$P_1(\delta x)$  or $P_\tau(\delta x)$, elementary return distribution on time scale $\tau$
$P_{10}$  distribution of rescaled return $\delta x_k/\gamma_k$
$P(x, N)$  distribution of the sum of $N$ terms
$\hat{P}(z)$  characteristic function of $P$
$P(x, t|x_0, t_0)$  probability that the price of asset $X$ be $x$ (within $dx$) at time $t$ knowing that, at a time $t_0$, its price was $x_0$
$P_0(x, t|x_0, t_0)$  probability without bias
$P_m(x, t|x_0, t_0)$  probability with return $m$
$P_E$  symmetric exponential distribution
$P_G$  Gaussian distribution
$P_H$  hyperbolic distribution
$P_{LN}$  log-normal distribution
$P_S$  Student distribution
$\mathcal{P}$  probability of a given event (such as an option being exercised)
$\mathcal{P}_<$  cumulative distribution: $\mathcal{P}_< \equiv \mathcal{P}(X < x)$
$\mathcal{P}_{G>}$  cumulative normal distribution, $\mathcal{P}_{G>}(u) = \mathrm{erfc}(u/\sqrt{2})/2$
$Q$  ratio of the number of observations (days) to the number of assets
     or quality ratio of a hedge: $Q = \mathcal{R}^*/\mathcal{C}$
$Q(x, t|x_0, t_0)$  risk-neutral probability
$Q_i(u)$  polynomials related to deviations from a Gaussian
$r$  interest rate per unit time: $r = \rho/\tau$
$r(t)$  spot rate: $r(t) = f(t, \theta_{min})$
$\mathcal{R}$  risk (RMS of the global wealth balance)
$\mathcal{R}^*$  residual risk
$s(t)$  interest rate spread: $s(t) = f(t, \theta_{max}) - f(t, \theta_{min})$
$S(u)$  Cramér function
$\mathcal{S}$  Sharpe ratio
$T$  time scale, e.g. an option maturity
$T^*$  time scale for convergence towards a Gaussian
$T_\sigma$  crossover time between the additive and multiplicative regimes
$U$  utility function
$\mathcal{V}$  'Vega', derivative of the option price with respect to volatility
$w_{1/2}$  full width at half maximum
$\Delta W$  global wealth balance, e.g. global wealth variation between emission and maturity
$\Delta W_S$  wealth balance from trading the underlying
$\Delta W_{tr}$  wealth balance from transaction costs
$x$  price of an asset
$x_k$  price at time $k$
$\tilde{x}_k$  $= (1 + \rho)^{-k} x_k$
$x_s$  strike price of an option
$x_{med}$  median
$x^*$  most probable value
$x_{max}$  maximum of $x$ in the series $x_1, x_2, \ldots, x_N$
$\delta x_k$  variation of $x$ between time $k$ and $k + 1$
$\delta x_k^i$  variation of the price of asset $i$ between time $k$ and $k + 1$
$y_k$  $\log(x_k/x_0) - k \log(1 + \rho)$
$\mathcal{Y}(x)$  payoff function, e.g. $\mathcal{Y}(x) = \max(x - x_s, 0)$
$z$  Fourier variable
$Z$  normalization
$Z(u)$  persistence function
$\alpha$  exponential decay parameter: $P(x) \sim \exp(-\alpha x)$
$\beta$  asymmetry parameter,
     or normalized covariance between an asset and the market portfolio
$\gamma_k$  scale factor of a distribution (potentially $k$ dependent)
$\Gamma$  derivative of $\Delta$ with respect to the underlying: $\partial\Delta/\partial x_0$
$\delta_{ij}$  Kronecker delta: $\delta_{ij} = 1$ if $i = j$, $0$ otherwise
$\delta(x)$  Dirac delta function
$\Delta$  derivative of the option premium with respect to the underlying price, $\Delta = \partial\mathcal{C}/\partial x_0$; it is the optimal hedge $\phi^*$ in the Black-Scholes model
$\Delta(\theta)$  RMS static deformation of the yield curve
$\zeta$  Lagrange multiplier
$\eta_k$  return between $k$ and $k + 1$: $x_{k+1} - x_k = \eta_k x_k$
$\theta$  maturity of a bond or a forward rate, always a time difference
$\Theta$  derivative of the option price with respect to time
$\Theta(x)$  Heaviside step function
$\kappa$  kurtosis: $\kappa \equiv \lambda_4$
$\kappa_{eff}$  'effective' kurtosis
$\kappa_{imp}$  'implied' kurtosis
$\kappa_N$  kurtosis at scale $N$
$\lambda$  eigenvalue,
     or dimensionless parameter, modifying (for example) the price of an option by a fraction $\lambda$ of the residual risk
$\lambda_n$  normalized cumulants: $\lambda_n = c_n/\sigma^n$
$\Lambda$  loss level; $\Lambda_{VaR}$ loss level (or value-at-risk) associated with a given probability $\mathcal{P}_{VaR}$
$\mu$  exponent of a power law, or a Lévy distribution
$\pi_{ij}$  'product' variable of fluctuations $\delta x_i \delta x_j$
$\rho$  interest rate on a unit time interval $\tau$
$\rho(\lambda)$  density of eigenvalues of a large matrix
$\sigma$  volatility
$\sigma_1$  volatility on a unit time step: $\sigma_1 = \sigma\sqrt{\tau}$
$\Sigma$  'implied' volatility
$\tau$  elementary time step
$\phi_k^N$  quantity of underlying in a portfolio at time $k$, for an option with maturity $N$
$\phi_k^{N*}$  optimal hedge ratio
$\phi_M^*$  hedge ratio used by the market
$\psi_k^N$  hedge ratio corrected for interest rates: $\psi_k^N = (1 + \rho)^{N-k-1} \phi_k^N$
$\xi$  random variable of unit variance
$\equiv$  equals by definition
$\simeq$  is approximately equal to
$\propto$  is proportional to
$\sim$  is on the order of, or tends to asymptotically
$\mathrm{erfc}(x)$  complementary error function
$\log(x)$  natural logarithm
$\Gamma(x)$  gamma function: $\Gamma(n + 1) = n!$
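
A few entries in this list are defined by closed formulas: the cumulative normal distribution, erfc(u/sqrt(2))/2; the payoff function, max(x - x_s, 0); and the gamma function, with Gamma(n + 1) = n!. The following is a minimal Python sketch of these three definitions, using only the standard library. The function names `gaussian_tail` and `call_payoff` are hypothetical and chosen for illustration; they do not come from the book.

```python
import math


def gaussian_tail(u: float) -> float:
    """Cumulative normal distribution erfc(u / sqrt(2)) / 2.

    This is the probability that a Gaussian variable of zero mean
    and unit variance exceeds u.
    """
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def call_payoff(x: float, x_s: float) -> float:
    """Payoff max(x - x_s, 0) of a call option with strike price x_s."""
    return max(x - x_s, 0.0)


if __name__ == "__main__":
    # Half of the Gaussian weight lies above zero.
    print(gaussian_tail(0.0))
    # A call struck at 100 pays 10 if the underlying ends at 110, nothing at 90.
    print(call_payoff(110.0, 100.0), call_payoff(90.0, 100.0))
    # Gamma(n + 1) = n! for a non-negative integer n.
    print(math.gamma(5), math.factorial(4))
```

The tail probability is written with `math.erfc` rather than as one minus a cumulative function, since subtracting from one loses precision for large u, which is the regime of interest when estimating extreme risks.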
Index