
Random Number Generators and

Monte Carlo Integration

Xander van den Eelaart


Lecturer Bas Fagginger Auer

First Report Laboratory Class Scientific Computing

Utrecht University
Department of Mathematics
November 14, 2011

Contents

1 Introduction
2 Random Number Generators
  2.1 Schrage's Algorithm and Controlled Overflow
  2.2 Implementation of Linear Congruential Generators
  2.3 Distribution of the Random Numbers
    2.3.1 The Inversion Method
    2.3.2 The Rejection Method
    2.3.3 Non-Uniform Distributions Tested
  2.4 Testing Random Number Generators
    2.4.1 Random Walk
    2.4.2 Multidimensional Structure
    2.4.3 Bit Correlation
    2.4.4 Chi-square Test
  2.5 Conclusion
3 Monte Carlo Integration
  3.1 One-Dimensional Monte Carlo Integration
    3.1.1 Hit-or-miss on [0, 1] × [0, 1]
    3.1.2 Simple Sampling
    3.1.3 Area of a Circle
  3.2 Multi-dimensional Monte Carlo Integration
    3.2.1 Multi-dimensional Hit-or-miss
    3.2.2 Discrepancy Sampling
    3.2.3 Volume of a d-dimensional Unit Sphere
  3.3 Conclusion
References

1 Introduction

Random numbers are the basis of various computational problems, amongst which
is Monte Carlo integration. What are random numbers and how do we obtain
them? These questions have a philosophical nature, because true randomness is
rare. Rather, scientists use the notion of pseudo-random numbers, which are the
topic of research in the first part of this report. It is not possible for a computer
to produce truly random numbers, because a computer is deterministic. However,
there exist many algorithms for generating pseudo-random numbers, and these
appear to be quite good. What is good in this sense? In [2] a practical approach
is taken: a random number generator (RNG) is good when it is effective. An RNG
is in turn effective when it produces statistically the same results in an application
as a true RNG. This can be determined by theory, or by comparing the RNG to a
true RNG, one that is based on a truly random process.
This report will first introduce a class of RNGs called linear congruential
random number generators. Some of these are good and some are efficient in terms
of computation time. The quality demands on RNGs are discussed and a set of
linear congruential generators will be subjected to tests of their randomness. In
the second part of this report, we will use such RNGs to perform Monte Carlo
integration. This technique is based on the observation that the area under a graph
is proportional to the probability that a random point falls inside this area. Monte
Carlo integration will be compared to other numerical methods and to analytical
integration.

2 Random Number Generators

As mentioned in the introduction, although computers are completely deterministic, they are capable of producing sequences of numbers that appear random. As
the name gives away, linear congruential generators are sequential RNGs. They
produce sequences of random numbers in the sense that the next random number
is completely determined by the previous one. Linear congruential generators are
based on the following two formulas:
f(x) = (ax + c) mod m    (1)

x_{i+1} = f(x_i)    (2)

Here, x is the random number, a is called the multiplier, c the increment, and m
is the modulus. An RNG of this type can produce at most m different values,
because of the modulo operation. Therefore, m is also referred to as the maximum.
Another important property of an RNG is the period: after producing a certain
number n of different values, the RNG returns the value of x_0 again, which then
leads to the same sequence x_0 to x_{n−1}. The period is bounded by the maximum
m of the generator. A last notion is the seed of the generator: the starting value
x_0, which must in some way be provided to the RNG.
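As an illustration (a sketch in Python, not the code used for this report), Equations (1) and (2) can be implemented in a few lines. The parameter values below are the Park-Miller ones listed later in Table 1.

```python
class LCG:
    """Linear congruential generator: x_{i+1} = (a*x_i + c) mod m."""

    def __init__(self, a, c, m, seed):
        self.a, self.c, self.m = a, c, m
        self.x = seed  # the seed is the starting value x_0

    def next(self):
        """Return the next raw random number in [0, m)."""
        self.x = (self.a * self.x + self.c) % self.m
        return self.x

    def next_uniform(self):
        """Return the next number rescaled to [0, 1)."""
        return self.next() / self.m

# Park-Miller parameters: a = 16807, c = 0, m = 2^31 - 1
rng = LCG(16807, 0, 2**31 - 1, seed=1)
print(rng.next())  # 16807
```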

2.1 Schrage's Algorithm and Controlled Overflow

The task is now to find suitable values for a, c and m. The modulus m should
in general be quite large, for two reasons. First, we want the period to be large,
because repeated sequences of numbers are not random; as the period is bounded
by m, m should be large. Secondly, we want the random numbers to be able to
take many different values, because in many cases we want to approach a
continuous random variable. Computation time, however, puts a constraint on how
large m and the other parameters can be. If m is 2^32 − 1, it fits in a 32-bit
integer, but the part ax + c can take much larger values than 2^32 and, as a
result, the complete computation cannot be performed in 32-bit arithmetic.
This problem can be solved in some cases by using Schrage's algorithm, which
provides a way of performing the operation ax mod m without exceeding m:

ax mod m = a(x mod q) − r⌊x/q⌋    (3)

If this is negative, m should be added. This holds only for r < q, where q and r
are given by:

q = ⌊m/a⌋,    r = m mod a
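A sketch of Equation (3) in code (an illustration only: Python integers do not actually overflow, so this merely demonstrates the bookkeeping that makes ax mod m computable in 32-bit arithmetic):

```python
def schrage_mul(a, x, m):
    """Compute (a * x) % m without forming the full product a*x.
    Valid when r = m % a satisfies r < q = m // a."""
    q = m // a
    r = m % a
    assert r < q, "Schrage's algorithm requires r < q"
    t = a * (x % q) - r * (x // q)  # both terms stay below m
    return t if t >= 0 else t + m   # add m if the result is negative

# Park-Miller: a = 16807, m = 2^31 - 1, so q = 127773 and r = 2836
m = 2**31 - 1
assert m // 16807 == 127773 and m % 16807 == 2836
x = 123456789
print(schrage_mul(16807, x, m) == (16807 * x) % m)  # True
```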
Another trick that is used is called controlled overflow. If m is 2^32, then in
32-bit arithmetic the modulo operation occurs automatically, because the computer
discards the most significant bits so that ax + c fits into 32 bits. Because
controlled overflow is much faster than a modulo operation, a popular choice for
m used to be 2^32.

2.2 Implementation of Linear Congruential Generators

We will use several different implementations of linear congruential RNGs, which
will be compared to each other in this report. The parameters for the chosen
generators are listed in Table 1.
Number  Name         a           c           m
0       RANDU        65539       0           2^31
1       Quick        1664525     1013904223  2^32
2       Park-Miller  16807       0           2^31 − 1
3       Noname       65539       0           2^31 − 1
4       Bad          5           0           2^7
8       Sun          1103515245  12345       2^31
9       GSL          1103515245  12345       2^31
10      Standard     ?           ?           2^31 − 1

Table 1. Parameter values for the linear congruential generators used.


The RANDU generator was widely used until it was discovered that it has bad
randomness, as mentioned in [1]. The Quick generator has modulus 2^32 and
earned its name because the modulo operation takes no time, as explained
in the previous section. The Park-Miller generator has been suggested by [4] as
a minimal standard against which the randomness of other generators should be
compared, because of its good randomness. It is also mentioned that Park-Miller
has full period, which is a good property. The Noname and Bad generators have
uncommon choices for their parameters, but will serve as interesting subjects of our
tests. The GSL generator is taken from the rand() functions of the GNU Scientific
Library, and has been used as the standard generator on Unix for many years. As
we will show, GSL has correlations in the lower bits of its random numbers, which
is bad for randomness. Sun circumvented this problem by cutting off the lowest
16 bits of the GSL random numbers. As a result, Sun and GSL have the same
parameters, but deliver different random numbers. Lastly, the Standard generator
uses the system's built-in rand() function. It has functions to set the seed, generate
the next random number, and obtain the maximum, but does not have functions
to obtain the parameters a and c.
Schrage's trick cannot be applied to the GSL, Sun and Quick generators because
they have c ≠ 0. Of the others, only Park-Miller and Bad have r < q. The Bad
generator has m = 2^7 and, as m² < 2^32, there is no need for Schrage's trick:
the calculation can be performed in 32 bits. Consequently, the Park-Miller generator
is the only generator for which I applied Schrage's algorithm. It has q = 127773
and r = 2836.

2.3 Distribution of the Random Numbers

Random numbers have different applications, which need random numbers in
different ranges. We do not want a different RNG for every application; rather,
we would like an RNG that produces random numbers that can be transformed to
any range. Therefore, it is easiest to have our RNG generate random numbers with
a uniform distribution on [0, 1]. This means that we want a continuous random
variable with a probability density function (pdf) f(x) that is 1 for 0 ≤ x ≤ 1 and
0 otherwise. We will see two ways in which this distribution can be transformed to
any other required distribution.
The values a, c and m of linear congruential generators should be chosen in such
a way that the random numbers between 0 and m are distributed evenly. This
is for example the case when the RNG has full period, a period of m. Such an RNG
returns every value between 0 and m − 1 exactly once while generating m random
numbers. If we divide the numbers of such an RNG by m, we get evenly distributed
values between 0 and 1. In this way, the RNG generates random numbers that
appear to come from a uniform distribution.
We will next discuss two techniques for generating random numbers from a
non-uniform distribution: the inversion method and the rejection method.
2.3.1 The Inversion Method

The inversion method is based on a property of the inverse of the cumulative
distribution function (cdf) of a random variable. The inverse F^{−1} of a cdf F is
given by:

F^{−1}(q) = inf{u : F(u) ≥ q}    (4)

The values of a cdf range from 0 to 1: first, there exists no such thing as negative
probability, and second, the probability that a random variable takes any value at
all equals 1. Consequently 0 < q < 1. This definition of the inverse is explained in
words as follows: it is the function that gives, for a value q, the lowest value u such
that the cdf at u is greater than or equal to q. For cdfs that are strictly monotone
increasing and continuous, this definition of the inverse just gives the function
F^{−1}(q) that returns the value u such that F(u) = q.
The inversion method is based on the following theorem, which is given, but not
proven, in [1], [2] and [3].

Theorem 1. Suppose U is a random variable which is uniformly distributed on
[0, 1]. Then the random variable X given by X = F^{−1}(U) has the cumulative
distribution function F, where F^{−1} is the inverse of F.

Consequently, when the inverse of the cdf of the required distribution is known,
the RNG can generate numbers from this distribution by simply calculating F^{−1}(x_i)
for each random number x_i from the uniform distribution. However, the inverse is
not always available, in which case the rejection method can work.
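As an illustrative sketch of Theorem 1 (not part of the report's experiments): for the pdf f(x) = 2x on [0, 1], used as the example in Section 2.3.2, the cdf is F(x) = x², so F^{−1}(u) = √u.

```python
import random

def inverse_cdf(u):
    # For the pdf f(x) = 2x on [0, 1], the cdf is F(x) = x^2,
    # so F^{-1}(u) = sqrt(u).
    return u ** 0.5

random.seed(1)
samples = [inverse_cdf(random.random()) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))  # the pdf 2x has mean 2/3, so roughly 0.67
```

Here Python's built-in generator stands in for the uniform RNG; any of the generators of Table 1 would serve the same role.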
2.3.2 The Rejection Method

The rejection method is easiest explained by a simple example. Suppose we want
to obtain random numbers from the distribution f(x) = 2x for 0 ≤ x ≤ 1 and
f(x) = 0 otherwise. Now generate two random numbers x_1 and y_1 from a uniform
distribution. The number x_1 is accepted if 2y_1 ≤ f(x_1); otherwise x_1 is rejected
and two new random numbers are drawn to test. Higher values of x_i are more
likely to be accepted, because the value of f(x_i) is higher, and vice versa. The
function f(x) gives the probability with which x is selected, exactly as we wanted.
Visually, this can be interpreted as generating random points in the rectangle
that stretches from x = 0 to x = 1 and from 2y = 0 to 2y = 2. A point is then
selected if it lies under the graph of f(x) = 2x.
This method can be generalized to any pdf f. We require a pdf g which is
nonzero everywhere f is nonzero, such that there is a c for which, for all x,
we have

cg(x) ≥ f(x)    (5)

In the previous example, g was the uniform distribution and c = 2. Now a random
number x_1 is generated from g, and another random number y_1 from the uniform
distribution on [0, 1]. Then x_1 is accepted if it holds that:

f(x_1) ≥ c y_1 g(x_1)    (6)

This is in fact the same rule as used in the example, with g(x) = 1 and c = 2
plugged in.
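The example above can be sketched in code (an illustration with Python's built-in generator, not the report's implementation):

```python
import random

def rejection_sample(f, c, n):
    """Draw n samples from the pdf f on [0, 1] using the uniform
    envelope g(x) = 1 with a constant c >= max f, per (5) and (6)."""
    samples, rejections = [], 0
    while len(samples) < n:
        x, y = random.random(), random.random()
        if c * y <= f(x):        # accept x with probability f(x)/c
            samples.append(x)
        else:
            rejections += 1
    return samples, rejections

random.seed(1)
samples, rejections = rejection_sample(lambda x: 2 * x, c=2, n=10_000)
# The area under c*g is 2 while the area under f is 1, so on average
# about one rejection is expected per accepted sample.
print(rejections)
```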
2.3.3 Non-Uniform Distributions Tested

Suppose we would like to generate random numbers from the non-uniform
distribution f(x) = 3x² on [a, b]. The easiest method is to take g(x) to be the
uniform distribution on [a, b], so g(x) = 1/(b − a). We have to choose c such that
the product cg is greater than or equal to the maximum of f, which is either f(a)
or f(b). Then c = max{f(x) | x ∈ [a, b]}·(b − a) and cg = max{f(x) | x ∈ [a, b]}.
Now generate two random numbers x_1 and y_1 from a uniform distribution on
[0, 1]. We need x_1 to be from the distribution g, which is easily done by rescaling
x_1 as such: x_1 := a + (b − a)x_1. Then Equation (6) simplifies to x_1 being
accepted if:

f(x_1) ≥ y_1 max{f(x) | x ∈ [a, b]}

I have used the Park-Miller generator to obtain 10 000 000 random numbers from
f(x) = 3x² and counted the number of x_i's that were not accepted, the number of
rejections. For a = −2 and b = 1.5 the number of rejections was 26 923 245. In this
case, we generated 3.7 times more points than were needed. On the interval a = 0
and b = 3, this number is smaller; 20 004 146 rejections were counted.
In order to bring down the number of rejections, we can pick g more carefully. A
closer envelope is the line through the graph at a and b. We write g(x) = dx + e,
where the slope of the line is given by d = (f(b) − f(a))/(b − a), and the intercept
by e = f(a) − d·a.
Now we use Theorem 1 to generate numbers from g. The problem is that
g is not normalized, which matters when applying Theorem 1. Let G be the cdf
of g, and G^{−1} its inverse. The maximum of G is no longer 1, but some value k.
As a result, the domain of G^{−1} is not [0, 1] anymore, but [0, k]. Therefore, when
using Theorem 1, we should not take G^{−1}(x) with x uniform, but G^{−1}(kx). This
scaling constant k is in fact the value of the integral of g over [a, b].
The cdf G(x) of g(x) is found by integrating:

G(x) = ∫_a^x g(t) dt = ∫_a^x (dt + e) dt = [½dt² + et]_a^x
     = ½dx² + ex − ½da² − ea

As mentioned, we also need the integral of g over [a, b] for the scaling constant k.
It is easy to find now:

k = ∫_a^b g(x) dx = ½db² + eb − ½da² − ea

As G(x) is continuous and strictly increasing, the inverse is just given by the
quadratic formula:

G^{−1}(x) = (−e + √(e² + 2d(½da² + ea + x))) / d
For Equation (6), we require x_i to be of distribution g. This is done by taking x_i
from the uniform distribution on [0, 1], and then transforming it by x_i :=
G^{−1}(k x_i). The criterion for x_i to be selected now becomes

f(x_i) ≥ g(x_i) y_i

where y_i is uniform on [0, 1]. We see that c = 1, because we did not normalize g,
with the result that g(x) ≥ f(x) for all x ∈ [a, b].
I applied this procedure to f(x) = 3x² on [−2, 1.5] to obtain 10 000 000 random
numbers using Park-Miller. The number of rejections was 18 852 090, a reduction
of more than 8 million. On the interval [0, 3], there were as few as 5 000 585
rejections. With the help of Figure 1 below, the difference in rejections is easily
explained. The area under the red line represents all the random points (x_i, y_i)
that were generated. The white part are the rejected x_i's, and the blue part the
accepted ones.

Figure 1. In blue the histogram of the accepted x_i's of the rejection method for the
distribution f(x) = 3x². The red line is the height of the histogram of all generated
x_i's.
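The linear-envelope procedure can be sketched in code. The expressions for d, e, k and G^{−1} follow the derivation above; the sample count is much smaller than in the report, so the rejection fraction is only expected to be comparable, not identical, and Python's built-in generator stands in for Park-Miller.

```python
import math, random

def envelope_rejection(f, a, b, n):
    """Rejection sampling of f on [a, b] using the linear envelope
    g(x) = d*x + e through (a, f(a)) and (b, f(b))."""
    d = (f(b) - f(a)) / (b - a)
    e = f(a) - d * a
    # G(x) = 0.5*d*x^2 + e*x - (0.5*d*a^2 + e*a); the scaling constant
    # k is G(b), the integral of g over [a, b].
    k = 0.5 * d * b * b + e * b - (0.5 * d * a * a + e * a)
    def G_inv(x):
        # quadratic formula, taking the root that lies in [a, b]
        return (-e + math.sqrt(e * e + 2 * d * (0.5 * d * a * a + e * a + x))) / d
    accepted, rejections = [], 0
    while len(accepted) < n:
        x = G_inv(k * random.random())   # x drawn from g via Theorem 1
        y = random.random()
        if y * (d * x + e) <= f(x):      # Equation (6) with c = 1
            accepted.append(x)
        else:
            rejections += 1
    return accepted, rejections

random.seed(1)
samples, rejections = envelope_rejection(lambda x: 3 * x * x, 0.0, 3.0, 10_000)
print(rejections / (rejections + len(samples)))  # rejection fraction, near 1/3
```

On [0, 3] the envelope encloses area 40.5 while f encloses 27, so about one third of the points should be rejected, matching the report's 5 000 585 rejections out of roughly 15 million points.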

2.4 Testing Random Number Generators

To determine which generators are good, we have to set criteria. As mentioned
before, a generator should have a large period, because correlation between
repeated sequences is a sign of bad randomness. A small period can be
discovered by plotting random walks. Secondly, we require the numbers not to show
any structure or correlations. For this, we will plot the points in 2 or 3 dimensions
to look for structure, and inspect the bit correlations. Lastly, we require the numbers
to be uniformly distributed; however, they should not be distributed too uniformly.
The chi-square test gives a good criterion for a uniform distribution of random
numbers.
2.4.1 Random Walk

In 2 dimensions, we define our random walk as follows. Take a random number. If
the number is in [0, 0.25], we go left one distance unit. If it is in [0.25, 0.5], we go
up one unit. If it is in [0.5, 0.75], we go right, and if it is in [0.75, 1], we go
down. Then take the next random number and repeat.
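The walk above can be sketched as follows (an illustration using Python's built-in generator as a stand-in for the LCGs under test):

```python
import random

def random_walk(next_uniform, steps):
    """2-D random walk: each uniform number on [0, 1) selects one of
    four directions, one quarter of the interval per direction."""
    x = y = 0
    path = [(0, 0)]
    for _ in range(steps):
        u = next_uniform()
        if u < 0.25:
            x -= 1   # left
        elif u < 0.5:
            y += 1   # up
        elif u < 0.75:
            x += 1   # right
        else:
            y -= 1   # down
        path.append((x, y))
    return path

random.seed(1)
path = random_walk(random.random, 1000)
print(path[-1])  # final position of the walk
```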

Because the direction in which the point moves is determined by a random number,
this is called a random walk. The random walk should not show any structure,
and should not repeat itself. If a part of the walk is repeated, this is likely because
the period is small. On the other hand, it could be that the random numbers
themselves are not repeated, but only the order in which they fall into the set
intervals. This is also a sign of bad randomness, because it means that not all
intervals have an equal chance of occurring when the previous random number is
known.
I have depicted the random walks for the generators of Table 1 in Figure 2 below. We notice that the Noname and Bad generators have repetitions. For
next page. We notice that the Noname and Bad generators have repetitions. For
the Bad generator, the period can be easily seen by inspecting the random number
sequence: every 32nd number is the same. For the Noname generator, the period
is much larger. We also notice that the walks for the GSL and Sun generators are
the same. This is because the 16 most significant bits are the same, and division
by their maximum will put these numbers in the same interval.

Figure 2. Random walks of the RNGs of Table 1. The seed was set to 1, and
10,000,000 random numbers were generated for each generator. Left, from top to
bottom: RANDU, Park-Miller, Bad, GSL. Right: Quick, Noname, Sun, Standard.

2.4.2 Multidimensional Structure

The multidimensional structure can be shown by plotting the points (x_i, x_{i+1}),
(x_{i+o}, x_{i+o+1}), ... for a sequence of random numbers x_i. Here the dimension
d is 2, and d − o is the overlap of the points. This can also be plotted for 3
dimensions. In [1] it is stated that all such points of a linear congruential generator
lie on a number of parallel hyperplanes in k-space. This becomes bad for lower
dimensions and a lower number of hyperplanes, because the correlations between
the random numbers are then higher.
Upon inspection of the two-dimensional plots of points, I have not found any
structure for any generator except the Bad generator. Its two-dimensional plot
is depicted in Figure 3 below. Its points lie on a clear rectangular lattice.

Figure 3. Two-dimensional plot of numbers from the Bad generator with overlap 1.

When inspecting the three-dimensional structure, it is important to look from all
angles. This is seen when considering the plot of the numbers from RANDU in
Figure 4 below: only from certain angles can the three-dimensional structure be
seen clearly. The points of RANDU appear to lie on a number of parallel planes. I
have found this same structure for the Noname generator. For the other generators,
I did not find any structure.
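The planes of RANDU can also be verified algebraically (a well-known identity, not taken from the report): since a = 65539 = 2^16 + 3, c = 0 and m = 2^31, one has a² ≡ 6a − 9 (mod 2^31), so every triple of consecutive outputs satisfies x_{k+2} = 6x_{k+1} − 9x_k (mod 2^31), which confines the triples to a small number of parallel planes in the unit cube. A quick check:

```python
def randu(x):
    """One step of RANDU: x -> 65539 * x mod 2^31."""
    return (65539 * x) % 2**31

# 65539^2 mod 2^31 equals 6*65539 - 9, so every consecutive triple
# satisfies x_{k+2} - 6*x_{k+1} + 9*x_k == 0 (mod 2^31).
x = 1
seq = []
for _ in range(1000):
    x = randu(x)
    seq.append(x)
for k in range(len(seq) - 2):
    assert (seq[k + 2] - 6 * seq[k + 1] + 9 * seq[k]) % 2**31 == 0
print("all triples satisfy the plane relation")
```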

Figure 4. Two sides of the same three-dimensional plot of numbers from the
RANDU generator with overlap 1.

2.4.3 Bit Correlation

In decimal notation, the random numbers of the generators appear to be random
when considering a sequence shorter than the period. In bit notation, however,
many generators show clear correlations in the less significant bits. Therefore, I
have inspected the lower bits of sequences of random numbers from all generators,
and found the following correlations.
The RANDU generator always has a one in the first least significant bit: all its
random numbers are odd. In the second least significant bit, a period of size two
is present, where a 1 in this position in one random number is followed by a 0
in this position in the next. The third least significant bit is always zero. The
fourth least significant bit has period 4, where a sequence of four random numbers
shows first two ones in this position and then two zeros. The fifth least significant
bit has period 8, and a sequence of numbers shows 11110000 in this position.
The Quick generator shows a period two in the first least significant bit, which
goes from 1 to 0. The second bit has period 4 and goes through 0011. The third
bit has period 8 and shows a somewhat more irregular sequence: 10110100.
The Bad generator has ones in the first least significant bit and zeros in the
second. The third has period 2, going through 10. The fourth least significant bit
has period 4 and goes through 1100. The fifth bit was the last one with a
clear correlation: it has period 8 and shows the pattern 11110000 when going
through a sequence of random numbers.
The GSL generator has an interesting correlation structure. I noticed that the j-th
least significant bit has a period of 2^j. I have checked this up to the fifth least
significant bit. The sequences are as follows. First: 01. Second: 1100. Third:
11110000. Fourth: 0001000111101110. Fifth: 00111001110101111100011000101000.
This is the reason why Sun made a generator based on GSL that cuts off the 16
least significant bits of GSL's numbers. As a result, the Sun generator has maximum
2^15, but it does not show correlations in its bits.
In [1], it is stated that for linear congruential generators with a modulus equal
to a power of two, there is a clear maximum for the period of the bits. Specifically,
it states on page 7 that for generators with modulus equal to a power of two it
holds that:

    The period of the lowest order bit is at most 1 (i.e., it is always the
    same); the period of the next lowest order bit is at most 2; the period
    of the next lowest bit is at most 4; and so on.

The generators RANDU, Quick, Bad, and GSL (and Sun) all have a modulus
equal to a power of two. The RANDU and Bad generators satisfy this statement;
GSL and Quick, however, do not: the period of their lowest order bits is double
what is stated in this citation. My intuition is that this is caused by the fact that
[1] forgets to mention that this statement holds only for c = 0. If c ≠ 0, as is the
case for Quick and GSL, the maximum bound on the period is twice as large.
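These bit periods can be measured directly. A sketch (the helper `bit_period` is my own illustration, not code from the report):

```python
def bit_period(a, c, m, bit, seed=1, limit=64):
    """Return the period of the given bit (0 = least significant) in
    the output sequence of an LCG, searching periods up to `limit`."""
    x = seed
    bits = []
    for _ in range(2 * limit):
        x = (a * x + c) % m
        bits.append((x >> bit) & 1)
    for p in range(1, limit + 1):
        if all(bits[i] == bits[i % p] for i in range(len(bits))):
            return p
    return None

# GSL parameters: a = 1103515245, c = 12345, m = 2^31.
# The j-th least significant bit (j = 1, 2, ...) has period 2^j.
for j in range(1, 6):
    print(j, bit_period(1103515245, 12345, 2**31, bit=j - 1))
```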


2.4.4 Chi-square Test

As mentioned before, we would like the random numbers to be evenly distributed,
but not too evenly either. We check the distribution by dividing [0, 1] into M
bins. For each of these bins, we count how many random numbers it contains. In
Figure 5 below, the uniform distribution of the Park-Miller generator is displayed
for two different sequence lengths n and M = 20.

Figure 5. Histograms of the uniformly distributed random numbers from the Park-Miller generator. Left: 1000 numbers. Right: 1000000 numbers.
What is a good criterion for determining whether a distribution is good or bad,
and how is the dependence on n and the number of bins M incorporated? For this,
we will consider the χ²-distribution of the deviation V of the bin counts. This
requires many different V's for an accurate representation. V is given by

V = Σ_{j=1}^{M} (y_j − E_j)² / E_j    (7)

where y_j is the observed count in bin j and E_j is the expected bin count, which
equals n/M. Now we take many V's, making sure to use a different sequence of
random numbers for each V. These V's have a χ²-distribution with M − 1 degrees
of freedom. This distribution is widely used in statistics to provide cut-off values
for a certain confidence interval. We take a two-sided confidence interval of 95
percent, which means that we have an x_1 and x_2 between which lies 95 percent of
the area of the χ²-distribution.
I have tested all generators with 10000 different V's. Each V divides another
sequence of 1000 random numbers over M = 20 bins. I have counted the fraction
of V's lying outside the confidence interval [x_1, x_2]. These fractions are shown
in Table 2 below.

Generator    Fraction of cut-off V's
RANDU        0.0488
Quick        0.0505
Park-Miller  0.0514
Noname       0.0591
Bad          1
Sun          0.0534
GSL          0.0498
Standard     0.0490

Table 2. Fractions of the 10000 V's outside of the 95% confidence interval of the
χ²-distribution for M = 20. For each V, a sequence of 1000 random numbers was
taken.
These fractions should be close to 0.05, or 5%, for the distribution of V's to be
similar to a χ²-distribution. It should be noted that this test depends on the
choice of the number of V's, the number of bins M, the number of random numbers
per V, and the confidence interval. We could run another test with other values,
but these are not necessarily better if the total number of random numbers used
does not increase. What is remarkable in Table 2 is the difference between Sun and
GSL, because their numbers should all be quite close together; apparently the
choice M = 20 brings out this small difference. For the Bad generator, every V was
cut off, which is caused by the small period: on the one hand, it does not allow the
bins to smooth out for larger n, and on the other hand, it means that all V's are
(almost) equal.
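A sketch of this test in code. The cut-off values x1 ≈ 8.91 and x2 ≈ 32.85 are standard χ² table values for 19 degrees of freedom (an assumption of mine, not quoted from the report), Python's built-in generator stands in for the generators under test, and the trial count is reduced for speed:

```python
import random

def chi_square_V(numbers, M=20):
    """Deviation V of Equation (7) for a sequence of uniforms on [0, 1)."""
    counts = [0] * M
    for u in numbers:
        counts[min(int(u * M), M - 1)] += 1
    E = len(numbers) / M                      # expected count per bin
    return sum((y - E) ** 2 / E for y in counts)

# Two-sided 95% cut-offs of the chi-square distribution with
# M - 1 = 19 degrees of freedom (approximate table values):
x1, x2 = 8.91, 32.85

random.seed(1)
trials = 1000
outside = 0
for _ in range(trials):
    V = chi_square_V([random.random() for _ in range(1000)])
    if V < x1 or V > x2:
        outside += 1
print(outside / trials)  # should be close to 0.05 for a good generator
```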

2.5 Conclusion

The following generators have shown a lack of randomness. The generators Noname
and Bad have a small period, as shown by the random walk. The generators RANDU,
Noname, and Bad show correlations between points in multi-dimensional space.
The generators RANDU, Quick, Noname, Bad, and GSL showed correlations
between bits of sequential random numbers. The Bad generator showed a clear
deviation from the χ²-distribution.
This leaves the Park-Miller, Sun, and Standard generators unblemished.
According to [4], the Park-Miller generator has full period and thus produces
2^31 − 2 different points. The Sun generator has a maximum of 2^15 different
numbers, a factor 2^16 less than Park-Miller. For the Standard generator, however,
we do not know whether it has full period. For this reason, I have chosen the
Park-Miller generator for performing Monte Carlo integration.

3 Monte Carlo Integration

We can now use our RNGs from Section 2 for an application: Monte Carlo
integration. This is the statistical approximation of an integral of a function over a
certain domain. The simplest Monte Carlo method is the hit-or-miss integration of
a one-dimensional function. This is in fact much like the rejection method for
generating a non-uniform distribution. We sample points in an area whose size is
known and which contains the domain and codomain of the function we want to
integrate. Then for each point it is determined whether it lies under the graph
of the function. The fraction of such points, times the area in which the points
were sampled, gives an estimate of the integral.
The main advantage of Monte Carlo methods for integration is their relative
simplicity in multiple dimensions. They can be used to estimate integrals over
complicated or irregularly shaped volumes, as long as a simple second volume can
be found that encloses the first. A second advantage is the good convergence of the
error of Monte Carlo integration.
Here we will use Monte Carlo methods to estimate the integral, or the volume,
of a d-dimensional unit sphere. We will start off with the estimation for d = 2,
which is a circle.

3.1 One-Dimensional Monte Carlo Integration

3.1.1 Hit-or-miss on [0, 1] × [0, 1]

The area of a two-dimensional unit circle centered at the origin equals four times
the area of this circle in the upper right quadrant (x, y ≥ 0). Consequently, we can
use two uniformly distributed variables on [0, 1] for the hit-or-miss integration of
a function f(x). Generate two such random numbers x_1 and y_1. The probability
that the point (x_1, y_1) lies under the graph of the function f(x) equals the area
under f(x) on this domain and codomain. For n such random points, call n_a the
number of points under the graph of f(x). The ratio n_a/n is an estimate of the
area under f(x). For n → ∞, this estimate becomes exact. We would like to know
how the error decreases as we increase n.
As argued by [3], n_a has a binomial distribution, from which the expected value
of the (absolute) error is derived to be:

E(|n_a/n − E(f)|) ≈ √(E(f)(1 − E(f))/n)    (8)

Here E(|n_a/n − E(f)|) is the expected value of the absolute error, the difference
between the estimate and the actual value of the integral, and E(f) is the expected
value of f. On the domain [a, b], this expected value, the average of f, can be used
to calculate the integral as the product of width times average height: it is
(b − a)E(f). Here b = 1 and a = 0, so E(f) equals the integral.
In general, the expected value of f is unknown; otherwise we would not be
performing the Monte Carlo integration. It is, however, only necessary to note
that the error decreases by the order O(n^{−1/2}): increasing the number of random
points by a factor 4 leads to a reduction of the error by a factor 2.
3.1.2 Simple Sampling

We discussed in the previous section that the integral of f(x) on [a, b] equals
(b − a)E(f). Why not estimate E(f) directly? This is what the simple sampling
Monte Carlo method does. We need n random numbers x_i from a uniform
distribution on [a, b], and we just average their values of f:

E(f) ≈ (1/n) Σ_{i=1}^{n} f(x_i)    (9)

Equality holds for n → ∞; otherwise this is just another estimate. The expected
value of the error of the simple sampling method is proven by [3] to be:

E(|(1/n) Σ_{i=1}^{n} f(x_i) − E(f)|) ≈ √(Var(f)/n)    (10)

Again, we do not know Var(f), but we note that the error again decreases by
order O(n^{−1/2}).
3.1.3 Area of a Circle

Take the function f(x) = √(1 − x²) on [0, 1]. It is the quarter of the unit circle
in the upper right quadrant. The integral I of this function is given by:

I = ∫_0^1 f(x) dx = ∫_0^1 √(1 − x²) dx = π/4

I have approximated this integral by calculating n_a/n of the hit-or-miss method,
and E(f) from Equation (9). I have calculated the absolute difference of these
estimates with I for several values of n and plotted them in Figure 6 below. I have
used the Park-Miller generator with seed 1. I have also plotted the error of the
hit-or-miss method given by Equation (8). The error of simple sampling also
decreases with order O(n^{−1/2}); it should therefore be parallel to the error of
hit-or-miss.
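Both estimators can be sketched as follows (using Python's built-in generator rather than Park-Miller, so the numbers will not reproduce Figure 6 exactly):

```python
import math, random

def hit_or_miss(f, n):
    """Hit-or-miss estimate of the integral of f over [0, 1] x [0, 1]:
    the fraction of random points (x, y) that fall under the graph."""
    hits = sum(1 for _ in range(n) if random.random() <= f(random.random()))
    return hits / n

def simple_sampling(f, n):
    """Simple-sampling estimate: the average of f over n uniform points."""
    return sum(f(random.random()) for _ in range(n)) / n

f = lambda x: math.sqrt(1.0 - x * x)   # quarter of the unit circle
exact = math.pi / 4

random.seed(1)
for n in (10**3, 10**5):
    hm = hit_or_miss(f, n)
    ss = simple_sampling(f, n)
    print(n, abs(hm - exact), abs(ss - exact))
```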

Figure 6. Error of one-dimensional MC integration: a log-log plot, base 2, of the
error against the number of points n. Blue plus signs: hit-or-miss error. Red
asterisks: simple sampling error. Blue line: the expected error for hit-or-miss from
Equation (8).
The errors of both hit-or-miss and simple sampling show quite a good reduction of
order O(n^{−1/2}).

3.2 Multi-dimensional Monte Carlo Integration

3.2.1 Multi-dimensional Hit-or-miss

d-dimensional hit-or-miss Monte Carlo integration is a simple generalization of the


one-dimensional case. Instead of generating points in a plane, generate points #xi
(the superscript denotes the ith random point in d + 1 space, and the subscript is
used for vector components) in a d + 1-dimensional volume. This volume should
enclose the graph of the function f to integrate. Then we count the number na of
points #xi that satisfy
f (xi1 , xi2 , .., xid ) xid+1

When f : [0, 1]d [0, 1], we can use uniform distributions of the components of the
points #xi . The total volume of the domain and codomain considered is 1d+1 = 1,
and the volume I under the graph of f is given by the ratio:
I=

na
n

16

(11)

where $n$ is the total number of random points generated. When calculating the volume of the $d$-dimensional unit sphere, it should be noted that it is defined on $[-1,1]^d$. However, it is convenient to generate points from the uniform distribution on $[0,1]$. As a result, we only estimate a fraction of the volume. In the case of the circle, this was a factor $\frac{1}{4}$ of the total area. For a sphere it is a factor $\frac{1}{8}$ of the volume. For a $d$-dimensional unit sphere, we estimate a factor $\frac{1}{2^d}$ of the total volume. Therefore, we multiply Equation (11) by $2^d$ to arrive at the estimate of the total volume $V_d$ of a $d$-dimensional unit sphere:
$$V_d = 2^d \, \frac{n_a}{n} \qquad (12)$$
The function $f$ that is needed to check whether a point is in the sphere or not is given by the Pythagorean formula in $d-1$ dimensions:
$$f(x_1^i, x_2^i, \ldots, x_{d-1}^i) = \sqrt{(x_1^i)^2 + (x_2^i)^2 + \ldots + (x_{d-1}^i)^2} \qquad (13)$$
Thus the criterion becomes that the point $\vec{x}^i$ lies inside the sphere if
$$\sqrt{(x_1^i)^2 + (x_2^i)^2 + \ldots + (x_{d-1}^i)^2} \leq \sqrt{1 - (x_d^i)^2} \qquad (14)$$
which simplifies to the criterion for accepting a point:
$$(x_1^i)^2 + (x_2^i)^2 + \ldots + (x_{d-1}^i)^2 + (x_d^i)^2 \leq 1 \qquad (15)$$

The error of the multi-dimensional hit-or-miss Monte Carlo integration converges similarly to the one-dimensional case, with order $O(n^{-1/2})$. The difference, however, is that $n$ is the number of points in $d$-dimensional space, so in total $n \cdot d$ random numbers are required.
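The procedure above can be sketched in a few lines. This is a minimal illustration, not the report's actual implementation: it uses Python's random module as the uniform source instead of the Park-Miller generator, and the function name is my own.

```python
import random

def hit_or_miss_sphere_volume(d, n, seed=1):
    """Estimate the volume of the d-dimensional unit sphere by
    hit-or-miss on [0, 1]^d, scaled by 2^d as in Equation (12)."""
    rng = random.Random(seed)
    n_a = 0
    for _ in range(n):
        # Accept the point if it satisfies Equation (15):
        # (x_1)^2 + (x_2)^2 + ... + (x_d)^2 <= 1.
        if sum(rng.random() ** 2 for _ in range(d)) <= 1.0:
            n_a += 1
    return 2 ** d * n_a / n

# For d = 2 the exact value is pi; with n = 100000 points the
# estimate should land within a few hundredths of it.
estimate = hit_or_miss_sphere_volume(2, 100_000)
```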
3.2.2 Discrepancy Sampling
In search of higher-order convergence, we arrive at discrepancy sampling, which is a quasi-Monte Carlo method. Instead of randomly generating points in a hypercube, we want to have points distributed in this hypercube as evenly as possible. For a variable number of points $n$, it is not an easy task to divide the points over the hypercube by hand. Instead, we use discrepancy sampling, in which subsequent points divide up the space evenly.
The discrepancy sampling works as follows. To obtain the $j$th point $\vec{x}^j$, the number $j$ is expanded in prime bases. For the $i$th vector component of $\vec{x}^j$, $j$ is expanded in the $i$th prime base $P_i$, where we begin counting at 2: $P_1 = 2$, $P_2 = 3$, $P_3 = 5$, .... The digits $a_{ki} \in \{0, 1, \ldots, P_i - 1\}$ of this expansion satisfy
$$j = a_{1i} + a_{2i} P_i + a_{3i} P_i^2 + a_{4i} P_i^3 + a_{5i} P_i^4 + \ldots \qquad (16)$$
The value of the $i$th coordinate of the $j$th point equals
$$x_i^j = a_{1i} P_i^{-1} + a_{2i} P_i^{-2} + a_{3i} P_i^{-3} + a_{4i} P_i^{-4} + a_{5i} P_i^{-5} + \ldots \qquad (17)$$

The discrepancy sampling is best seen in practice. In Figure 7 below, I have plotted two-dimensional discrepancy-sampled points for $n = 50$ and $n = 500$. It can be seen how the points form a regular structure and spread out over the plane.
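The expansion in Equations (16) and (17) amounts to mirroring the base-$P_i$ digits of $j$ around the radix point. A minimal Python sketch (the function names are my own, not from the course code):

```python
def radical_inverse(j, base):
    """Mirror the base-`base` digits of j around the radix point,
    as in Equations (16) and (17): x = a_1/P + a_2/P^2 + ..."""
    x, scale = 0.0, 1.0 / base
    while j > 0:
        j, digit = divmod(j, base)  # peel off the next digit a_k
        x += digit * scale
        scale /= base
    return x

def discrepancy_point(j, d):
    """The j-th d-dimensional discrepancy-sampled point, one prime
    base per coordinate: P_1 = 2, P_2 = 3, P_3 = 5, ..."""
    primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    return tuple(radical_inverse(j, p) for p in primes[:d])
```

In base 2, for example, successive coordinates come out as 1/2, 1/4, 3/4, 1/8, ...: each new point keeps subdividing the unit interval evenly, which is exactly the behavior visible in Figure 7.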
Figure 7. 2-dimensional points created using discrepancy sampling. Left: $n = 50$ points. Right: $n = 500$ points.
We can now use the points generated by discrepancy sampling to estimate the volume of a $d$-dimensional unit sphere. We simply have to count the number of points $n_a$ that satisfy the criterion in Equation (15). The volume is then given by Equation (12). The error of discrepancy sampling is stated by [3] to decrease with order $O(n^{-1} \log^d(n))$. This is better than the $O(n^{-1/2})$ of hit-or-miss.
3.2.3 Volume of a $d$-dimensional Unit Sphere
I have estimated the volume of a $d$-dimensional unit sphere using the hit-or-miss Monte Carlo method and the discrepancy sampling quasi-Monte Carlo method. I have taken $n = 2^{26}$ points in each space, and calculated the error for every $n$ equal to a power of two less than or equal to $2^{26}$ and greater than $2^6$, by using the analytical value of the volume given by
$$V_d = \frac{\pi^{d/2}}{\Gamma(\frac{d}{2} + 1)} \qquad (18)$$
where $\Gamma$ is the gamma function. The order of convergence of the error was then plotted using the third $n$-error point as a starting point. For the discrepancy sampling, I have plotted both the order $O(n^{-1} \log^d(n))$ convergence and its approximation for small $d$: $O(n^{-1})$. These points and lines are depicted in Figure 8 below for $d = 2, 3, 6$ and $12$.
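Equation (18) is straightforward to evaluate with a standard gamma function; a small sketch of how such reference values can be computed (the function name is my own):

```python
import math

def unit_sphere_volume(d):
    """Exact volume of the d-dimensional unit sphere, Equation (18):
    V_d = pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

# Sanity checks: V_2 = pi and V_3 = 4*pi/3; note that the volume
# peaks around d = 5 and then shrinks toward zero as d grows.
```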


Figure 8. Four log-log plots, base 10, of the error of estimating the volume of a $d$-dimensional unit sphere against $n$. Blue dots: hit-or-miss error. Green dots: discrepancy sampling error. Blue dotted line: expected error for hit-or-miss with order $O(n^{-1/2})$. Green dotted line: expected error for discrepancy sampling with order $O(n^{-1} \log^d(n))$. Green solid line: expected error for discrepancy sampling with order $O(n^{-1})$.
1

Observing Figure 8, the order O(n 2 ) of the convergence of the error of the hit-ormiss estimation shows quite well. The order O(n1 log d (n)) for discrepancy sampling
shows up less well. It seems for d = 6 and d = 12 that the error converges more
1
like O(n 2 ). For d = 2 and d = 3, the convergence of the error for discrepancy
sampling does seem faster than for the hit-or-miss method.

3.3 Conclusion

The expected convergence of the error of hit-or-miss Monte Carlo integration with order $O(n^{-1/2})$ shows up. Referring back to the introduction of this report, this shows that our RNG, the Park-Miller generator, is a good RNG, because it works. The expected convergence of the error of simple sampling Monte Carlo integration of order $O(n^{-1/2})$ also shows up. For discrepancy sampling Monte Carlo integration, however, the error seems to converge more slowly than its expected order of $O(n^{-1} \log^d(n))$. I remain inconclusive about whether discrepancy sampling or hit-or-miss Monte Carlo integration is better, although [1], [2], and [3] predict a faster convergence of the error.

References

1. J. Gentle, Random Number Generation and Monte Carlo Methods, Springer-Verlag, New York, 1998.
2. W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes in C: The Art of Scientific Computing, second ed., Cambridge University Press, 1992.
3. B. Fagginger Auer, A. Yzelman, A. van Dam, and A. Swart, Laboratory Class Scientific Computing: Course Book, Utrecht University, 2011.
4. S. Park and K. Miller, Random Number Generators: Good ones are hard to find, Communications of the ACM 31 (1988), no. 10, 1192-1201.
