Statistical and Numerical Methods using C++
Unit 2
Sikkim Manipal University
Unit 2 Random Variables
Structure
2.1 Introduction
Objectives
2.2 One-dimensional random variable
2.3 Discrete and continuous random variable
2.4 Mathematical expectation and variance
2.5 Two-dimensional random variable
2.6 Marginal and conditional probability distribution
2.7 Correlation coefficient
2.8 Covariance
2.9 Summary
2.10 Self-Assessment exercise
2.1 Introduction
This unit is devoted to random variables. The outcome of an experiment
need not be a number; for example, the outcome when a coin is tossed can
be 'heads' or 'tails'. However, we often want to represent outcomes as
numbers. A random variable is a function that associates a unique
numerical value with every outcome of an experiment. The value of the
random variable will vary from trial to trial as the experiment is repeated.
There are two types of random variable: discrete and continuous.
A random variable has either an associated probability distribution (discrete
random variable) or a probability density function (continuous random variable).
Examples: A coin is tossed ten times. The random variable X is the number
of tails that are noted. X can only take the values 0, 1, ..., 10, so X is a
discrete random variable. A light bulb is burned until it burns out. The
random variable Y is its lifetime in hours. Y can take any positive real value,
so Y is a continuous random variable.
Random variables are assumed to have the following properties:
1. real constants are random variables;
2. the sum of two random variables is a random variable;
3. the product of two random variables is a random variable;
4. addition and multiplication of random variables are both commutative.
One-dimensional and two-dimensional random variables are presented in
Sections 2.2 and 2.5 respectively. Section 2.3 covers discrete and
continuous random variables. Mathematical expectation and variance are
given in Section 2.4. Section 2.6 deals with marginal and conditional
probability distributions. Sections 2.7 and 2.8 cover the correlation
coefficient and covariance.
Objectives:
At the end of this unit the student should be able to:
• Describe the outcome of an experiment in terms of numerical values.
• Study the correlation of two variables.
2.2 One-dimensional random variable
A function defined on a sample space is called a random variable.
A variable which takes a definite set of values, with a definite probability
associated with each value, is called a random variable.
Consider an experiment and a sample space S associated with it.
A function X assigning to every element $s \in S$ a real number
X(s) is called a random variable.
For example, suppose that we toss two coins and consider the sample
space associated with this experiment, that is, S = {HH, HT, TH, TT}. Define
the random variable X as follows:
X is the number of heads obtained in the tosses. Hence X(HH) = 2,
X(HT) = X(TH) = 1, X(TT) = 0.
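The two-coin example can be sketched in C++ as a function from outcomes to numbers. The names `count_heads` and `two_coin_sample_space` are illustrative choices, not part of the unit, and outcomes are assumed to be encoded as strings of the characters 'H' and 'T'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Random variable X: maps an outcome such as "HT" to the number of heads in it.
// Assumes every character of the outcome is 'H' or 'T'.
int count_heads(const std::string& outcome) {
    int heads = 0;
    for (char c : outcome) {
        assert(c == 'H' || c == 'T');
        if (c == 'H') ++heads;
    }
    return heads;
}

// Sample space S for two tosses of a coin.
std::vector<std::string> two_coin_sample_space() {
    return {"HH", "HT", "TH", "TT"};
}
```

Applying `count_heads` to each element of `two_coin_sample_space()` gives the values 2, 1, 1, 0 listed above.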
2.3 Discrete and continuous random variable
Discrete Random Variable
Let X be a random variable. If the number of possible values of X is finite or
countably infinite, we call X a discrete random variable. That is, the possible
values of X may be listed as $x_1, x_2, x_3, \ldots, x_n, \ldots$
In the finite case the list terminates, and in the countably infinite case the
list continues indefinitely.
Suppose we collect data on the number of male members in the families of
a certain area of a city. The number of male members of each family will be
a whole number, e.g. 5, 2, 4, 15, and there would be no family where the
number of male members is 2.37, 1.78 or 5.65. The variable, which is the
number of male members of a family, is in this case called a discrete variable.
Example 2.1
A radioactive source is emitting particles. The emission of these particles
is observed on a counting device during a specified period of time. The
following random variable is of interest:
X=number of particles observed.
What are the possible values of X?
Solution
We shall assume that these values consist of all nonnegative integers. That
is, $R_X = \{0, 1, 2, 3, \ldots, n, \ldots\}$
An objection which we confronted once before may again be raised at this
point. It could be argued that during a specified time interval it is impossible
to observe more than, say, N particles, where N may be a very large positive
integer. Hence the possible values for X should really be $0, 1, 2, 3, \ldots, N$.
However, it turns out to be mathematically simpler to consider the idealized
description given above. In fact, whenever we assume that the possible
values of a random variable X are countably infinite, we are considering
such an idealized representation of X.
Probability Distribution and probability mass function
The probability distribution of a random variable X is a description of the
probabilities associated with the possible values of X. For a discrete random
variable, the distribution is often specified by just a list of the possible values
along with the probability of each.
For a discrete random variable X with possible values $x_1, x_2, \ldots, x_n$, a
probability mass function is a function such that
(1) $f(x_i) \ge 0$
(2) $\sum_{i=1}^{n} f(x_i) = 1$
(3) $f(x_i) = P(X = x_i)$
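Conditions (1) and (2) can be checked mechanically for a listed distribution. The following is a minimal sketch: the function name `is_valid_pmf` and the tolerance `tol` are assumptions introduced here to absorb floating-point rounding.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns true when the listed probabilities f(x_1), ..., f(x_n) satisfy
// conditions (1) and (2): each is nonnegative and they sum to 1 (within tol).
bool is_valid_pmf(const std::vector<double>& f, double tol = 1e-9) {
    double total = 0.0;
    for (double p : f) {
        if (p < 0.0) return false;  // violates condition (1)
        total += p;
    }
    return std::fabs(total - 1.0) <= tol;  // condition (2)
}
```

For instance, the probabilities 0.25, 0.5, 0.25 of obtaining 0, 1, 2 heads in two tosses of a fair coin pass this check.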
Continuous Random Variable
Suppose that the range space of X is made up of a very large finite number
of values, say all values x in the interval $0 \le x \le 1$ of the form
0, 0.01, 0.02, ..., 0.98, 0.99, 1.00. With each of these values is associated
a nonnegative number $p(x_i) = P(X = x_i)$, i = 1, 2, ..., whose sum equals 1.
It might be mathematically easier to idealize the above probabilistic
description of X by supposing that X can assume all possible values
$0 \le x \le 1$. If we do this, the possible values of X are noncountable, so
we cannot really speak of the ith value of X, and hence $p(x_i)$ becomes
meaningless. Instead, we replace the function p, defined only for
$x_1, x_2, \ldots$, by a function f defined for all values of x, $0 \le x \le 1$.
Definition
X is said to be a continuous random variable if there exists a function f,
called the probability density function of X, satisfying the following
conditions:
(a) $f(x) \ge 0$ for all x
(b) $\int_{-\infty}^{+\infty} f(x)\,dx = 1$
(c) for any a, b with $-\infty < a < b < +\infty$, we have
$P(a \le X \le b) = \int_a^b f(x)\,dx$
Example 2.2
Let X be the life length of a certain type of light bulb. Assuming X to be a
continuous random variable, we suppose that the pdf f of X is given by
$$f(x) = \begin{cases} a/x^3, & 1500 \le x \le 2500 \\ 0, & \text{elsewhere} \end{cases}$$
That is, we are assigning probability zero to the events {X < 1500}
and {X > 2500}. To evaluate the constant a, we invoke the condition
$\int_{-\infty}^{+\infty} f(x)\,dx = 1$,
which in this case becomes $\int_{1500}^{2500} a/x^3\,dx = 1$.
From this we obtain a = 7,031,250.
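The value of a can be cross-checked numerically. The sketch below is one possible check, not the method used in the text: it computes a from the antiderivative of $x^{-3}$ and then confirms with composite Simpson's rule that the resulting density integrates to 1. The function names and the number of subintervals are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>

// Composite Simpson's rule for the integral of g over [lo, hi]; n must be even.
template <typename G>
double simpson(G g, double lo, double hi, int n) {
    assert(n > 0 && n % 2 == 0);
    const double h = (hi - lo) / n;
    double sum = g(lo) + g(hi);
    for (int i = 1; i < n; ++i) {
        sum += (i % 2 == 1 ? 4.0 : 2.0) * g(lo + i * h);
    }
    return sum * h / 3.0;
}

// Normalizing constant a for f(x) = a / x^3 on [lo, hi]:
// the integral of x^-3 over [lo, hi] is (1/lo^2 - 1/hi^2) / 2,
// so a is the reciprocal of that value.
double normalizing_constant(double lo, double hi) {
    return 2.0 / (1.0 / (lo * lo) - 1.0 / (hi * hi));
}
```

With `lo = 1500` and `hi = 2500`, `normalizing_constant` reproduces a = 7,031,250, and integrating `a / (x * x * x)` with `simpson` over the same interval returns a value very close to 1.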
Probability distribution and probability density function
Density functions are commonly used to describe physical systems.
Probability density function f(x) can be used to describe the probability
distribution of a continuous random variable X. If an interval is likely to
contain a value for X, its probability is large and it corresponds to large
values for f(x). The probability that X is between a and b is determined as
the integral of f(x) from a to b.
For a continuous random variable X, a probability density function is a
function such that
(1) $f(x) \ge 0$
(2) $\int_{-\infty}^{+\infty} f(x)\,dx = 1$
(3) $P(a \le X \le b) = \int_a^b f(x)\,dx$ = area under f(x) from a to b, for any a and b.
2.4 Mathematical expectation and variance
Mathematical expectation
Let x be a random variable which can take the values $x_1, x_2, x_3, \ldots, x_n$
with corresponding probabilities $p(x_1), p(x_2), p(x_3), \ldots, p(x_n)$. Then the
expectation of x is defined by
$$E(x) = \sum_{r=1}^{n} x_r\, p(x_r)$$
If the series $\sum x_r\, p(x_r)$ is absolutely convergent, then x is said to have a
finite expectation, and if $\sum |x_r|\, p(x_r)$ is divergent, then x is said to have
no finite expectation.
Note
(1) The expectation of a sum of two random variables is equal to the sum of
their expectations.
Symbolically, E(x + y) = E(x) + E(y)
(2) The expectation of a product of two independent random variables is
equal to the product of their expectations.
Symbolically, E(xy) = E(x) E(y)
Example 2.3
What is the expected value of the number of points that will be obtained in a
single throw with an ordinary die?
Solution
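The values of the variate, i.e. the numbers which can appear on the uppermost
face of the die, are 1, 2, 3, 4, 5, 6, and each has probability 1/6.
$$E(x) = \sum p_i x_i = \tfrac{1}{6} \cdot 1 + \tfrac{1}{6} \cdot 2 + \tfrac{1}{6} \cdot 3 + \tfrac{1}{6} \cdot 4 + \tfrac{1}{6} \cdot 5 + \tfrac{1}{6} \cdot 6$$
$$= \tfrac{1}{6}\,[1 + 2 + 3 + 4 + 5 + 6]$$
$$= \tfrac{7}{2}$$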
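The defining sum for E(x) translates directly into a loop. This is a minimal sketch: the function name `expectation` and the representation of the distribution as two parallel vectors are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// E(x) = sum over r of x_r * p(x_r) for a discrete random variable given as
// parallel lists of values and probabilities.
double expectation(const std::vector<double>& values,
                   const std::vector<double>& probs) {
    assert(values.size() == probs.size());
    double e = 0.0;
    for (std::size_t r = 0; r < values.size(); ++r) {
        e += values[r] * probs[r];
    }
    return e;
}
```

Called with the six faces of a die and probability 1/6 for each, it returns 3.5 up to rounding, in agreement with Example 2.3.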
Example 2.4
From a bag containing 2 rupee coins and 3 twenty-paise coins, a person is
asked to draw two coins at random. Find the value of his expectation.
Solution
There are three possibilities when drawing two coins at random:
(1) both coins are rupee coins, with probability
${}^{2}C_{2} / {}^{5}C_{2} = \frac{1}{10}$
(2) both coins are twenty-paise coins, with probability
${}^{3}C_{2} / {}^{5}C_{2} = \frac{3 \cdot 2}{5 \cdot 4} = \frac{6}{20} = \frac{3}{10}$
(3) one coin is a rupee coin and the other a twenty-paise coin, with probability
${}^{2}C_{1} \cdot {}^{3}C_{1} / {}^{5}C_{2} = \frac{2 \cdot 6}{5 \cdot 4} = \frac{3}{5}$
Also, 2 rupee coins = 10 twenty-paise coins, and
1 rupee coin and 1 twenty-paise coin = 6 twenty-paise coins.
Required expectation = $\sum p_i x_i$
$= \left(\frac{1}{10} \cdot 10\right) + \left(\frac{3}{10} \cdot 2\right) + \left(\frac{3}{5} \cdot 6\right)$ twenty-paise coins
$= 1 + \frac{3}{5} + \frac{18}{5} = \frac{26}{5}$ twenty-paise coins
$= 20 \cdot \frac{26}{5}$ paise
= 104 paise
Variance
Suppose that for a random variable X we find that E(X) equals 2. It simply
means that if we consider a large number of determinations of X, say
$x_1, x_2, \ldots, x_n$, and average these values of X, this average would be
close to 2 if n is large. For example, suppose that X represents the life length
of light bulbs being received from a manufacturer, and that E(X) = 1000 hours.
This could mean one of several things. It could mean that most of the bulbs
would be expected to last somewhere between 900 hours and 1100 hours.
It could also mean that the bulbs being supplied are made up of two entirely
different types of bulbs: about half are of very high quality and will last about
1300 hours, while the other half are of very poor quality and will last about
700 hours.
There is an obvious need to introduce a quantitative measure which will
distinguish between such situations.
The square of the standard deviation, i.e. $\sigma^2$, is known as the variance.
It is also known as the second moment of dispersion.
Let X be a random variable. We define the variance of X, denoted by V(X) or
$\sigma_x^2$, as follows:
$$V(X) = E[X - E(X)]^2$$
The positive square root of V(X) is called the standard deviation of X and is
denoted by $\sigma_x$.
Properties
(1) If C is a constant, V(X + C) = V(X).
(2) If C is a constant, $V(CX) = C^2 V(X)$.
(3) If (X, Y) is a two-dimensional random variable, and if X and Y are
independent, then V(X + Y) = V(X) + V(Y).
Example 2.5
If the S.D. of a series is 15.6, what is the value of the variance?
Solution
Given that $\sigma = 15.6$, we have
Variance $= \sigma^2 = (15.6)^2 = 243.36$
Example 2.6
The weather bureau classifies the type of sky that is visible in terms of
degrees of cloudiness. A scale of 11 categories is used: 0, 1, 2, ..., 10,
where 0 represents a perfectly clear sky, 10 represents a completely
overcast sky, while the other values represent various intermediate
conditions. Suppose that such a classification is made at a particular
weather station on a particular day and time. Let X be the random variable
assuming one of the above 11 values. Suppose that the probability
distribution of X is
$p_0 = p_{10} = 0.05$
$p_1 = p_2 = p_8 = p_9 = 0.15$
$p_3 = p_4 = p_5 = p_6 = p_7 = 0.06$
Hence
E(X) = 1(0.15) + 2(0.15) + 3(0.06) + 4(0.06) + 5(0.06) + 6(0.06) + 7(0.06) +
8(0.15) + 9(0.15) + 10(0.05) = 5.0
In order to compute V(X) we need to evaluate $E(X^2)$:
$E(X^2)$ = 1(0.15) + 4(0.15) + 9(0.06) + 16(0.06) + 25(0.06) + 36(0.06) +
49(0.06) + 64(0.15) + 81(0.15) + 100(0.05)
= 35.6
Hence
$V(X) = E(X^2) - (E(X))^2 = 35.6 - 25 = 10.6$
and the standard deviation is $\sqrt{10.6} \approx 3.26$.
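The same computation can be sketched in C++ using the relation $V(X) = E(X^2) - (E(X))^2$. The helper names below are illustrative, and the data assume $p_0 = p_{10} = 0.05$, the value that appears in the E(X) computation above and that makes the eleven probabilities sum to 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// k-th raw moment E(X^k) of a discrete random variable.
double raw_moment(const std::vector<double>& values,
                  const std::vector<double>& probs, int k) {
    assert(values.size() == probs.size());
    double m = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        m += std::pow(values[i], k) * probs[i];
    }
    return m;
}

// V(X) = E(X^2) - (E(X))^2.
double variance(const std::vector<double>& values,
                const std::vector<double>& probs) {
    const double mean = raw_moment(values, probs, 1);
    return raw_moment(values, probs, 2) - mean * mean;
}

// Cloudiness categories 0..10 of Example 2.6.
std::vector<double> cloudiness_values() {
    return {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
}

// Their probabilities, assuming p_0 = p_10 = 0.05.
std::vector<double> cloudiness_probs() {
    return {0.05, 0.15, 0.15, 0.06, 0.06, 0.06, 0.06, 0.06, 0.15, 0.15, 0.05};
}
```

Calling `variance(cloudiness_values(), cloudiness_probs())` gives 10.6 up to rounding, and its square root is the standard deviation.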
Example 2.7
Suppose that the random variable X is uniformly distributed over [a, b]. Then
we have E(X) = (a + b)/2.
To compute V(X) we evaluate $E(X^2)$.
Solution
We have
E(X) = (a + b)/2
$$E(X^2) = \int_a^b x^2\, \frac{1}{b - a}\,dx = \frac{b^3 - a^3}{3(b - a)}$$
Hence
$$V(X) = E(X^2) - (E(X))^2 = \frac{b^3 - a^3}{3(b - a)} - \left(\frac{a + b}{2}\right)^2$$
After a simple computation
$$V(X) = \frac{(b - a)^2}{12}$$
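The "simple computation" can be written out as follows. This is a sketch of the algebra only, using the factorization $b^3 - a^3 = (b - a)(a^2 + ab + b^2)$ and assuming $a < b$:

```latex
\begin{aligned}
V(X) &= \frac{b^3 - a^3}{3(b - a)} - \left(\frac{a + b}{2}\right)^2 \\
     &= \frac{a^2 + ab + b^2}{3} - \frac{a^2 + 2ab + b^2}{4} \\
     &= \frac{4a^2 + 4ab + 4b^2 - 3a^2 - 6ab - 3b^2}{12} \\
     &= \frac{a^2 - 2ab + b^2}{12} = \frac{(b - a)^2}{12}
\end{aligned}
```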
2.5 Two-dimensional random variable
For a one-dimensional random variable, the outcome of the experiment could be
recorded as a single number x. In many situations, we observe two or more
numerical characteristics simultaneously. For example, the hardness H and
the tensile strength T of a manufactured piece of steel may be of interest,
and we would consider (h, t) as a single experimental outcome. We might
study the height H and the weight W of some chosen person, giving rise to
the outcome (h, w). Finally we might observe the total rainfall R and the
average temperature T at a certain locality during a specified month, giving
rise to the outcome (r, t).
Definition
Consider an experiment and a sample space S associated with it. Let X = X(s)
and Y = Y(s) be two functions, each assigning a real number to each
outcome $s \in S$.
We call (X, Y) a two-dimensional random variable.
(a) Let (X, Y) be a two-dimensional discrete random variable. With each
possible outcome $(x_i, y_j)$ we associate a number $p(x_i, y_j)$ representing
$P(X = x_i, Y = y_j)$ and satisfying the following conditions:
(1) $p(x_i, y_j) \ge 0$ for all $(x_i, y_j)$
(2) $\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} p(x_i, y_j) = 1$
The function p defined for all $(x_i, y_j)$ in the range space of (X, Y) is
called the probability function of (X, Y). The set of triples
$(x_i, y_j, p(x_i, y_j))$, i, j = 1, 2, ..., is sometimes called the probability
distribution of (X, Y).
(b) Let (X, Y) be a continuous random variable assuming all values in
some region R of the Euclidean plane. The joint probability density
function f is a function satisfying the following conditions:
(1) $f(x, y) \ge 0$ for all $(x, y) \in R$,
(2) $\iint_R f(x, y)\,dx\,dy = 1$
2.6 Marginal and conditional probability distribution
Marginal probability distribution
If more than one random variable is defined in a random experiment, it is
important to distinguish between the joint probability distribution of X and Y
and the probability of each variable individually. The individual probability
distribution of a random variable is referred to as its marginal probability
distribution.
If X and Y are discrete random variables with joint probability mass function
$f_{XY}(x, y)$, then the marginal probability mass functions of X and Y are
$$f_X(x) = P(X = x) = \sum_{R_x} f_{XY}(x, y) \quad \text{and} \quad f_Y(y) = P(Y = y) = \sum_{R_y} f_{XY}(x, y)$$
where $R_x$ denotes the set of all points in the range of (X, Y) for which X = x
and $R_y$ denotes the set of all points in the range of (X, Y) for which Y = y.
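When the joint probability mass function is stored as a rectangular table, the marginals are simply row and column sums. The sketch below assumes the layout `joint[i][j]` = $f_{XY}(x_i, y_j)$; the alias `Table` and the function names are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<double>>;  // joint[i][j] = f_XY(x_i, y_j)

// Marginal pmf of X: sum each row over all y_j.
std::vector<double> marginal_x(const Table& joint) {
    std::vector<double> fx(joint.size(), 0.0);
    for (std::size_t i = 0; i < joint.size(); ++i) {
        for (double p : joint[i]) fx[i] += p;
    }
    return fx;
}

// Marginal pmf of Y: sum each column over all x_i.
std::vector<double> marginal_y(const Table& joint) {
    assert(!joint.empty());
    std::vector<double> fy(joint[0].size(), 0.0);
    for (const auto& row : joint) {
        assert(row.size() == fy.size());  // table must be rectangular
        for (std::size_t j = 0; j < row.size(); ++j) fy[j] += row[j];
    }
    return fy;
}
```

For a hypothetical 2 x 2 table with rows (0.125, 0.25) and (0.375, 0.25), the marginal of X is (0.375, 0.625) and the marginal of Y is (0.5, 0.5).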
Conditional probability distribution
When two random variables are defined in a random experiment, knowledge
of one can change the probabilities that we associate with the values of the
other.
Given discrete random variables X and Y with joint probability mass function
$f_{XY}(x, y)$, the conditional probability mass function of Y given X = x is
$$f_{Y|x}(y) = \frac{f_{XY}(x, y)}{f_X(x)} \quad \text{for } f_X(x) > 0$$
The function $f_{Y|x}(y)$ is used to find the probabilities of the possible values
for Y given that X = x. That is, it is the probability mass function for the
possible values of Y given that X = x.
Because a conditional probability mass function $f_{Y|x}(y)$ is a probability
mass function for all y in $R_x$, the following properties are satisfied:
(1) $f_{Y|x}(y) \ge 0$
(2) $\sum_{R_x} f_{Y|x}(y) = 1$
(3) $P(Y = y \mid X = x) = f_{Y|x}(y)$
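With the joint probability mass function stored as a table, the conditional distribution of Y given $X = x_i$ is row i divided by its own sum. This is a minimal sketch under the assumed layout `joint[i][j]` = $f_{XY}(x_i, y_j)$; the function name is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<double>>;  // joint[i][j] = f_XY(x_i, y_j)

// Conditional pmf of Y given X = x_i: divide row i of the joint table by the
// marginal f_X(x_i), which must be positive.
std::vector<double> conditional_y_given_x(const Table& joint, std::size_t i) {
    assert(i < joint.size());
    double fx = 0.0;
    for (double p : joint[i]) fx += p;
    assert(fx > 0.0);
    std::vector<double> cond;
    cond.reserve(joint[i].size());
    for (double p : joint[i]) cond.push_back(p / fx);
    return cond;
}
```

By construction the returned values are nonnegative and sum to 1, which is what properties (1) and (2) require.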
2.7 Correlation coefficient
Correlation is one of the most widely used statistical techniques. Whenever
two variables are so related that a change in one variable results in a direct
or inverse change in the other, and a greater magnitude of change in one
variable corresponds to a greater magnitude of change in the other, the
variables are said to be correlated, and the relationship between the
variables is known as correlation.
We have been concerned with associating parameters such as E(X) and
V(X) with the distribution of a one-dimensional random variable. If we have a
two-dimensional random variable (X, Y), an analogous problem is
encountered.
Definition
Let (X, Y) be a two-dimensional random variable. We define $\rho_{xy}$, the
correlation coefficient between X and Y, as follows:
$$\rho_{xy} = \frac{E\{[X - E(X)][Y - E(Y)]\}}{\sqrt{V(X)\,V(Y)}}$$
The numerator of $\rho_{xy}$ is called the covariance of X and Y.
[Note: the correlation coefficient is a dimensionless quantity.]
Example 2.8
Suppose that the two-dimensional random variable (X, Y) is uniformly
distributed over the triangular region
$R = \{(x, y) \mid 0 < x < y < 1\}$
The pdf is given as
$$f(x, y) = \begin{cases} 2, & (x, y) \in R \\ 0, & \text{elsewhere} \end{cases}$$
Thus the marginal pdfs of X and of Y are
$g(x) = \int_x^1 2\,dy = 2(1 - x)$, $0 \le x \le 1$
$h(y) = \int_0^y 2\,dx = 2y$, $0 \le y \le 1$
Therefore
$E(X) = \int_0^1 x \cdot 2(1 - x)\,dx = \frac{1}{3}$, $E(Y) = \int_0^1 y \cdot 2y\,dy = \frac{2}{3}$
$E(X^2) = \int_0^1 x^2 \cdot 2(1 - x)\,dx = \frac{1}{6}$, $E(Y^2) = \int_0^1 y^2 \cdot 2y\,dy = \frac{1}{2}$
$V(X) = E(X^2) - (E(X))^2 = \frac{1}{18}$
$V(Y) = E(Y^2) - (E(Y))^2 = \frac{1}{18}$
$E(XY) = \int_0^1 \int_0^y xy \cdot 2\,dx\,dy = \frac{1}{4}$
Hence
$$\rho_{xy} = \frac{E\{[X - E(X)][Y - E(Y)]\}}{\sqrt{V(X)\,V(Y)}} = \frac{1}{2}$$
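The last step can be checked by hand. The sketch below assumes the identity $E\{[X - E(X)][Y - E(Y)]\} = E(XY) - E(X)E(Y)$ and substitutes the moments computed in this example:

```latex
\begin{aligned}
E\{[X - E(X)][Y - E(Y)]\} &= E(XY) - E(X)\,E(Y)
  = \frac{1}{4} - \frac{1}{3} \cdot \frac{2}{3} = \frac{1}{36} \\
\rho_{xy} &= \frac{1/36}{\sqrt{\frac{1}{18} \cdot \frac{1}{18}}}
  = \frac{1/36}{1/18} = \frac{1}{2}
\end{aligned}
```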
Degree of correlation
We can find the degree of correlation with the help of the coefficient of
correlation. The following degrees of correlation are distinguished:
(a) Perfect correlation
When two variables change in the same direction and in the same ratio, then
there is perfect positive correlation.
In this case the coefficient of correlation is +1.
On the other hand, if two variables change in the same ratio but in the
opposite direction, then there is perfect negative correlation, and in this case
the coefficient of correlation is -1.
(b) Absence of correlation
If the change in one variable has no effect on the other variable, then the
correlation is completely absent. In this case the coefficient of correlation is 0.
(c) Limited correlation
If there is neither complete presence nor complete absence of correlation
between two variables, then we say that there is limited correlation. It can
be positive as well as negative.
2.8 Covariance
When two or more random variables are defined on a probability space, it is
useful to describe how they vary together, that is, it is useful to measure the
relationship between the variables. A common measure of the relationship
between two random variables is the covariance.
To define the covariance, we need to describe the expected value of a
function h(X, Y) of two random variables. The definition simply extends that
used for a function of a single random variable.
$$E[h(X, Y)] = \begin{cases} \sum_{R} h(x, y)\, f_{XY}(x, y), & X, Y \text{ discrete} \\ \iint_{R} h(x, y)\, f_{XY}(x, y)\,dx\,dy, & X, Y \text{ continuous} \end{cases}$$
The covariance between the random variables X and Y, denoted as
cov(X, Y) or $\sigma_{xy}$, is
$$\sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = E(XY) - \mu_x \mu_y$$
Covariance is a measure of the linear relationship between the random
variables.
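For a discrete pair (X, Y), the expected value of h(X, Y), the covariance, and the correlation coefficient of Section 2.7 can all be computed from the joint table. The following is a sketch under stated assumptions: the struct `JointPmf`, its field names, and the function names are introduced here for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Joint pmf of a discrete pair (X, Y): p[i][j] = P(X = x[i], Y = y[j]).
struct JointPmf {
    std::vector<double> x;
    std::vector<double> y;
    std::vector<std::vector<double>> p;
};

// E[h(X, Y)] = sum over all (x_i, y_j) of h(x_i, y_j) * p(x_i, y_j).
template <typename H>
double expect(const JointPmf& d, H h) {
    assert(d.p.size() == d.x.size());
    double e = 0.0;
    for (std::size_t i = 0; i < d.x.size(); ++i) {
        assert(d.p[i].size() == d.y.size());
        for (std::size_t j = 0; j < d.y.size(); ++j) {
            e += h(d.x[i], d.y[j]) * d.p[i][j];
        }
    }
    return e;
}

// cov(X, Y) = E(XY) - E(X) E(Y).
double covariance(const JointPmf& d) {
    const double ex = expect(d, [](double a, double) { return a; });
    const double ey = expect(d, [](double, double b) { return b; });
    return expect(d, [](double a, double b) { return a * b; }) - ex * ey;
}

// rho = cov(X, Y) / sqrt(V(X) V(Y)); both variances must be positive.
double correlation(const JointPmf& d) {
    const double ex = expect(d, [](double a, double) { return a; });
    const double ey = expect(d, [](double, double b) { return b; });
    const double vx = expect(d, [](double a, double) { return a * a; }) - ex * ex;
    const double vy = expect(d, [](double, double b) { return b * b; }) - ey * ey;
    assert(vx > 0.0 && vy > 0.0);
    return covariance(d) / std::sqrt(vx * vy);
}
```

As a quick check, a pair that always satisfies Y = X (values 0 and 1, each with probability 0.5) has covariance 0.25 and correlation +1, while a table with all four cells equal to 0.25 has correlation 0.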
2.9 Summary
In this unit one-dimensional random variables, discrete and continuous
random variables, mathematical expectation and variance, two-dimensional
random variables, marginal and conditional probability distributions, the
correlation coefficient and covariance were discussed.
2.10 Self-Assessment exercise
1. Two unbiased dice are thrown. Find the expected value of the sum of the
numbers of points thrown.
(Ans: 7)
2. A and B throw with one die for a prize of Rs. 11, which is to be won by the
player who first throws a 6. If A has the first throw, what are their
respective expectations?
(Ans: A's = Rs. 6 and B's = Rs. 5)
3. Thirteen cards are drawn simultaneously from a pack of cards. If aces
count 1, face cards 10 and others according to denomination, find the
expectation of the total score on the 13 cards.
(Ans: 85)
4. What is the expectation of the number of failures preceding the first
success in an infinite series of independent trials with constant
probability p of success in a trial?
(Ans: $(1 - p)/p$)
5. Let a random variable x take the values $x_k = (-1)^k \cdot 2^k / k$;
k = 1, 2, 3, ..., with probabilities $p_k = 2^{-k}$. Find E(x).
(Ans: $-\log 2$)
6. Let the continuous random variable X denote the diameter of a hole
drilled in a sheet metal component. The target diameter is 12.5
millimeters. Most random disturbances to the process result in larger
diameters. Historical data show that the distribution of X can be modeled
by a probability density function $f(x) = 20e^{-20(x - 12.5)}$, $x \ge 12.5$
(Ans: 0.865)
7. Suppose that the two-dimensional random variable (X, Y) has pdf given
by $f(x, y) = ke^{-y}$, 0 < x < y < 1
= 0, elsewhere
Find the correlation coefficient.