Random variables are complicated objects that contain a great deal of information about the experiments they model. If we want to summarize a random variable (RV) by a single number, then this number should undoubtedly be its central tendency. The expected value, also called the expectation or mean, gives the center, in the sense of the average value, of the distribution of the random variable.
1.1 Motivating Example
Example 1 (Average Life of a Drill Bit) An oil company needs drill bits in an exploration project. Suppose that it is known that (after rounding to the nearest hour) drill bits of the type used in this particular project will last 2, 3, or 4 hours with probabilities 0.1, 0.7, and 0.2, respectively. If a drill bit is replaced by one of the same type each time it has worn out, how long could exploration be continued if the company reserves 10 drill bits in total for the exploration job?
Let X denote the life of a drill bit, which is a random variable that takes the values 2, 3, or 4 hours. The question is: if we want to generate an informed estimate of how long the exploration job will last, which value of X should we use? A logical way to answer this question is to use some measure of central tendency, such as the average value of X. Since the probability mass associated with each value of X is different, i.e.

P(X = 2) = 0.1, \quad P(X = 3) = 0.7, \quad P(X = 4) = 0.2 \qquad (1)

we can use these probabilities to compute a weighted average life of a drill bit as follows:

\bar{X} = 0.1 \times 2 + 0.7 \times 3 + 0.2 \times 4 = 3.1 \qquad (2)

and then conclude that the exploration could continue for 10 \times 3.1 = 31 hours.
This weighted average is what we call the expected value or expectation of the random variable X, whose distribution is given by equation (1). It might happen that the company is unlucky and each of the 10 drill bits wears out after two hours, in which case exploration ends after 20 hours. At the other extreme, they may be lucky and drill for 40 hours on these bits. However, it is a mathematical fact that the conclusion about a 31-hour total drilling time is correct in the following sense: for a large number N of drill bits, the total running time will be around N times 3.1 hours with high probability.
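The "with high probability" claim above is easy to probe numerically. Below is a minimal sketch in plain Python (the values and probabilities are the ones from equations (1) and (2)) that computes the weighted average and simulates a large batch of drill bits:

```python
import random

# PMF of drill-bit life in hours, from equation (1)
values = [2, 3, 4]
probs = [0.1, 0.7, 0.2]

# Weighted average life, as in equation (2)
mean_life = sum(v * p for v, p in zip(values, probs))
print(f"{mean_life:.1f}")  # 3.1

# Simulate N drill bits; the average running time is close to 3.1 hours
random.seed(0)
N = 100_000
total = sum(random.choices(values, weights=probs)[0] for _ in range(N))
print(total / N)  # sample mean, close to 3.1
```

With N = 100,000 simulated bits, the sample mean lands within a few thousandths of an hour of 3.1, illustrating why the 31-hour conclusion is trustworthy for a large reserve of bits.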
Definition 2 The expectation of a discrete random variable X taking the values a_1, a_2, \dots and with probability mass function

f_X(a_i) = P(X = a_i) = p_i \quad \text{for } i = 1, 2, \dots

f_X(a) = 0 \quad \text{otherwise}

is the number

E[X] = \sum_i a_i P(X = a_i) = \sum_i a_i f_X(a_i) = \sum_i p_i a_i \qquad (3)
We also call E [X] the expected value or mean of X. Since the expectation is determined
by the probability distribution of X only, we also speak of the expectation or mean of the
distribution.
Looking at an expectation as a weighted average gives a more physical interpretation of this notion, namely as the center of gravity of weights p_i = f_X(a_i) placed at the points a_i (see Figure 1). This point of view also leads the way to how one should define the expected value of a continuous random variable.
Let X be a continuous random variable whose probability density function f_X(x) is zero outside the interval [a, b]. Now, let us divide the interval into n small sub-intervals of equal size as follows:

x_i = a + i\,\Delta x \quad \text{for } i = 0, 1, \dots, n \qquad (4)

where

\Delta x = \frac{b - a}{n} \qquad (5)

Thus, we have x_0 = a and x_n = b. It seems reasonable to approximate X by a discrete random variable, Y, taking the values

Y = x_1, \; Y = x_2, \; \dots, \; Y = x_n \qquad (6)

in the interval [a, b], with as probabilities the masses that X assigns to the intervals [x_{i-1}, x_i], i.e.

p_i = P(Y = x_i) = P(x_{i-1} < X \le x_i) = \int_{x_{i-1}}^{x_i} f_X(x)\,dx \approx \Delta x\, f_X(x_i) \qquad (7)

where

\Delta x\, f_X(x_i) = \frac{b - a}{n}\, f_X\!\left(a + i\,\frac{b - a}{n}\right) \qquad (8)

In other words, we have approximated (X, f_X(x)) by (Y, f_Y(y)). The center-of-gravity interpretation suggests that the expectation E[Y] of Y should approximate the expectation E[X] of X, i.e.

E[X] \approx E[Y] = \sum_{i=1}^{n} p_i x_i = \sum_{i=1}^{n} \left(\Delta x\, f_X(x_i)\right) x_i \qquad (9)
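The approximation in (9) is just a Riemann sum, and it can be checked numerically. The sketch below assumes a density f_X(x) = 2x on [0, 1] (my choice purely for illustration), for which the exact mean is 2/3:

```python
# Discretize [a, b] as in equations (4)-(5) and form E[Y] as in equation (9).
def f(x):
    return 2.0 * x  # assumed density on [0, 1]; exact mean is 2/3

a, b, n = 0.0, 1.0, 10_000
dx = (b - a) / n

# E[Y] = sum_i (dx * f(x_i)) * x_i with x_i = a + i*dx
approx = sum(dx * f(a + i * dx) * (a + i * dx) for i in range(1, n + 1))
print(approx)  # close to 2/3
```

Increasing n tightens the approximation, which is exactly the limiting argument that motivates the integral definition of the expectation of a continuous random variable.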
This suggests defining the expectation of a continuous random variable X as the number

E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx \qquad (10)

Note that E[X] is indeed the center of gravity of the mass distribution described by the function f_X, i.e.

E[X] = \frac{\int_{-\infty}^{\infty} x\, f_X(x)\, dx}{\int_{-\infty}^{\infty} f_X(x)\, dx} = \int_{-\infty}^{\infty} x\, f_X(x)\, dx \qquad (11)

since \int_{-\infty}^{\infty} f_X(x)\, dx = 1.
For a discrete random variable X with P(X = a_i) = p_i for i = 1, 2, \dots, we can unify the definition of the mean of a random variable if we express the probability mass function using Dirac delta functions as follows:

f_X(x) = \sum_{i=1}^{n} p_i\, \delta(x - a_i)

It follows that

E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx = \int_{-\infty}^{\infty} x \sum_{i=1}^{n} p_i\, \delta(x - a_i)\, dx = \sum_{i=1}^{n} p_i a_i
Now consider a scenario in which we are interested in computing the expected value of some function Z = g(X) of a random variable X. Such situations are very often encountered in engineering applications.
Example 5 (Modified Drill Bit Example) Consider the drill bit example with the following additional assumptions. Let us assume that there are three different qualities of drill bit available in the market: low quality (with life X = 2 hrs), average quality (with life X = 3 hrs), and high quality (with life X = 4 hrs), and that drill bits of different qualities are stocked in different rooms. Moreover, on a given day, let us further assume that (a) all the drill bits issued to an operator are of identical quality and (b) the room is chosen randomly. In other words, on a given day, an operator can receive either 12 low-quality drill bits, or 8 average-quality drill bits, or 6 high-quality drill bits for carrying out the drilling operation. The associated probability mass function is

P(X = 2) = 0.1, \quad P(X = 3) = 0.7, \quad P(X = 4) = 0.2 \qquad (12)

Under these constraints, assume that the daily operating cost of the exploration can be expressed as a function of the number of drill bit changes needed in 24 hrs as follows:

Z = C\,(24/X) \quad \text{Rs/day}
Example 6 (L-R-C Circuit Design) Consider the design of an L-R-C circuit in which the resistance R depends on the operating temperature T, a random variable taking values in an interval [a, b]. Let us assume that the resistance is a monotonically increasing function of the temperature of the form

R = A + B(T - a)^2

where the constants (A, B) are known. Obviously, the resistance is also a random variable, and it is difficult to design the circuit for a randomly changing value of resistance. Thus, a designer would be interested in finding an average value of R that can be used for designing the circuit.

Thus, if X is a random variable, then a function of X, i.e. Z = g(X), is also a random variable. The question is: if the probability density/mass function f_X(x) is known, how do we find the average value of g(X)?
3.1 Function of a Discrete Random Variable
To begin with, let us assume that X is a discrete RV which can take values a_1, a_2, \dots and has associated probability mass function f_X(x). By virtue of the fact that X is a RV, it follows that Z is a discrete RV that takes values z_1 = g(a_1), z_2 = g(a_2), \dots and so on. Moreover, the fact that the probability that X = a_i equals p_i = f_X(a_i) implies that the probability that Z = z_i (= g(a_i)) is also equal to p_i. This leads us to the expected value of a function of a discrete RV.
Definition 7 Consider a discrete random variable X taking the values a_1, a_2, \dots and with probability mass function f_X(x). The expectation of a function g(X) of X is the number

E[g(X)] = \sum_i g(a_i)\, P(X = a_i) = \sum_i p_i\, g(a_i) \qquad (13)
Example 8 Consider the modified drill bit example. For the transformed variable, Z, we can find the probability mass function as follows:

P(Z = 24C/2 = 12C) = P(X = 2) = 0.1 \qquad (14)

P(Z = 24C/3 = 8C) = P(X = 3) = 0.7 \qquad (15)

P(Z = 24C/4 = 6C) = P(X = 4) = 0.2 \qquad (16)

Thus,

E[Z] = 0.1 \times 12C + 0.7 \times 8C + 0.2 \times 6C = 8C

Note that

E[Z] \ne g(\bar{X}) = 24C/3.1 = 7.742C

In fact, E[g(X)] \ne g(E[X]) in general, and the equality holds only when the transformation is linear.
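The inequality E[g(X)] ≠ g(E[X]) in this example can be verified with a few lines of Python (C is left symbolic in the text; here it is set to 1 purely for the numerics):

```python
# Modified drill-bit example: Z = g(X) = 24*C/X, with C = 1 assumed
values = [2, 3, 4]
probs = [0.1, 0.7, 0.2]
C = 1.0

def g(x):
    return 24.0 * C / x  # daily cost as a function of bit life

E_Z = sum(p * g(v) for v, p in zip(values, probs))  # E[g(X)]
E_X = sum(p * v for v, p in zip(values, probs))     # E[X] = 3.1
print(f"{E_Z:.3f} {g(E_X):.3f}")  # 8.000 7.742
```

The gap between 8C and 7.742C is exactly the point of the example: averaging the cost is not the same as evaluating the cost at the average bit life, because g is nonlinear.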
3.2 Function of a Continuous Random Variable
Now, let X represent a continuous random variable with probability density function f_X(x) and cumulative distribution function

F_X(x) = P(X \in (-\infty, x]) = \int_{-\infty}^{x} f_X(\xi)\, d\xi

Let us further assume that f_X(x) is zero outside the interval [a, b] and f_X(x) \ge 0 when x \in [a, b]. Consider the approximation of X using a discrete RV Y which takes values \{x_1, x_2, \dots, x_n\} as given by equations (4) and (5), together with the associated probability mass function

p_i = P(Y = x_i) = \Delta x\, f_X(x_i) \qquad (17)

where \Delta x\, f_X(x_i) is given by equation (8). Now, consider a transformation Z = g(Y) that takes discrete values z_1 = g(x_1), z_2 = g(x_2), \dots and so on. Since we have P(Y = x_i) = \Delta x\, f_X(x_i) = p_i, it is obvious that

P(Z = g(x_i)) = P(Y = x_i) = \Delta x\, f_X(x_i)

Now, using the definition of the mean of a function of a discrete RV given by equation (13), it follows that

E[Z] = \sum_i g(x_i)\, P(Z = g(x_i)) = \sum_i p_i\, g(x_i) = \lim_{n \to \infty} \sum_{i=1}^{n} \left(\Delta x\, f_X(x_i)\right) g(x_i) = \int_{a}^{b} g(x) f_X(x)\, dx \qquad (18)

This motivates the definition of the expected value of a function of a continuous RV.
Definition 9 Consider a continuous random variable X with probability density function f_X(x). The expectation of a function Z = g(X) of X is the number

E[Z] = E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\, dx \qquad (19)

Alternatively, if the probability density function f_Z(\zeta) of Z can be derived, then

E[Z] = \int_{-\infty}^{\infty} \zeta\, f_Z(\zeta)\, d\zeta \qquad (20)
Example 10 Continuing with the L-R-C circuit design example, let us assume that the temperature T is uniformly distributed over [a, b], i.e. f_T(x) = 1/(b - a) for x \in [a, b] and zero otherwise. The average value of the resistance R can be computed as follows.

Approach 1: Using f_T(x)

E[R] = E[g(T)] = \int_{-\infty}^{\infty} g(x) f_T(x)\, dx = \frac{1}{b - a} \int_{a}^{b} \left(A + B(x - a)^2\right) dx \qquad (21)

= \frac{A(b - a)}{b - a} + \frac{B}{3(b - a)} (b - a)^3 = A + \frac{B}{3}(b - a)^2 \qquad (22)

This result also shows that the average value of the function is not equal to the value of the function evaluated at the mean temperature T = \bar{T} = (a + b)/2, i.e.

g(\bar{T}) = A + \frac{B}{4}(b - a)^2 \ne E[g(T)] = A + \frac{B}{3}(b - a)^2
Approach 2: Using f_R(r)

Since g is monotonic on [a, b], the distribution function of R can be obtained as

F_R(r) = P(R \le r) = P\left(A + B(T - a)^2 \le r\right) = P\left(T \le a + \sqrt{\frac{r - A}{B}}\right) = \int_{a}^{a + \sqrt{(r - A)/B}} f_T(x)\, dx

Since f_T(x) = 1/(b - a), we can write

F_R(r) = \int_{a}^{a + \sqrt{(r - A)/B}} \frac{1}{b - a}\, dx = \frac{1}{b - a} \sqrt{\frac{r - A}{B}}

so that, differentiating with respect to r,

f_R(r) = \frac{1}{2(b - a)\sqrt{B(r - A)}} \quad \text{for } r \in [R_a, R_b]

where R_a = A and R_b = A + B(b - a)^2 are the values of R at T = a and T = b, respectively. Then

E[R] = \int_{-\infty}^{\infty} r\, f_R(r)\, dr = \frac{1}{2\sqrt{B}\,(b - a)} \int_{R_a}^{R_b} \frac{r}{\sqrt{r - A}}\, dr

Substituting t = r - A,

E[R] = \frac{1}{2\sqrt{B}\,(b - a)} \int_{R_a - A}^{R_b - A} \frac{t + A}{\sqrt{t}}\, dt = \frac{1}{2\sqrt{B}\,(b - a)} \left[\frac{2}{3}\, t^{3/2} + 2A\, t^{1/2}\right]_{R_a - A}^{R_b - A}

Since R_a - A = 0 and R_b - A = B(b - a)^2, it follows that

\left[\frac{2}{3}\, t^{3/2} + 2A\, t^{1/2}\right]_{R_a - A}^{R_b - A} = \frac{2}{3}\, B^{3/2} (b - a)^3 + 2A\, B^{1/2} (b - a)

and

E[R] = \frac{1}{2\sqrt{B}\,(b - a)} \left[\frac{2}{3}\, B^{3/2} (b - a)^3 + 2A\, B^{1/2} (b - a)\right] = A + \frac{B}{3}(b - a)^2

which matches the result obtained using Approach 1.
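As a sanity check on the two approaches, a quick Monte Carlo estimate should land near A + (B/3)(b - a)^2. The numbers A = 2, B = 5, [a, b] = [10, 20] below are my own illustrative assumptions, not values from the text:

```python
import random

# Illustrative constants (assumed): R = A + B*(T - a)^2, T ~ Uniform[a, b]
A, B = 2.0, 5.0
a, b = 10.0, 20.0

exact = A + (B / 3.0) * (b - a) ** 2  # result of both Approach 1 and Approach 2

random.seed(1)
N = 200_000
mc = sum(A + B * (random.uniform(a, b) - a) ** 2 for _ in range(N)) / N
print(exact)  # 168.66...
print(mc)     # Monte Carlo estimate, close to the exact value
```

That a direct average of g(T) over uniform draws reproduces the answer from the density of R is precisely the content of equations (19) and (20): both are expressions for the same number E[Z].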
If Z = g(X) is a complex function, or the integral of f_T(x) does not have a closed-form expression, then deriving the probability density function f_Z(\zeta) can prove to be a difficult task. For example, assume that the probability density function of the temperature in the L-R-C circuit design example is a (truncated) Gaussian of the form

f_T(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] \quad \text{for } x \in [a, b], \qquad f_T(x) = 0 \quad \text{otherwise}

where \mu \in [a, b]. Then

F_R(r) = \int_{a}^{a + \sqrt{(r - A)/B}} f_T(x)\, dx = \int_{a}^{a + \sqrt{(r - A)/B}} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] dx

does not have a simple closed-form solution. Thus, if it is desired to compute only E[Z], then Approach 1 is generally preferred over Approach 2.
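In such cases Approach 1 reduces to a one-dimensional numerical integration. Below is a minimal sketch using the midpoint rule; all numbers (a, b, A, B, mu, sigma) are assumed for illustration:

```python
import math

# Assumed illustrative numbers
a, b = 10.0, 20.0        # temperature range
A, B = 2.0, 5.0          # resistance constants
mu, sigma = 15.0, 2.0    # Gaussian parameters, mu in [a, b]

def kernel(x):
    # un-normalized Gaussian kernel of the truncated density
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

n = 100_000
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]  # midpoint rule

norm = sum(kernel(x) for x in xs) * dx       # normalizing constant over [a, b]
E_R = sum((A + B * (x - a) ** 2) * kernel(x) for x in xs) * dx / norm
print(E_R)  # E[R] under the truncated-Gaussian temperature
```

Normalizing the kernel numerically over [a, b] sidesteps the missing closed form for the truncated density, and the same quadrature evaluates E[g(T)] directly, with no need for f_R(r).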
Finally, suppose that we seek a single number, c, that best represents the central tendency of X. This could be done by finding c such that some function of the deviation |x - c| is minimized. Let us define

\Phi_n = E\left[\,|X - c|^n\,\right] \qquad (23)

and seek the c that minimizes \Phi_n for different values of n. In particular, consider minimizing \Phi_2 = E\left[(X - c)^2\right], which expands to

\Phi_2 = E\left[X^2 - 2Xc + c^2\right] = E\left[X^2\right] - 2cE[X] + c^2 \qquad (24)

Setting the derivative with respect to c to zero,

-2E[X] + 2c = 0 \qquad (25)

c = E[X] \qquad (26)

and

\left[\Phi_2\right]_{\min} = E\left[(X - E[X])^2\right] \qquad (27)

Thus, the mean is the best single representative of the theoretical centroid of a random variable if we are concerned with minimizing the mean squared deviation over all possible values of X [2]. Moreover, the quantity \left[\Phi_2\right]_{\min} = E\left[(X - E[X])^2\right] is known as the variance of the RV X.
Remark 11 Minimizing \Phi_1 = E[|X - c|] w.r.t. c yields the median of the distribution, while minimizing \Phi_n as n \to \infty w.r.t. c yields the mode of the distribution [2].
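These minimization properties can be illustrated numerically with the drill-bit PMF: a grid search over c confirms that the squared-deviation criterion is minimized at the mean (3.1) and the absolute-deviation criterion at a median (3):

```python
# PMF from the drill-bit example
values = [2, 3, 4]
probs = [0.1, 0.7, 0.2]

def risk(c, n):
    # E[|X - c|^n] for the discrete PMF above
    return sum(p * abs(v - c) ** n for v, p in zip(values, probs))

grid = [i / 100.0 for i in range(200, 401)]  # candidate c in [2, 4]
c_sq = min(grid, key=lambda c: risk(c, 2))   # minimizer of E[(X - c)^2]
c_abs = min(grid, key=lambda c: risk(c, 1))  # minimizer of E[|X - c|]
print(c_sq, c_abs)  # 3.1 3.0
```

The squared loss is smooth and pulls the minimizer toward the probability-weighted centroid, while the absolute loss is piecewise linear and settles at the point where the probability mass on either side balances.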
References
[1] Dekking, F. M., Kraaikamp, C., Lopuhaä, H. P., Meester, L. E., A Modern Introduction to Probability and Statistics: Understanding Why and How, Springer, 2005.
[2] Ogunnaike, B. A., Random Phenomena: Fundamentals of Probability and Statistics for Engineers, CRC Press, 2010.