This document is authorized for personal use only by Shirsendu Nandi, of Indian Institute of Management Rohtak till 14th June ,2018. It shall not be reproduced or distributed without express written
permission from Indian Institute of Management, Ahmedabad.
IIMA/QM0275TEC
Technical Note
This technical note can be used by an instructor over 15 one-and-a-half-hour sessions
to provide a crisp introduction to essential ideas in probability for management
students. This material has been used to teach an introductory probability course at IIM
Ahmedabad for first year management (PGP) students. While there are many classic
textbooks in probability, it is often hard to adapt them to teach management students
with diverse backgrounds in limited time, while also ensuring adequate mathematical
rigor. The aim of this note is to enable the instructor to introduce the subject through
a set of carefully chosen problems, some purely engineered to understand the concept
and some that are motivated by real applications. Hence, the style of presentation has
been to provide a crisp description of ideas interspersed with carefully chosen
exercises. Most exercises include an answer key and at times a more elaborate hint,
depending on the level of difficulty. Some of the exercises have been created by the author,
some adapted from books and some borrowed from assignments or exams from past
courses at IIM. The books referred to while preparing this note are listed at the end of this
note. These books can also serve as useful references for the instructor using this note.
Contents
1 The Concept of Probability
2 Enumeration Principles
Introduction
We live in a world of uncertainty and we are often forced to make decisions with
limited information. For example: which company share should I invest in? Where in the
ocean should we search for the missing flight? Should a person be convicted
of a crime based on the evidence presented? To all these questions, we would
agree that there is no single answer. Instead, there is a set of possibilities or outcomes.
Further, of these outcomes some may be more likely and some less. Information such
as this is captured by a "Probability Model": an approximate mathematical description of
a phenomenon involving uncertainty. Simply speaking, a probability model specifies all
possible outcomes and the probability of each outcome. In practice, such models are
formulated partly based on our understanding of the context and partly constructed
based on analysis of historical data. Statistics provides a framework to systematically
collect and analyze data to formulate and estimate such models.
Once a good model is known, it can be used to make decisions. For example, a
commonly used model for the log of stock returns is the Normal Distribution. It turns out, as we
will learn later, that for a Normal model it is enough to know two parameters, namely the
Mean (or Expected Value) and the Variance. Therefore, past data can be analyzed to
draw inferences about the mean and variance of log returns. Once we have a handle
on these parameters, the probability model is known. Then, we can compute various
quantities of interest that could help in decision making, for example the probability
that log returns exceed a certain value. Also, knowing this for various stocks, one can
strategize to create a portfolio with a desired level of expected returns and uncertainty.
data analysis.
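As a rough illustration of the calculation described above, the following sketch computes the probability that a Normally distributed log return exceeds a threshold once the mean and variance are known. The function name and the parameter values (mean 1%, standard deviation 4%, threshold 5%) are hypothetical and chosen only for illustration; they are not estimates from any data.

```python
import math

def normal_tail_prob(threshold, mean, variance):
    """P(X > threshold) for X ~ Normal(mean, variance)."""
    z = (threshold - mean) / math.sqrt(variance)
    # Standard Normal upper tail, written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical parameter values, chosen only for illustration.
mu, var = 0.01, 0.04 ** 2
p_exceed = normal_tail_prob(0.05, mu, var)
print(f"P(log return > 5%) = {p_exceed:.4f}")
```

Here the threshold sits one standard deviation above the mean, so the result is the familiar one-sigma upper tail of the Normal distribution.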
Chapter 1
The Concept of Probability
Most of us already have a notion of probability. Consider the following questions:
The answer is of course 50% or 0.5. What do we mean by this, though? It simply
means that if I keep tossing the coin a large number of times, then approximately
50% of the time I would get heads. If I do it long enough, i.e. infinitely many
times, the proportion of heads will be exactly 50%.
2. What is the probability that AAP will lose the next Delhi assembly election?
The answer to this is less straightforward and may depend on each individual's
viewpoint. Suppose I say there is an 80% chance. What does this mean? Clearly,
this does not have the relative frequency interpretation, for we cannot repeat the
elections under identical conditions infinitely many times! Instead, it can be
interpreted as a "degree of belief".
Both the above interpretations are fine for our understanding whenever they make
sense. Before formally defining probability, let us understand some associated termi-
nology.
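The relative frequency interpretation can be explored with a small simulation. This is only a sketch: it assumes a fair coin modelled by Python's pseudo-random generator, and the function name and seed are illustrative choices.

```python
import random

def heads_frequency(num_tosses, seed=0):
    """Proportion of heads in num_tosses simulated fair coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

# The proportion of heads tends to settle near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```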
the interval (0, policy Limit], i.e. the claim value can be any value between zero
and the policy limit. The former is a "discrete" sample space and the latter is a
"continuous" sample space.
Event is a subset of the sample space. If the event consists of a single outcome, it
is a "simple event". In Example 1, A = {HH} is a simple event. If an event consists
of multiple outcomes, it is a "compound event". In Example 1, A = {HH, HT, TH}
is the compound event that at least one toss is a head. In Example 2, B = [10000,
50000] is the compound event that a claim amount is at least 10000 but not more
than 50000.
We will often work with combinations of various events. Towards that end, let us
recall a few notations and facts from set theory:
(iii) $A \cap B$ is the intersection of A and B, which contains all those elements
that belong to both A and B.
(iv) Similarly, we can of course talk about unions and intersections of multiple
(possibly infinitely many) events, denoted by $\bigcup_{i=1}^{N} A_i$ and $\bigcap_{i=1}^{N} A_i$ respectively.
Note that N can potentially be infinity ($\infty$).
(v) $A^c$ is the complement of A, which contains all those elements of S that do not
belong to A.
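These set operations map directly onto Python's built-in sets. The sketch below assumes the sample space of two coin tosses used in Example 1; the second event B is a hypothetical one introduced only to have something to combine with A.

```python
# Sample space for two coin tosses (assumed to match Example 1 in the text).
S = {"HH", "HT", "TH", "TT"}
A = {"HH", "HT", "TH"}        # at least one head
B = {"HT", "TH", "TT"}        # at least one tail (hypothetical second event)

union = A | B                 # elements in A or B
intersection = A & B          # elements in both A and B
complement_A = S - A          # elements of S that are not in A
print(union, intersection, complement_A)
```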
Exercises
b. List the elements that make up the following events: A=sum of the two values
is 5, B=Value of first die is higher than second, C= the first value is
2. Show that $(A \cup B)^c = A^c \cap B^c$ and in general that $\left(\bigcup_{i=1}^{N} A_i\right)^c = \bigcap_{i=1}^{N} A_i^c$
We are now ready to formally define probability.
Exercises
4. Show that $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Can you guess and write down the general formula for $P\left(\bigcup_{i=1}^{n} A_i\right)$?
(Hint: All possible combinations of (balls in cell 1, balls in cell 2, balls in cell 3). Let
B1, B2, B3 denote the balls; then examples of elements of S would be ({B1,B2,B3},
{ }, { }), ({B1,B2}, {B3}, { }), ({B1}, {B2}, {B3}) etc.)
b. Let S1 be the event that cell 1 is singly occupied and S2 be the event that cell 2 is
singly occupied. Find $P(S_1)$, $P(S_2)$, $P(S_1 \cap S_2)$.
(Ans: $P(S_1) = P(S_2) = \frac{3 \cdot 2^2}{3^3}$, $P(S_1 \cap S_2) = \frac{3 \cdot 2}{3^3}$)
8. A readership survey conducted among the adult population showed that 35%
read Times, 15% read Express and 25% read Herald; 10% read both Times and
Express, 8% read both Express and Herald, 5% read both Times and Herald; 4%
read all three publications. If an adult in the city is chosen at random, what is the
probability that:
(i) He does not read any newspaper. (ii) He reads only one of the newspapers. (iii)
He reads exactly two of the newspapers. (iv) He reads all three papers.
(Hint: Define events A = Reads Times, B = Reads Herald, C = Reads Express. Ans: (i)
44%, (ii) 41%, (iii) 11%, (iv) 4%.)
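The answers to Exercise 8 can be checked with the inclusion-exclusion idea from Exercise 4. The sketch below uses exact fractions and the survey percentages quoted in the exercise; the variable names are illustrative.

```python
from fractions import Fraction as F

# Survey percentages from Exercise 8, as exact fractions.
T, E, H = F(35, 100), F(15, 100), F(25, 100)
TE, EH, TH = F(10, 100), F(8, 100), F(5, 100)
TEH = F(4, 100)

# Inclusion-exclusion for the probability of reading at least one paper.
at_least_one = T + E + H - TE - EH - TH + TEH
none = 1 - at_least_one
# Each pairwise figure includes the readers of all three, so subtract them out.
exactly_two = (TE - TEH) + (EH - TEH) + (TH - TEH)
exactly_one = at_least_one - exactly_two - TEH

print(float(none), float(exactly_one), float(exactly_two), float(TEH))
```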
Chapter 2
Enumeration Principles
Much of computing probability has to do with counting the number of outcomes that
satisfy some criteria. Here are some basic counting principles.
Exercises
$\sum_{k=0}^{n} {}^{n}C_{k}$.
(Hint: Consider n cells which can either be filled or left empty. How many ways
can we do this?)
Permutations: $^{n}P_{r}$ = number of ordered ways of choosing r objects out of n = $\frac{n!}{(n-r)!}$.
Exercise: There are 10 brands ($B_1, B_2, \ldots, B_{10}$) in the market which have been
ranked by a market research firm. If you believe that this ranking is random,
what is the probability that the first three brands are $B_1, B_2, B_3$ in that order?
(Ans: $\frac{1}{^{10}P_{3}}$)
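The permutation count in this exercise is small enough to verify by brute force. The sketch below compares the formula n!/(n-r)! with a direct enumeration of ordered top-3 lists; it assumes, as the exercise does, that every ordering is equally likely.

```python
import math
from itertools import permutations

n, r = 10, 3
num_ordered = math.perm(n, r)                      # n! / (n - r)!
assert num_ordered == math.factorial(n) // math.factorial(n - r)

# Brute-force count of ordered top-3 lists drawn from 10 brands.
brute_force = sum(1 for _ in permutations(range(1, n + 1), r))

# Exactly one of these ordered lists is (B1, B2, B3).
prob_b1_b2_b3 = 1 / num_ordered
print(num_ordered, brute_force, prob_b1_b2_b3)
```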
Exercises
4. The numbers 1, 2, 3, ..., n are ordered randomly. What is the probability that (a)
1 and 2 are neighbors in that order, (b) 1, 2 and 3 are neighbors in that order?
(Hint: In (a), for counting the number of favorable cases, consider the string "12" as a
single entity. Take a similar approach for (b). Ans: (a) $\frac{(n-1)!}{n!}$, (b) $\frac{(n-2)!}{n!}$.)
5. You own Brand A of a certain product and there are 9 other competing brands.
What is the probability that your brand is among the top 3 in terms of consumer
preference?
(Ans: $\frac{3 \cdot 9!}{10!}$)
(Answer: Imagine the r balls placed side by side, thus creating (r-1) gaps between
them (* * * * ... *). Imagine the n cells as being created by n + 1 sticks (| |
| | ... |), with the sticks at either extreme being fixed and the remaining (n-1)
sticks placed either in any of the (r-1) gaps or after the extreme * at either end (e.g.
***|**| | ****||). Now solving the problem amounts to counting the number of
arrangements of (n-1+r) objects where (n-1) are of one type (|) and the remaining r are of
another type (*). Hence the answer is $^{(n-1+r)}C_{r} = \frac{(n-1+r)!}{(n-1)!\,r!}$.)
9. How many ways can we put r indistinguishable objects in n cells so that each cell
contains at least 1 object?
(Hint: In the previous problem take (r-n) instead of r and $(x_1 - 1), (x_2 - 1), \ldots$ etc.
Ans: $^{(r-1)}C_{(n-1)}$)
10. There are n objects such that $n_1$ are of type 1, $n_2$ of type 2, ..., $n_r$ of type r. Objects of
each type are not distinguishable among themselves. How many distinguishable
arrangements of these n objects are possible?
(Ans: $\frac{n!}{n_1!\,n_2! \cdots n_r!}$).
11. In a city 7 accidents happened in a week. Choose the correct answer along with
appropriate justification. The probability that all 7 happened on different days of
the week if
(a) the accidents are distinguishable and each distinguishable distribution of
accidents across 7 days has equal probability.
(b) the accidents are indistinguishable and each distinguishable distribution of
accidents across 7 days has equal probability.
(Hint: First consider a simpler example with 2 accidents on 2 days and work out
explicitly without using any formulas. Then think about the above problem.)
(Note: The assumptions under part (a) and part (b) relate to two important
theories in Physics for the distribution of atomic particles into a set of energy states.
The assumption under part (a) relates to "Maxwell-Boltzmann" statistics, which is the
statistical behavior exhibited by some distinguishable particles, and (b) relates to
"Bose-Einstein" statistics, which is the statistical behavior exhibited by
indistinguishable particles.)
12. There are n flag poles and r flags. $r_1$ flags are red, $r_2$ are blue and $r_1 + r_2 = r$.
What is the number of distinguishable ways in which you can hoist these flags?
(Assume that a flag pole can hoist at most 1 flag and that you need to hoist all flags.
Also, $n \ge r$.)
(Ans: $^{n}C_{r} \cdot \frac{r!}{r_1!\,r_2!}$)
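The occupancy counts used in the exercises above can be checked numerically. The sketch below first compares the formula (n-1+r)C(r) with a brute-force count on a small, arbitrarily chosen case, and then evaluates the probability in Exercise 11 under each of the two assumptions. The function name is illustrative, and the second part relies on reading (a) as "all 7^7 assignments equally likely" and (b) as "all occupancy patterns equally likely".

```python
import math
from itertools import product

def count_occupancies(r, n):
    """Brute-force count of ways to place r indistinguishable objects in n cells."""
    return sum(1 for counts in product(range(r + 1), repeat=n) if sum(counts) == r)

# Formula (n-1+r)C(r), checked on a small case chosen for illustration.
r, n = 4, 3
assert count_occupancies(r, n) == math.comb(n - 1 + r, r)

# Exercise 11: 7 accidents over 7 days, all on different days.
days = accidents = 7
# (a) distinguishable accidents: 7^7 equally likely assignments, 7! favourable.
p_distinguishable = math.factorial(accidents) / days ** accidents
# (b) indistinguishable accidents: (n-1+r)C(r) equally likely patterns, 1 favourable.
p_indistinguishable = 1 / math.comb(days - 1 + accidents, accidents)
print(p_distinguishable, p_indistinguishable)
```

The two assumptions give noticeably different probabilities for the same event, which is the point of the exercise.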
Here is a question to demonstrate how counting exercises like the above can
be useful in statistical analysis. Such analysis is referred to as the theory of runs. This
can be useful when building models to see whether the deviations of actual data
from the model are indeed random. Typically, if they are random then we have a good
model, because whatever is not explained by the model is just random noise.
13. There are 14 seats of which 6 are vacant/empty (E) and 8 are occupied (O) in the
following pattern: EEEOOEOOOEEOOO. The question of interest is whether the
seating is random. The approach taken is to ask what is the probability of getting
6 runs. (Each continuous occurrence of Es or of Os is termed a "run".)
Compare this with the pattern EEEEEEOOOOOOOO, i.e. what is the probability
of getting 2 runs?
(Ans: $P(6\ \text{runs}) = \frac{2 \cdot {}^{(6-1)}C_{(3-1)} \cdot {}^{(8-1)}C_{(3-1)}}{^{(8+6)}C_{6}} \approx 14\%$, $P(2\ \text{runs}) \approx 0.07\%$. The second
pattern is highly unlikely if the seating is indeed random. Therefore, if such a
pattern is actually observed then we have evidence to believe that the seating
must not have been done at random.)
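The run probabilities can be confirmed by enumerating every seating. The sketch below assumes that all arrangements of 6 empty and 8 occupied seats are equally likely, tallies the number of runs in each, and compares the count for 6 runs with the closed form (3 runs of each type, starting with either E or O). Function names are illustrative.

```python
import math
from itertools import combinations

def count_runs(pattern):
    """Number of maximal blocks of identical symbols in the pattern."""
    return 1 + sum(a != b for a, b in zip(pattern, pattern[1:]))

def runs_distribution(num_e, num_o):
    """Brute-force tally of run counts over all equally likely seatings."""
    total = num_e + num_o
    tally = {}
    for empty_seats in combinations(range(total), num_e):
        chosen = set(empty_seats)
        pattern = ["E" if i in chosen else "O" for i in range(total)]
        k = count_runs(pattern)
        tally[k] = tally.get(k, 0) + 1
    return tally

tally = runs_distribution(6, 8)
total = math.comb(14, 6)                       # number of distinct seatings
p6 = tally[6] / total
p2 = tally[2] / total
# Closed form for 6 runs: 3 runs of each type, starting with either E or O.
assert tally[6] == 2 * math.comb(5, 2) * math.comb(7, 2)
print(round(p6, 4), round(p2, 5))
```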
Chapter 3
Conditional Probability, Bayes Theorem, Independence of Events
a subset of the original sample space B = {3, 6}. Among the two there is no reason
why one outcome should be more likely than the other. Therefore, the new probability
for event {3} is 1/2. What we have worked out here is the conditional probability of
{3} given B. In general, for events A and B with P(B) > 0,
$P(A|B) = \frac{P(A \cap B)}{P(B)}$.
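The die example above can be checked by direct counting. This sketch assumes a fair six-faced die with equally likely outcomes and uses the standard ratio P(A ∩ B)/P(B) for conditional probability; the helper names are illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}            # fair six-faced die, equally likely outcomes
B = {3, 6}                         # conditioning event from the text
A = {3}

def prob(event):
    """Probability of an event under equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    return prob(a & b) / prob(b)

print(cond_prob(A, B))
```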
Exercises
For a fixed event B with P(B) > 0, let Q(A) = P(A|B).
b. What is $Q(B^c)$?
2. Consider families with 2 children. If at least one child of a family is a boy, what is the
probability that the other child is a girl?
(Ans: 2/3)
3. An insurance company sells a number of different policies; among these 60% are
for autos, 40% are for home owners, 20% are for both. Suppose a person is picked
at random from the population of policy holders. Let $A_1$ be the event that he has only
an auto policy, $A_2$ be the event that he has only a home owner's policy, $A_3$ be the
event that he has both auto and home owner's policies and $A_4$ be the event that he has
neither an auto nor a home owner's policy.
a. Find $P(A_1)$, $P(A_2)$, $P(A_3)$, $P(A_4)$.
b. Let B be the event that the person renews at least one of the auto or home owner's policies.
From past experience it is known that $P(B|A_1) = 0.6$, $P(B|A_2) = 0.7$ and $P(B|A_3) = 0.8$.
Given that a person selected at random has an auto or a home owner's policy,
what is the probability that she will renew at least one of them?
A1: Test Positive      4,885     73,630      78,515
A2: Test Negative        115    921,370     921,485
Total                  5,000    995,000   1,000,000
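3.2 Bayes Theorem
Let A and B be two events. Bayes theorem gives a way to relate P(A|B) and P(B|A).
$$
\begin{aligned}
P(B|A) &= \frac{P(A \cap B)}{P(A)} \\
&= \frac{P(A|B)\,P(B)}{P(A)} \\
&= \frac{P(A|B)\,P(B)}{P(A \cap B) + P(A \cap B^c)} \\
&= \frac{P(A|B)\,P(B)}{P(A|B)\,P(B) + P(A|B^c)\,P(B^c)}
\end{aligned}
$$
More generally, if $B_1, \ldots, B_N$ partition the sample space, then
$$P(B_j|A) = \frac{P(A|B_j)\,P(B_j)}{\sum_{i=1}^{N} P(A|B_i)\,P(B_i)}$$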
Problems
(Ans: .3623)
Solution outline:
Let A, B and C denote the events that the bolt is produced by machine A, B and
C respectively. Let event D = the bolt is defective.
P(A) = .25, P(B) = .35, P(C) = .4, P(D|A) = .05, P(D|B) = .04, P(D|C) = .02
$P(A|D) = \frac{P(D|A)\,P(A)}{P(D)}$
$P(D) = P(D|A)\,P(A) + P(D|B)\,P(B) + P(D|C)\,P(C)$
2. According to a research study, the incidence rate of HIV in India is .4% for a certain
section of the population. A clinical test in India is 95% accurate in detecting
HIV, i.e. if there is HIV, it will correctly detect it 95% of the time. If there is no HIV, it
will again be correct in 95% of cases. A person from this section of the population
undergoes a test and the test says he has HIV. What is the probability that he
really has the disease?
(a) 95% (b) 50% (c) Cannot be determined (d) .07.
(Hint: Let A = has disease, B = test is positive. What are P(A|B), P(B|A)?)
3. A second independent test that has similar accuracy also comes out positive.
4. An industrial raw material is graded and classified into the three categories 1-
Super quality, 2-Medium quality and 3-Low quality. The grading is done visually
and there is a high probability of misclassification. The misclassification however
is limited to neighboring categories. Category 1 may be misclassified as 2. Cate-
gory 2 may be misclassified as either 1 or 3. Category 3 may be misclassified as 2.
a)
P(C1 | B1 ) =
P(C2 | B2 ) =
P(C3 | B3 ) =
b)
P(C1 ) = P( B1 ) + P( B2 ) + P( B3 )
P(C2 ) = P( B1 ) + P( B2 ) + P( B3 )
P(C3 ) = P( B1 ) + P( B2 ) + P( B3 )
P( B1 ) =
P( B2 ) =
P( B3 ) =
d). Suppose p = 1/3, P(C1 ) = 6/10, P(C2 ) = 1/3, P(C3 ) = 1/15. Find P( B1 ),P( B2 ), P( B3 ).
5. In a factory, there are three machines 1, 2, 3, producing 50%, 30%, 20% respectively
of the total output. Out of the items produced by machine 2, four percent
are defective. The corresponding figure for machine 3 is 6%. The following is
known: "If an item is drawn at random from the production line and found to
be defective then the conditional probability for this item to be produced by
machine 1 is 0.50". What is the proportion of defective items among those produced
by machine 1?
(Hint: Use a tree diagram with three machines representing three branches emanating
from the first node, then from the end of each machine branch two branches
B and C are 20% and 30% respectively. All items passing the quality control test
are directly acceptable. On the other hand, items failing in the quality control
test are further processed and thus 40%, 50% and 60% of them turn out to be
marginally acceptable, depending on whether they came from machines A, B and
C respectively, e.g., out of the items, that are produced by machine A and that fail
in the quality control test, 40% eventually turn out to be marginally acceptable,
and so on.
(a) Find the probability that a randomly chosen item from the production process
is found to be directly acceptable.
(b) Find the probability that a randomly chosen item from the production process
turns out to be marginally acceptable.
(c) Given that a randomly chosen item from the production process has failed in
the quality control test, what is the conditional probability that it turns out to be
marginally acceptable?
(d) Given that a randomly chosen item from the production process has turned
out to be marginally acceptable, what is the conditional probability that it was
produced by machine A?
(e) Given that a randomly chosen item was not produced by machine B, what is
the conditional probability that it turns out to be marginally acceptable?
(Ans: (a) 0.83, (b) 0.086, (c) 0.506, (d) 0.232, (e) 0.08)
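Bayes theorem over a partition reduces to a few lines of arithmetic. The sketch below applies it to the numbers given in Problems 1 and 2 of this section (the bolt machines and the HIV test); the function name and argument layout are illustrative, and the inputs are assumed to describe a partition of the sample space.

```python
def posterior(priors, likelihoods, j):
    """P(B_j | A) from priors P(B_i) and likelihoods P(A | B_i) over a partition."""
    total = sum(p * l for p, l in zip(priors, likelihoods))   # P(A), by total probability
    return priors[j] * likelihoods[j] / total

# Problem 1 (bolts): machines A, B, C and their defect rates, from the solution outline.
p_machine_a_given_defective = posterior([0.25, 0.35, 0.40], [0.05, 0.04, 0.02], 0)

# Problem 2 (HIV test): the partition is {disease, no disease}; a positive test has
# likelihood 0.95 under disease and 1 - 0.95 under no disease.
p_disease_given_positive = posterior([0.004, 0.996], [0.95, 0.05], 0)

print(round(p_machine_a_given_defective, 4), round(p_disease_given_positive, 4))
```

The second result shows why a positive result from an accurate test can still leave the probability of disease low when the incidence rate is small.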
If P( B) > 0, this is equivalent to saying P( A| B) = P( A).
i.e. knowing that event B has occurred does not alter the likelihood of event A
and vice-versa.
Exercises:
1. A 6-faced fair die is tossed and the outcome is noted. Are the events A = outcome
is divisible by 2 and B = outcome is divisible by 3 independent?
2. Show that if A and B are independent then so are (a) A and $B^c$, (b) $A^c$ and B, (c) $A^c$
and $B^c$.
Independence of Three Events: Events A, B and C are said to be (mutually) independent
if all of the following hold:
a. A, B and C are pairwise independent, i.e. $P(A \cap B) = P(A)P(B)$, $P(B \cap C) = P(B)P(C)$ and $P(A \cap C) = P(A)P(C)$.
b. $P(A \cap B \cap C) = P(A)P(B)P(C)$.
Note: It is important to note that both (a) and (b) in the definition
need to be verified for independence of three events, i.e. (a) alone does not ensure that
(b) holds, and (b) alone does not ensure that (a) holds. Examples for both these
situations are covered in the problems.
4. Exercise: Two 6-faced dice are thrown. Let event A = Die 1 shows 6, event B =
Die 2 shows 6, and C = Both dice show the same face. Show that A, B, C are pairwise
independent but not mutually independent.
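A numerical check is not a proof, but enumerating the 36 outcomes makes the distinction between pairwise and mutual independence concrete. The sketch assumes two fair dice with equally likely outcomes; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))     # 36 equally likely (die1, die2) pairs

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

A = lambda w: w[0] == 6            # die 1 shows 6
B = lambda w: w[1] == 6            # die 2 shows 6
C = lambda w: w[0] == w[1]         # both dice show the same face

def both(e1, e2):
    return lambda w: e1(w) and e2(w)

# Product rule for every pair of events.
pairwise = all(
    prob(both(x, y)) == prob(x) * prob(y)
    for x, y in [(A, B), (A, C), (B, C)]
)
# Product rule for all three events together.
triple = prob(lambda w: A(w) and B(w) and C(w)) == prob(A) * prob(B) * prob(C)
print(pairwise, triple)
```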
Below is an example to show that $P(A \cap B \cap C) = P(A)P(B)P(C)$ does not
imply pairwise independence.
Exercise: Consider families with 3 children. Let B denote boy and G denote girl.
Let the sample space be S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}. Assume
that all 8 possibilities for the three children in order have equal probabilities.
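Events $A_1, A_2, \ldots, A_N$ are said to be (mutually) independent
if $P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) P(A_{i_2}) \cdots P(A_{i_k})$ for all $1 < k \le N$ and all
k-tuples $(i_1, i_2, \ldots, i_k)$ of distinct indices from $\{1, 2, \ldots, N\}$.
$P(A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_N) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 \cap A_2) \cdots P(A_N|A_1 \cap A_2 \cap \cdots \cap A_{N-1})$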
6. Exercise: An urn has 3 black balls and 5 white balls. Each time I draw a ball. If
it is black, I add 1 black ball and 2 white balls. If it is white, I add 2 black balls
and 1 white ball. What is the probability of getting
Black-White-Black-White in four successive draws?
(Hint: Let $B_i$ denote the event that the ith draw results in a black ball and $W_j$ the event that the
jth draw is white. Then $P(B_1 \cap W_2 \cap B_3 \cap W_4) = P(B_1)\,P(W_2|B_1)\,P(B_3|B_1 \cap W_2)\,P(W_4|B_1 \cap W_2 \cap B_3)$.
Ans: 0.06016.)
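The chain-rule calculation in the hint can be carried out step by step. The sketch below assumes that the drawn ball is returned to the urn before the extra balls are added; the exercise does not say so explicitly, but this reading reproduces the stated answer. The function name is illustrative.

```python
from fractions import Fraction

def sequence_probability(black, white, sequence):
    """Chain-rule probability of a colour sequence, updating the urn after each draw."""
    p = Fraction(1)
    for colour in sequence:
        total = black + white
        if colour == "B":
            p *= Fraction(black, total)
            black, white = black + 1, white + 2    # drawn ball returned; add 1 black, 2 white
        else:
            p *= Fraction(white, total)
            black, white = black + 2, white + 1    # drawn ball returned; add 2 black, 1 white
    return p

p_bwbw = sequence_probability(3, 5, "BWBW")
print(p_bwbw, float(p_bwbw))
```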
Chapter 4
Random Variable and Probability Distribution
There are 4 possible values for the number of heads depending on the outcomes in
the sample space, viz. {0, 1, 2, 3}. The corresponding probabilities for these values
to occur are determined as follows:
P(Number of heads = 0) = P({TTT}) = 1/8
A Random Variable (X) is a function that maps the sample space to the real line (i.e. $X : S \to \mathbb{R}$).
In the example above, "X = Number of Heads" is a random variable.
A Discrete Random Variable can only take countably many values. Note that countably
many does not necessarily mean finite. For example, the set {1/7, 1/8, 1/9, 3} has countably many elements
because it has finitely many elements. However, the set $\{0, 1, 2, 3, \ldots\}$ is also countable
although it has infinitely many elements. The set of rational numbers (i.e. ratios of integers) is
also countable. To be precise, a set of possible values is said to be "countable" if one can define
a one-one map from the set to the set of natural numbers.
variable.
Example 3: X = Waiting time for a machine to break down after repair.
Example 4: X = CAT score of a randomly chosen first year IIMA student.
Example 5:
X = 1, if treatment for disease is effective
X = 0, if treatment is not effective
As in the examples above, a number of phenomena involving uncertainty can be
formulated in terms of random variables.
To understand any random variable, we need to ask two basic questions:
(1) What values can it take? (2) With what probabilities?
To answer the latter question, one needs to be clear about the assumptions
being made.
Possible values: X = 0 or 1 or 2 or 3,
with probabilities 1/8, 3/8, 3/8 and 1/8 respectively.
This is based on the assumptions that (i) the coin is fair and (ii) outcomes from different
tosses are independent.
Probability Mass Function ($f(\cdot)$) for a discrete random variable X taking values in
$\mathcal{X} = \{a_1, a_2, \ldots, a_k, \ldots\}$ is given by
$f(a) = P(X = a)$, where $a \in \mathcal{X}$.
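For the number of heads in three tosses, the p.m.f. can be built by listing the sample space and counting. The sketch assumes a fair coin and independent tosses, so that all 8 outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: 8 equally likely outcomes of three independent fair tosses.
sample_space = ["".join(t) for t in product("HT", repeat=3)]

# X maps each outcome to its number of heads; count how often each value occurs.
counts = Counter(outcome.count("H") for outcome in sample_space)
pmf = {k: Fraction(c, len(sample_space)) for k, c in sorted(counts.items())}
print(pmf)
```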
Sometimes, the above may just be referred to as the distribution of the random variable.
We will see later that for continuous random variables, the p.m.f is not meaningful.
However, an analogous concept is that of the Probability Density Function (p.d.f): A
function $f(\cdot)$ is said to be the p.d.f of a continuous random variable X if
$P(a < X \le b) = \int_a^b f(x)\,dx$. Note that the cumulative distribution function is meaningful for both
discrete and continuous random variables.
Important properties of c.d.f $F(\cdot)$
(i) $F(-\infty) = 0$, $F(\infty) = 1$
1. Exercise
Classify the following variables into discrete and continuous random variables.
b. Heart rate (number of beats per minute) of a student who has just seen the first
quiz question.
2. Exercise: Let a random experiment be the cast of a pair of unbiased six-sided dice
and let X be equal to the smaller of the outcomes if they are different and the common
value if they are equal. Find the p.m.f. of X, explicitly stating the assumptions
you are making.
Solution outline
A coin is independently tossed repeatedly. Let X = the number of tosses until and
including the toss in which a head appears for the first time. Let P(H) = p.
a. What is the distribution of X?
Geometric Distribution is a "discrete waiting time" distribution. It is the distribution
of waiting time until some event happens, where time is measured in integer units.
An analogue of this in continuous time is the "Exponential Distribution" which we will
study later. The property under part b of the previous question is known as the memoryless
property: knowing that the event has not happened until r time units does not change
the probability that you have to wait for k more time units until the event happens.
In other words, the past is irrelevant to determine future waiting time. What we have
seen above is that the Geometric distribution satisfies this property. What is even more
interesting is that the Geometric is the only discrete distribution on $\{0, 1, 2, 3, \ldots\}$
with this property.
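The memoryless property can be checked numerically. The sketch below assumes the property takes the usual form P(X > r + k | X > r) = P(X > k), with X the number of tosses up to and including the first head, so that P(X > m) = (1 - p)^m. The value of p is hypothetical, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction

def survival(m, p):
    """P(X > m) for X = number of tosses up to and including the first head."""
    return (1 - p) ** m          # the first m tosses are all tails

p = Fraction(3, 10)              # hypothetical value of P(H)
for r in range(0, 5):
    for k in range(0, 5):
        conditional = survival(r + k, p) / survival(r, p)   # P(X > r + k | X > r)
        assert conditional == survival(k, p)
print("memoryless check passed for p =", p)
```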
A random variable by definition takes random values (as governed by p.m.f). For
example, I cannot say what exactly will be the stock price of a company tomorrow.
However, with some effort it is easier for me to comment on the distribution of stock
price. (i.e. what values and with what probability). Since I cannot precisely say what
exact value the random variable will take, it helps to define some summary measures
of its distribution. Given below are some summary measures.
(ii) For any two random variables X and Y, E[ X + Y ] = E[ X ] + E[Y ].
(iii) Let X take values $\{a_1, a_2, \ldots, a_k, \ldots\}$ and let $g(\cdot)$ be a function on the real
line. Then Y = g(X) is another random variable, and
$$E[Y] = \sum_{y : P(Y = y) > 0} y\,P(Y = y) = \sum_{i \ge 1} g(a_i)\,P(X = a_i).$$
$$\begin{aligned}
V(X) &= E\big[(X - E[X])^2\big] \\
&= E[X^2] - E\big[2X \cdot E[X]\big] + (E[X])^2 \\
&= E[X^2] - 2E[X] \cdot E[X] + (E[X])^2 \\
&= E[X^2] - (E[X])^2
\end{aligned}$$
Standard Deviation of X (SD(X)): $SD(X) = \sqrt{V(X)}$.
This is also a measure of spread but has the same units as X, unlike variance,
which is in squared units.
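As a small illustration of these summary measures, the sketch below computes E[X], V(X) and SD(X) directly from a p.m.f. stored as a dictionary. The fair-die p.m.f. and the helper names are assumptions made for the example only.

```python
import math

def mean(pmf):
    """E[X] = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = E[(X - E[X])^2]."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

def second_moment(pmf):
    """E[X^2], used for the shortcut V(X) = E[X^2] - (E[X])^2."""
    return sum(x * x * p for x, p in pmf.items())

# Hypothetical example: one roll of a fair six-faced die.
die = {face: 1 / 6 for face in range(1, 7)}
ex, vx = mean(die), variance(die)
sd = math.sqrt(vx)
shortcut = second_moment(die) - ex ** 2
print(ex, vx, sd, shortcut)
```

The last value uses the shortcut formula and should match the variance computed from the definition.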
1. Exercise
variable:
$$X = \begin{cases} 1, & \text{with probability } p \\ 0, & \text{with probability } (1 - p) \end{cases}$$
2. Exercise
Solution idea:
Observe that $\sum_{k=1}^{\infty} (1-p)^{k-1} p = 1$.
So, $\sum_{k=1}^{\infty} (1-p)^{k-1} = \frac{1}{p}$. Now differentiate both sides with respect to p; the L.H.S. will give $-\frac{E[X]}{p}$.
(Answer: $E[X] = \frac{1}{p}$, $V(X) = \frac{1-p}{p^2}$)
3. Exercise
E will occur within a year with probability p, what should it charge the customer
in order that its expected profit will be 10% of A?
institution may want to know the "value at risk" (VAR), i.e. a value such that the
probability of the losses from the portfolio exceeding this number is very small,
e.g. 0.05. In probabilistic terms, "loss" is the random variable and VAR in the
example is the 95th percentile or quantile of the distribution of loss. To be more
precise,
Q is said to be the 100p-th percentile (or quantile) of the distribution of X if $P(X \le Q) = p$.
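For a discrete random variable there may be no Q with P(X <= Q) exactly equal to p. A minimal sketch, assuming the common convention of taking the smallest value whose cumulative probability reaches p, is given below; the loss distribution is hypothetical.

```python
def percentile(pmf, p):
    """Smallest q with P(X <= q) >= p (an assumed convention for discrete X)."""
    cumulative = 0.0
    for x in sorted(pmf):
        cumulative += pmf[x]
        if cumulative >= p - 1e-12:   # small tolerance for floating point sums
            return x
    return max(pmf)

# Hypothetical loss distribution (loss amount -> probability).
loss_pmf = {0: 0.70, 10: 0.20, 50: 0.06, 200: 0.04}
var_95 = percentile(loss_pmf, 0.95)
print(var_95)
```

With these numbers the cumulative probabilities are 0.70, 0.90, 0.96 and 1.00, so the 95th percentile under this convention is the loss of 50.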
4. Exercise
a. P( X > 0)
b. P( X is even)
c. P(1 X 8)
d. P( X = 3| X 0)
e. P( X 3| X > 0)
f. E[ X ]
g. V [ X ]
( Ans: a) .55, b) .3, c) .55 d) .2222 e) .4545 f) 1 g) 6.5 h) -1 (actually any number
5. Exercise
Suppose that a school has 20 classes: 16 with 25 students in each, three with 100
students in each and one with 300 students for a total of 1000 students.
b. Suppose a student is picked at random from the 1000 students. Let X= size of
c. What is E[X] ?
d. Is it surprising that a. and c. are not equal? Can you define a random variable
Y such that E[Y] will give the answer in a.?
CHAPTER 5
Some Popular Discrete Distributions
p. Assume that the quality of each item is independent of the others. Let the random
variable X = number of defective items produced in the day.
Solution Outline:
First basic question: list the possible values of X. They are {0, 1, 2, ..., n}.
What is P(X=0) ?
What is P(X=n) ?
Note that the number of outcomes with X = k is the same as the number of ways of
arranging n objects, k of which are of one kind (i.e. H) and n - k of another
kind (i.e. T), i.e. $\frac{n!}{k!(n-k)!} = {}^{n}C_{k}$.
Also, note that each outcome with k heads has probability $p^k (1-p)^{n-k}$.
$$E[X] = \sum_{k=0}^{n} k \cdot \frac{n!}{k!(n-k)!}\, p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \frac{(n-1)!}{(k-1)!(n-k)!}\, p^{k-1} (1-p)^{n-k} = \text{simplify} \ldots = np$$
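The result E[X] = np can be confirmed numerically by summing k P(X = k) over all k. The sketch below is only a check under illustrative values of n and p; it is not a substitute for the algebra.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) = nCk * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 12, 0.3   # illustrative values
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
expectation = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(total, expectation, n * p)
```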
b. What is the probability that the target was hit at least twice given that it had
been hit at least once?
b. If the airline must return Rs. 4000 plus a penalty of Rs. 5000 for all who show
up but cannot be accommodated, what is the expected penalty the airline will pay?
(Hint: Create a new r.v. Y = penalty paid by the airline. Express Y in terms of X. Since
we know the distribution of X, find the distribution of Y, i.e. identify the possible values
of Y and their probabilities. Ans: (a) .5640 (b) Rs. 4275.)
1. Exercise: An urn contains N balls out of which R are red and the rest white
in color. A sample of n balls is chosen randomly without replacement. Let the
random variable X = Number of red balls in the sample. Show that the p.m.f of
X is given by
$$P(X = r) = \frac{{}^{R}C_{r} \cdot {}^{(N-R)}C_{n-r}}{{}^{N}C_{n}}, \quad r \in \{0, 1, 2, \ldots, n\}$$
X in the above exercise is said to follow a "Hyper-geometric" distribution.
Note that the formula above makes sense when the following conditions hold:
$\min(R, n) \ge r$ and $(N - R) \ge n - r$.
For any r, if either of these conditions does not hold, the probability is taken to be
0.
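A minimal sketch of this p.m.f., including the convention that the probability is 0 when the conditions above fail, is given below. The function name and the numbers N = 20, R = 7, n = 5 are illustrative assumptions.

```python
from math import comb

def hypergeometric_pmf(r, N, R, n):
    """P(X = r) red balls when n are drawn without replacement from N (R red)."""
    # Outside the feasible range the probability is taken to be 0.
    if r < 0 or r > min(R, n) or n - r > N - R:
        return 0.0
    return comb(R, r) * comb(N - R, n - r) / comb(N, n)

N, R, n = 20, 7, 5   # illustrative values
probs = [hypergeometric_pmf(r, N, R, n) for r in range(n + 1)]
mean = sum(r * q for r, q in enumerate(probs))
print(sum(probs), mean)   # total probability and E[X]
```

The printed mean can be compared with nR/N, the expectation stated for this distribution.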
2. Exercise: A random sample of size 3 is drawn without replacement from a lot of
size 10, which contains 4 defective items. What is the probability that at least 1 of
the 3 items drawn is defective?
The expectation and variance for the above hypergeometric distribution are as
follows. (The derivation is a bit complex and the reader is referred to example 8j
(Ans. 0.46. Hint: Define event A=lot has 4 defectives, random variable X= Num-
Here is an exercise that will help understand one key assumption behind Hy-
5. Exercise: (If you like a bit of algebra, derive this! Else focus on understanding
the result.) Suppose X follows a Hyper-geometric distribution, i.e.,
$$P(X = r) = \frac{{}^{R}C_{r} \cdot {}^{(N-R)}C_{K-r}}{{}^{N}C_{K}}, \quad r \in \{0, 1, 2, \ldots, \min(K, R)\}$$
Show that as $N \to \infty$, if $\frac{R}{N} \to p$, then $P(X = r) \to {}^{K}C_{r}\, p^r (1-p)^{K-r}$.
Meaning of the above result: When the population is large, sampling with or
without replacement will not make much of a difference. Hence, Hypergeometric
will be close to Binomial.
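The closeness of the two distributions for large N can be seen numerically. The sketch below keeps R/N fixed at an assumed p = 0.3 with sample size K = 5 and reports the largest pointwise gap between the two p.m.f.s as N grows; all names and values are illustrative.

```python
from math import comb

def hypergeometric_pmf(r, N, R, K):
    """P(X = r) when K items are drawn without replacement from N (R marked)."""
    if r < 0 or r > min(R, K) or K - r > N - R:
        return 0.0
    return comb(R, r) * comb(N - R, K - r) / comb(N, K)

def binomial_pmf(r, K, p):
    return comb(K, r) * p ** r * (1 - p) ** (K - r)

def max_gap(N, p=0.3, K=5):
    """Largest |Hypergeometric - Binomial| over r = 0..K with R = pN."""
    R = round(N * p)
    return max(abs(hypergeometric_pmf(r, N, R, K) - binomial_pmf(r, K, p))
               for r in range(K + 1))

gaps = [max_gap(N) for N in (20, 200, 2000, 20000)]
print(gaps)   # the gap shrinks as the population size N grows
```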
Exercise
3. You are interested in estimating the proportion of missing books in a library. You
take the catalogue and design two sampling plans
a) Draw a random sample of size 100 of names from the catalogue. Then look
them up in the library to see how many of them are missing.
b) Keep drawing a random item sequentially from the catalogue and check whether
the item is missing or not. Once you encounter two missing books, you stop the
sampling.
(i) What is the expected number of books you will check in the two schemes?
(ii) What is the probability that you encounter no missing item in the first sampling scheme?
(iii) What is the probability that you will sample more than 500 items in the second sampling scheme?
(Ans: (i) (100, 200) (ii) 0.366 (iii) .0397. Caution: If using Excel, check the syntax of
the command carefully to make sure you supply the correct inputs.)
$$E[X] = \frac{r}{p}, \qquad V(X) = \frac{r(1-p)}{p^2}$$
$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad \text{for } k \in \{0, 1, 2, 3, \ldots\}.$$
Typically, such a distribution is found useful in modelling the number of events happening
in a fixed time interval, e.g. the number of radioactive particles that decayed
An Important Note about the Popular Distributions. So far, all the distributions that
we have studied can be derived just from some assumptions made about the experiment
or underlying process. In fact, this is what makes these distributions important
and useful. For example, our knowledge of the insurance context can help us decide
whether claims from different policies are independent of each other,
whether there can be at most 1 claim from each policy, and whether the portfolio of policies
we are looking at are of similar risk. If these business context conditions are met then
the Binomial naturally follows as the distribution for the number of claims. This is true even
in the case of the Poisson distribution. We will just state the underlying assumptions for
the Poisson. The derivation of why they lead to the Poisson distribution is beyond the
scope of this note.
Suppose X is a random variable that is the number of events happening in a time interval
of length 1. Then, it will follow Poisson($\lambda$) if the following conditions are met:
(ii) In a very small interval of length $\Delta t$, P(occurrence within the interval $(t, t + \Delta t)$) $\approx \lambda \Delta t$.
(iii) Occurrence or non-occurrence of events in any two disjoint time intervals are independent
of each other.
Show that $E[X] = V(X) = \lambda$.
Note that the expectation and variance of a Poisson random variable are equal.
Such properties help while doing statistical modeling. For example, when analyzing
a dataset on event occurrence, you may observe that the mean and variance
in the data are close. That may be an indication that the underlying process may
be Poisson.
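The equality of mean and variance can be checked numerically from the p.m.f. The sketch below truncates the infinite sums at k = 99, which is an assumption that is harmless for the illustrative rate used here.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.5                      # illustrative rate
ks = range(100)                # truncation of the infinite support
mean = sum(k * poisson_pmf(k, lam) for k in ks)
second_moment = sum(k * k * poisson_pmf(k, lam) for k in ks)
variance = second_moment - mean ** 2
print(mean, variance)          # both should be close to lam
```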
2. Exercise: An executive makes on average five telephone calls per hour at a cost
of Rs. 2 per call. Determine the probability that in any hour the cost of calls (a)
exceeds Rs. 6 and (b) remains less than Rs. 10.
average number of monthly breakdowns is 1. Find the probability that this computer
will work for 3 months (a) without any breakdown (b) with exactly one
breakdown.
(Ans: a) .0497 b) .1494)
5. Color blindness appears in 1% of a population. How large a random sample
(with replacement) should one draw from the population if the probability of it
containing at least 1 color blind person is to be 95% or more? Use both the Binomial and
the Poisson to derive the required sample sizes.
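| Distribution | Parameters | Values of k | P(X = k) | E[X] | V(X) |
|---|---|---|---|---|---|
| Geometric | $p$ | $k \in \{1, 2, \ldots\}$ | $(1-p)^{k-1} p$ | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ |
| Bernoulli | $p$ | $k \in \{0, 1\}$ | $p^k (1-p)^{1-k}$ | $p$ | $p(1-p)$ |
| Binomial | $(n, p)$ | $k \in \{0, 1, \ldots, n\}$ | ${}^{n}C_{k}\, p^k (1-p)^{n-k}$ | $np$ | $np(1-p)$ |
| Hyper-geometric | $(N, R, n)$ | $k \in \{0, 1, \ldots, n\}$, $\min(R, n) \ge k$, $(N - R) \ge n - k$ | $\frac{{}^{R}C_{k} \cdot {}^{(N-R)}C_{n-k}}{{}^{N}C_{n}}$ | $\frac{nR}{N}$ | $\frac{nR}{N}\left[\frac{(n-1)(R-1)}{N-1} + 1 - \frac{nR}{N}\right]$ |
| Negative Binomial | $(r, p)$ | $k \in \{r, r+1, \ldots\}$ | $\frac{(k-1)!}{(r-1)!(k-r)!}\, p^r (1-p)^{k-r}$ | $\frac{r}{p}$ | $\frac{r(1-p)}{p^2}$ |
| Poisson | $\lambda$ | $k \in \{0, 1, \ldots\}$ | $\frac{e^{-\lambda} \lambda^k}{k!}$ | $\lambda$ | $\lambda$ |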
Let us recap the various discrete distributions we learnt through the following
exercise.
6. Exercise: Determine which discrete distribution best matches the random variable
described below and write down the p.m.f. clearly where possible.
b. An MBA grad keeps appearing for recruitment interviews until he gets selected
into two jobs. Then he selects the best of two. Probability of his getting
selected to any job is 0.6 and this event is independent of other jobs. X = number
of interviews he takes.
c. Of five applicants for a job, two will be selected. Although all are equally
qualified, only three of the applicants have the ability to fulfill the expectations
of the company. Suppose two selections are made at random and X = number of
selected applicants who can fulfill the company's expectations.
voters out of one million total voters, to determine how many people support a
particular party. X= number of people in the sample who support.
CHAPTER 6
Joint Distributions of More than One Random Variable
Question: Is P( X1 = n1 , X2 = n2 ) = P( X1 = n1 ) P( X2 = n2 )?
In principle, we can talk about joint distributions of many random variables, some
or all of which may be continuous. For continuous random variables, it does not make
sense to ask for the probability of a particular value. In that case,
i. Then,
$$P(X_1 = n_1, X_2 = n_2, \ldots, X_k = n_k) = \frac{n!}{n_1!\, n_2! \cdots n_k!}\; p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$$
This is a generalization of the Binomial distribution, which can be thought of as
resulting from distributing n distinguishable balls into 2 cells (H and T). The first
term in the product on the R.H.S. is the number of ways to allocate the occurrence of
faces 1, 2, ..., k into n places so that $n_1$ of type 1, $n_2$ of type 2, etc. occur. Each such
allocation has probability $p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$.
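A minimal sketch of this p.m.f. is given below. The function name and the three-category probabilities are illustrative assumptions; with two categories the formula reduces to the Binomial p.m.f.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n! / (n1! n2! ... nk!) * p1^n1 * p2^n2 * ... * pk^nk."""
    n = sum(counts)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)   # exact integer division at every step
    return coefficient * prod(p ** c for p, c in zip(probs, counts))

probs = (0.5, 0.3, 0.2)   # hypothetical category probabilities
n = 4
# Sum over all (n1, n2, n3) with n1 + n2 + n3 = n; this should be 1.
total = sum(multinomial_pmf((a, b, n - a - b), probs)
            for a in range(n + 1) for b in range(n + 1 - a))
print(total, multinomial_pmf((2, 1, 1), probs))
```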
1. Exercise: Suppose a 6 faced die has 2 faces numbered 1, 3 faces numbered 2 and
1 face numbered 3. Suppose the die is thrown independently 10 times and in each
throw each face has an equal chance of showing up.
$$P(X = a) = \sum_{j \ge 1} P(X = a, Y = b_j)$$
Note that this is not new. The event $\{X = a\}$ is the union of the disjoint events
$\{X = a, Y = b_j\}$, $j \ge 1$, and the equality just follows by applying the axioms
of probability. When we are studying jointly distributed random variables, the
distributions of the individual random variables are referred to as "Marginal distributions".
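The marginal distributions can be obtained mechanically by summing a joint table along rows or columns. The sketch below uses a small hypothetical joint p.m.f. (not one of the tables in this note) stored as a dictionary keyed by (x, y).

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint p.m.f. P(X = x, Y = y), for illustration only.
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def marginals(joint):
    """Return (P(X = x), P(Y = y)) by summing the joint p.m.f. over the other variable."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return dict(px), dict(py)

px, py = marginals(joint)
print(px, py)
```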
Table 6.1: Joint Distribution (each entry in the cells, i.e. apart from the row and column
headings, is the probability that X takes the value in the column heading
and Y takes the value in the row heading)

| Y values \ X values | 1 | 0 | 2 | 6 |
|---|---|---|---|---|
| -2 | 1/9 | 1/27 | 1/27 | 1/9 |
| 1 | 2/9 | 0 | 1/9 | 1/9 |
| 3 | 0 | 0 | 1/9 | 4/27 |
2. Exercise: X and Y are random variables with the joint distribution as in table 6.1.
$$P(X_1 = a_1, X_2 = a_2, \ldots, X_k = a_k) = P(X_1 = a_1)\, P(X_2 = a_2) \cdots P(X_k = a_k)$$
$$P(X_1 \le a_1, X_2 \le a_2, \ldots, X_k \le a_k) = P(X_1 \le a_1)\, P(X_2 \le a_2) \cdots P(X_k \le a_k)$$
d. Find P(Y is even)
f. Find P( X > 0, Y 0)
h. Find V [ X ] and V [Y ].
$$Cov(X, Y) = E[XY] - E[X]\,E[Y]$$
follows
$$Corr(X, Y) = \frac{Cov(X, Y)}{\sqrt{V(X)\, V(Y)}}$$
You will learn more about correlations when you study "Regression Modeling".
i. Find Cov(X, Y)
l. Find E[Y | X = j] for j = 2, 1, 3.
More generally the following important equalities hold. These will be stated
without proof.
$$E[X] = E\big[E[X \mid Y]\big]$$
$$V[X] = V\big(E[X \mid Y]\big) + E\big(V[X \mid Y]\big)$$
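Although these equalities are stated without proof, they can be verified on any small example. The sketch below uses a hypothetical joint p.m.f. and exact fractions, so both sides can be compared for exact equality; all names are illustrative.

```python
from fractions import Fraction as Fr

# Hypothetical joint p.m.f. P(X = x, Y = y); the numbers are illustrative only.
joint = {(0, 0): Fr(1, 6), (1, 0): Fr(1, 3), (1, 1): Fr(1, 4), (4, 1): Fr(1, 4)}

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

def marginal(joint, index):
    """Marginal p.m.f. of X (index 0) or Y (index 1)."""
    out = {}
    for key, p in joint.items():
        out[key[index]] = out.get(key[index], 0) + p
    return out

def conditional_x_given_y(joint, y):
    """P(X = x | Y = y) for a fixed y."""
    p_y = marginal(joint, 1)[y]
    return {x: p / p_y for (x, b), p in joint.items() if b == y}

p_y = marginal(joint, 1)
cond_mean = {y: mean(conditional_x_given_y(joint, y)) for y in p_y}
cond_var = {y: var(conditional_x_given_y(joint, y)) for y in p_y}

e_x = mean(marginal(joint, 0))
e_of_cond_mean = sum(cond_mean[y] * p_y[y] for y in p_y)           # E[E[X|Y]]
v_x = var(marginal(joint, 0))
v_of_cond_mean = sum((cond_mean[y] - e_of_cond_mean) ** 2 * p_y[y] for y in p_y)
e_of_cond_var = sum(cond_var[y] * p_y[y] for y in p_y)             # E[V[X|Y]]
print(e_x, e_of_cond_mean, v_x, v_of_cond_mean + e_of_cond_var)
```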
3. Exercise: Let $X \sim Poi(2)$ and $Y \sim Poi(3)$. Let X and Y be independent.
5. Exercise X and Y are random variables with the joint distribution as in table 6.2.
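Table 6.2: Joint distribution (the column headings did not survive in the source)

| -1 | 1/6 | 1/6 | 0 |
| 0 | 0 | 1/6 | 1/6 |
| 1 | 1/6 | 1/6 | 0 |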
c. What is cov(X,Y)?
d. Based on c, can we conclude that X and Y are independent?
CHAPTER 7
Sums of Random Variables
In many practical situations it is common to encounter sums of random variables. Here
are a few examples:
- A claim from an insurance policy is a random variable. Total claims from an insurance portfolio is the sum of many random variables.
- Daily sales of a company is a random variable. Yearly sales is the sum of daily sales over many days.
- The total return in rupees from one stock in an investment portfolio is a random variable. The total return from the portfolio is the sum of returns from individual stocks.
(i) $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$
$V(X_1 + X_2 + \cdots + X_n) = V(X_1) + V(X_2) + \cdots + V(X_n)$
This result is extremely useful in practice. It says that whatever the variability
of the individual random variables, the variability of the average of many such
random variables will be smaller and will decrease as n increases.
e.g. The average return from a portfolio of stocks is more predictable than the return
from each individual stock.
e.g. The number of insurance claims from a policy is less predictable or less certainly
known than the number of claims from an insurance portfolio.
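The shrinking variability of an average can be seen in a small simulation. The sketch below assumes independent rolls of a fair die (variance 35/12 per roll) and compares the simulated variance of the average of n rolls with (35/12)/n; the seed, number of replications and function name are arbitrary choices.

```python
import random
import statistics

def variance_of_average(n, reps=5000, seed=0):
    """Simulated variance of the average of n independent fair-die rolls."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.randint(1, 6) for _ in range(n))
             for _ in range(reps)]
    return statistics.pvariance(means)

single_roll_variance = 35 / 12
results = {n: variance_of_average(n) for n in (1, 10, 100)}
for n, v in results.items():
    print(n, round(v, 4), round(single_roll_variance / n, 4))
```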
1. Exercise: In a constituency, there are 2 political candidates A and B standing for
elections. For an exit poll, a researcher wants to interview n people and estimate
the percentage of people in support of A. How many people should she interview
so that the standard deviation of the estimated proportion is not more than
.01?
(i) If $X_1, X_2, \ldots, X_k$ are independent $Bin(n_1, p), Bin(n_2, p), \ldots, Bin(n_k, p)$ respectively,
then $X_1 + X_2 + \cdots + X_k \sim Bin(n_1 + n_2 + \cdots + n_k, p)$.
CAUTION: If $X_1 \sim Bin(n, p_1)$ and $X_2 \sim Bin(n, p_2)$, where $p_1 \ne p_2$, then $X_1 + X_2$ is
NOT Binomial.
Why?
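One way to explore this question numerically: the exact p.m.f. of X1 + X2 can be obtained by convolution and compared with the only Binomial candidate that has the same support and mean, namely Bin(2n, (p1 + p2)/2). The sketch below assumes X1 and X2 are independent; the values of n, p1 and p2 are illustrative.

```python
from math import comb

def binomial_pmf_list(n, p):
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def convolve(a, b):
    """p.m.f. of the sum of two independent variables with p.m.f.s a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def max_gap(n, p1, p2):
    """Largest pointwise gap between X1 + X2 and Bin(2n, (p1 + p2) / 2)."""
    sum_pmf = convolve(binomial_pmf_list(n, p1), binomial_pmf_list(n, p2))
    candidate = binomial_pmf_list(2 * n, (p1 + p2) / 2)
    return max(abs(s - c) for s, c in zip(sum_pmf, candidate))

print(max_gap(5, 0.2, 0.7))   # clearly positive: the sum is not Binomial
print(max_gap(5, 0.4, 0.4))   # essentially zero: equal p gives Bin(2n, p)
```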
What is E[ X ] and V [ X ]?
a. At least 2 of the 10 never eat breakfast.
b. The number of women who eat breakfast is at least as large as the number of men
who eat breakfast.
c. What would have been the answer to (a) if the percentages of men and
women who did not eat breakfast were both equal to 20%?
(Ans. a) Hint: X = number of men not eating, Y = number of women not eating.
$P(X + Y \ge 2) = 1 - P(X + Y = 0) - P(X + Y = 1) = 1 - 0.75^5 \cdot 0.8^5 - 5 \cdot 0.75^5 \cdot 0.2 \cdot 0.8^4 - 5 \cdot 0.25 \cdot 0.75^4 \cdot 0.8^5$.
b) Hint: $P(X \ge Y) = \sum_{k=0}^{5} P(X \ge k, Y = k) = \sum_{k=0}^{5} P(X \ge k)\, P(Y = k) = \sum_{k=0}^{5} \sum_{r=k}^{5} P(X = r)\, P(Y = k)$, and use $X \sim Bin(5, .25)$
and $Y \sim Bin(5, .2)$. c) $P(Z \ge 2)$ where $Z \sim Bin(10, .2)$.)
CHAPTER 8
Continuous Random Variables
A good way to understand the idea behind a continuous random variable is by drawing a
parallel with discrete random variables; see table 8.1. We will mainly study uniform,
exponential, normal and gamma distributions. A quick summary is given in table 8.2.
Recall that for any random variable, be it discrete or continuous, the c.d.f. F() is right
continuous. For a continuous random variable we need it to be continuous from
both left and right.
a. Find the c.d.f of X.
b. Find the c.d.f of $Y = X^2$.
c. Find the p.d.f of Y.
Solution:
$\{Y \le y\} \Leftrightarrow \{F(X) \le y\} \Leftrightarrow \{X \le F^{-1}(y)\}$
So, $P(Y \le y) = P(X \le F^{-1}(y)) = F(F^{-1}(y)) = y$.
Hence $Y = F(X) \sim U[0, 1]$.
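Table 8.1: Discrete versus continuous random variables

| | Discrete random variable | Continuous random variable |
|---|---|---|
| Values | e.g. $X \in \{0, 1, 2, \ldots\}$ or $X \in \{\frac{2}{3}, \frac{5}{3}\}$ etc. In general $X \in \{a_1, a_2, \ldots, a_k, \ldots\}$ | e.g. $X \in (0, \infty)$, $X \in (-\infty, \infty)$, $X \in [2, 7]$, $X \in [2, 7] \cup [10, 11]$. In general $X \in S \subseteq \mathbb{R}$ |
| Probability mass function (p.m.f) | $P(X = a_k)$ | not applicable |
| Cumulative distribution function (c.d.f) | $F(x) = P(X \le x) = \sum_{k : a_k \le x} P(X = a_k)$ | $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$ |
| Probability density function (p.d.f) | not applicable | $f(x) = \frac{d}{dx} F(x)$ |
| Probability of a set is obtained by | adding probabilities of individual values | computing the area under the density curve |
| Example distributions | Geometric, Binomial, Hyper-geometric, Negative Binomial, Poisson | Uniform, Exponential, Normal, Gamma |
| $E[X]$ | $\sum_{k} a_k\, P(X = a_k)$ | $\int x f(x)\,dx$ |
| $V[X]$ | $\sum_{k} (a_k - E[X])^2\, P(X = a_k)$ | $\int (x - E[X])^2 f(x)\,dx$ |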
$\{Y \le y\} \Leftrightarrow \{F(X) \le y\} \Leftrightarrow \{X \le x_0\}$, where $x_0 = \sup\{x : F(x) = y\}$
b. Suppose you know how to simulate a random number from U[0, 1], how
would you simulate from F?
- Simulate first a number u from U[0, 1] and then compute $x = F^{-1}(u)$. The
above result says that such a generated x will be a random number from the
distribution F.
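The simulation recipe above can be sketched in a few lines. The Exponential distribution is used here only because its c.d.f. $F(x) = 1 - e^{-\lambda x}$ inverts in closed form, giving $F^{-1}(u) = -\ln(1 - u)/\lambda$; the rate, seed and function name are illustrative assumptions.

```python
import math
import random

def sample_exponential(lam, rng):
    """Inverse-transform sampling: draw u ~ U[0, 1), return F^{-1}(u)."""
    u = rng.random()
    return -math.log(1.0 - u) / lam

rng = random.Random(42)
lam = 2.0                                   # illustrative rate
samples = [sample_exponential(lam, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean, 1 / lam)                 # sample mean vs theoretical mean
```

The same pattern works for any distribution whose c.d.f. can be inverted, analytically or numerically.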
This is perhaps the simplest of the continuous distributions. For example, suppose a
bank knows that the recovery on a loan it has given to a customer, in case there is a
default, is anywhere between 60% to 80% of the outstanding loan amount. Suppose,
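Table 8.2: Summary of continuous distributions

| | Uniform $U[\alpha, \beta]$ | Exponential($\lambda$) | Normal $N(\mu, \sigma^2)$ (mean = $\mu$, variance = $\sigma^2$) |
|---|---|---|---|
| p.d.f | $f(x) = \frac{1}{\beta - \alpha}$ for $\alpha \le x \le \beta$; 0 otherwise | $f(x) = \lambda e^{-\lambda x}$ for $x > 0$; 0 for $x \le 0$ | $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$ |
| c.d.f | $F(x) = 0$ for $x < \alpha$; $\frac{x - \alpha}{\beta - \alpha}$ for $\alpha \le x \le \beta$; 1 for $x > \beta$ | $F(x) = 0$ for $x < 0$; $1 - e^{-\lambda x}$ for $x \ge 0$ | No closed form expression. Need to use tables or computer |
| Mean $E[X]$ | $\frac{\alpha + \beta}{2}$ | $\frac{1}{\lambda}$ | $\mu$ |
| Variance | $\frac{(\beta - \alpha)^2}{12}$ | $\frac{1}{\lambda^2}$ | $\sigma^2$ |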
in the absence of any other information, the bank believes that these values are equally
likely. Then, we are essentially saying that X = Recovery rate on the loan follows a
uniform distribution on the interval [.6, .8], i.e. $X \sim U[.6, .8]$. In general, the uniform
distribution can be on any interval and so in general we have $U[\alpha, \beta]$. The mean, variance,
p.d.f, c.d.f etc. are summarized in table 8.2.
1. Exercise A job takes anywhere between 0 and 1 hour to complete. Assume that
the time to completion is uniformly distributed.
d. Given that a job took less than 20 minutes to complete, what is the expected
that is uniformly distributed between 7 and 8 A.M., and then gets on the first train
that arrives, what is the probability that the passenger travels to A?
(Ans 2/3)
Recall the Geometric distribution, which was a discrete waiting time distribution. The
Exponential is a continuous waiting time distribution. It is the only continuous
distribution on $(0, \infty)$ that satisfies the "Memoryless property" (i.e. $P(X > t + s \mid X > t) = P(X > s)$).
Basically, the fact that an event has not occurred until some time t has
no bearing on the probability of it happening within a further time s. The past is irrelevant
while determining future waiting times. See table 8.2 for details on the p.d.f, c.d.f, etc.
1. Exercise: If X is exponentially distributed show that P( X > t + s| X > t) = P( X >
s)
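A quick numerical check of this identity (not a proof) uses the survival function $P(X > x) = e^{-\lambda x}$, which follows from the exponential c.d.f. The values of the rate, t and s below are illustrative.

```python
import math

def survival(x, lam):
    """P(X > x) = exp(-lam * x) for an Exponential random variable with rate lam."""
    return math.exp(-lam * x)

lam, t, s = 0.5, 3.0, 2.0
conditional = survival(t + s, lam) / survival(t, lam)   # P(X > t + s | X > t)
unconditional = survival(s, lam)                        # P(X > s)
print(conditional, unconditional)
```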
2. Exercise: Jones figures that the total number of thousands of miles that an auto
can be driven before it would need to be junked is exponential with parameter $\frac{1}{20}$.
Smith has a used car that he claims has been driven only 10,000 miles. If
Jones purchases the car, what is the probability that she would get at least 20,000
additional miles out of it?
The Normal distribution is perhaps the most fascinating of all distributions. It makes
an appearance in various interesting examples
The buzz-noise that we often hear while tuning old radios is referred to as "white
noise" and the various noise levels that are part of the buzz would approximately
be normally distributed.
While it is interesting that the normal distribution appears in various situations, there
is no formal or mathematical justification for why that should be the case. In fact, there
is no reason why all data should be normally distributed. However, although all data
need not be normally distributed, the most startling mathematical fact is that averages
of data will be approximately normally distributed. This celebrated mathematical result is known as
the Central Limit Theorem, or CLT for short. The formal statement of the CLT is given
later.
N (0, 1) is known as the Standard Normal Distribution
a. P( X 2).
b. P( X 3)
c. P(3 < X 2)
d. 97.5 percentile of X.
e. 95 percentile of X.
2. Exercise: GMAT scores for a group of students are approximately normally distributed
with mean 580 and SD = 55. All students above a score of 650 are admitted
to a business school.
school?
b. What percentage of the admitted students are expected to have a score over
700?
b, $a^2 \sigma^2$).
Recall that if X and Y are independent, then Cov(X, Y) = 0. However, we know
that Cov(X, Y) = 0 does not necessarily mean that X and Y are independent.
An interesting exception is when (X, Y) follow a bi-variate normal
distribution. In that case independence and zero covariance turn out to be
equivalent. Discussion of the bi-variate normal is beyond the scope of this note.
b. Suppose the contribution per bearing of acceptable size is 10 paise. Those exceeding
.504 cms can be reworked at a cost of 6 paise per piece (resulting in a net
contribution of 4 paise) and those below .496 have to be scrapped, resulting in a
net loss of 20 paise per piece. What is the expected net contribution from a batch
of 10000 pieces?
4. A project has four phases, viz. 1, 2, 3, 4. A phase cannot start until the previous
phase is completed. The time to completion for each phase is believed to be normally
distributed with means 6, 12, 4 and 8 weeks respectively and standard deviations
1, 3, 1 and 2 weeks respectively. Completion times of the different phases
are independent of each other.
a. What is the expected total time and SD of total time for completion of the
project?
b. What is the probability that phase 3 can be started no later than 20 weeks from
the start?
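Then for large n, $\bar{X}$ approximately follows a Normal distribution with mean $\mu$ and variance $\frac{\sigma^2}{n}$.
Stated differently,
$$\frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma} \to N(0, 1) \quad \text{as } n \to \infty.$$
More precisely,
$$\lim_{n \to \infty} P\left(\frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma} \le t\right) = \Phi(t), \quad \text{where } \Phi(\cdot) \text{ is the c.d.f of } N(0, 1)$$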
Caution!: Many people make the common mistake of interpreting the CLT wrongly by
thinking that all data should be approximately normally distributed. That is incorrect.
The result only says that if we were to take averages of data points, the average, which
is also a random variable, is approximately normal.
Step 2. Repeat Step 1 1000 times and record the mean each time.
Step 4. Superimpose the p.d.f of the Normal distribution with mean 50 and variance $\frac{100}{12}$.
Repeat the above exercise with 100 samples instead of 50. Now, what do you
observe?
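The simulation steps can be sketched as below. Because Step 1 is not visible in this copy, the sketch assumes that each sample consists of n draws from U[0, 100] (which has mean 50 and variance $100^2/12$); that choice, the seed and the function name are assumptions. Under them, the variance of a sample mean is $(100^2/12)/n$.

```python
import random
import statistics

def simulate_means(n, reps=1000, low=0.0, high=100.0, seed=1):
    """Repeat `reps` times: draw n values from U[low, high] and record the mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.uniform(low, high) for _ in range(n))
            for _ in range(reps)]

for n in (50, 100):
    means = simulate_means(n)
    theoretical_variance = (100.0 ** 2 / 12) / n
    print(n, statistics.fmean(means), statistics.pvariance(means), theoretical_variance)
```

A histogram of the recorded means (for example with a plotting library of your choice) can then be compared with the Normal p.d.f. having the same mean and variance.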
2. If $X \sim Bin(n, p)$ then for large n, X is approximately normally distributed with mean
b. Per CLT, what is the approximate distribution of the total number of claims
from the portfolio?
4. You manage a sales organization consisting of 100 people. Each sales person is
capable of selling on an average 4 items per month. Based on your experience
you have observed that the standard deviation of sales made is 1.
b. What is the probability that the total sales in the month exceeds 410 items?
(state your assumptions clearly).
(Ans: b) $P\left(Z > \frac{\frac{410}{100} - 4}{\frac{1}{\sqrt{100}}}\right)$)
References
Ross, Sheldon. (2007). Introduction to Probability Models, Ninth Ed, Academic Press Elsevier.
Ross, Sheldon. (2014). A First Course in Probability, Ninth Ed, Pearson Education.
Feller, William. (1993). An Introduction to Probability Theory and Its Applications - Volume I, Ninth Wiley Eastern Reprint, Wiley Eastern Limited.
Feller, William. (1991). An Introduction to Probability Theory and Its Applications - Volume II, Sixth Wiley Eastern Reprint, Wiley Eastern Limited.
Hogg, Robert V. and Tanis, Elliot A. and Rao, Jagan Mohan. (2006). Probability and Statistical Inference, Seventh Ed, Pearson Education.