Winter 2016-2017
Outline
Probability
Conditional Probability
Independence
The foundation
We want to model (describe, analyze, work with) outcomes that are not
deterministic in nature, or at least not tractably deterministic in nature.
We might call such things random.
We need to be precise about terms, though.
Definition
A set, S, that contains all possible outcomes of a given experiment is
called the sample space.
The sample space need not be identically equal to the set of all
possible outcomes (it could be larger).
The sample space should, however, be specified large enough to contain the
set of all possible outcomes as a subset.
Statistics and Econometrics (CAU Kiel)
Some examples
Example (Dice)
If the experiment consists of tossing a die, the sample space contains six
possible outcomes given by S = {1, 2, 3, 4, 5, 6}.
Example (Traffic deaths)
If the experiment consists of recording the number of traffic deaths in
Germany next year, the sample space would contain all nonnegative integers,
S = {0, 1, 2, . . .}.
Example (Light bulbs)
If the experiment consists of observing the length of life of a light bulb,
the sample space would contain all positive real numbers, S = (0, ∞).
The sample space S, like any set, can be classified according to whether the
number of elements in the set is
- finite (discrete sample space), e.g., S = {0, 1, 2, . . . , 6}
- countably infinite (discrete sample space), e.g., S = N = {0, 1, 2, . . .}
- uncountably infinite (continuous sample space), e.g., S = R.
All of these possible outcomes cannot happen at the same time: exactly one
outcome occurs in any run of the experiment.
Definition
An event, say A, is a subset of the sample space S (including S itself).
Tossing dice...
The experiment consists of tossing a die and counting the number of dots
facing up. The sample space is defined to be S = {1, 2, ..., 6}. Consider
the following subsets of S:
A1 = {1, 2, 3},
A2 = {2, 4, 6},
A3 = {6}.
A1 ∩ A3 = ∅ (A1 and A3 have no elements in common).
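These set operations can be checked directly with Python's built-in set type; a minimal sketch whose variable names mirror the slide:

```python
S = {1, 2, 3, 4, 5, 6}
A1 = {1, 2, 3}
A2 = {2, 4, 6}
A3 = {6}

union = A1 | A3          # A1 ∪ A3
intersection = A1 & A3   # A1 ∩ A3: empty, the events are disjoint
complement = S - A2      # complement of A2 relative to S
```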
Probability
Intuitively, not all events are equally likely to happen. We would like to
quantify the differences.
For each event A in S we want to associate a number between 0 and
1 that will be called the probability of A.
For this purpose we will use an appropriate set function, say P(·),
with the set of all events as the domain of P(·).
Unfortunately, uncountable sample spaces cause problems
... in the sense that not all of their subsets can be assigned probabilities in a
consistent manner.
A solution to this issue is built into the definition of an event:
Every event is a subset of S, but not every subset of S is an event!
Event space
Definition
The set of all events in the sample space S is called the event space Y .
The definitions of an event and of the event space as the set of all events
introduced above do not indicate which subsets of S are events belonging
to the event space Y, to which we want to assign probability.
In the following we will use a collection of subsets of S which forms a
sigma algebra in S as our event space Y. A sigma algebra is defined as
follows:
Sigma algebras
Definition
A collection of subsets of S is called a sigma algebra, denoted by B, if it
satisfies the following conditions:
(i) ∅ ∈ B (the empty set is an element of B);
(ii) If A ∈ B, then A^c ∈ B (B is closed under complementation);
(iii) If A1, A2, . . . ∈ B, then ∪_{i=1}^∞ Ai ∈ B (B is closed under countable
unions).
By using De Morgan's laws we obtain from properties (ii) and (iii) that B is
also closed under countable intersections.
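For a finite sample space the three defining conditions can be verified mechanically; a sketch (the function name and example collections are ours, not from the slides). For finite S, closure under countable unions reduces to closure under pairwise unions, so that is what we test:

```python
def is_sigma_algebra(S, B):
    """Check the three defining conditions on a finite collection B of subsets of S."""
    B = {frozenset(A) for A in B}
    if frozenset() not in B:                       # (i) empty set in B
        return False
    if any(frozenset(S) - A not in B for A in B):  # (ii) closed under complementation
        return False
    if any(A | C not in B for A in B for C in B):  # (iii) closed under (pairwise) unions
        return False
    return True

S = {1, 2, 3, 4}
trivial = [set(), S]                # smallest sigma algebra on S
four = [set(), {1, 2}, {3, 4}, S]   # sigma algebra generated by the event {1, 2}
broken = [set(), {1}, S]            # not closed under complementation
```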
Dice again
Toss a fair die and count the number of dots facing up.
The sample space with equally likely outcomes is S = {1, 2, ..., 6}.
We have N (S) = 6. Let Ei (i = 1, ..., 6) denote the elementary
events in S.
According to the classical definition we have
P(Ei) = N(Ei)/N(S) = 1/6,
and
P(S) = N(S)/N(S) = 1.
For an event A containing three of the six outcomes, say A = {2, 4, 6},
we likewise obtain
P(A) = N(A)/N(S) = 3/6 = 1/2.
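The classical computation can be reproduced with exact rational arithmetic; a sketch (`Fraction` keeps the probabilities exact, and the event A is our own illustration):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def P(A):
    """Classical definition: N(A) / N(S) for equally likely outcomes."""
    return Fraction(len(A), len(S))

E = [{i} for i in S]   # the six elementary events
A = {2, 4, 6}          # an even number of dots (illustration)
```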
Shortcomings?
A probability definition which does not suffer from these limitations (the
classical definition requires finitely many, equally likely outcomes) is the
relative frequency probability, which is defined as follows:
Relative frequencies
Definition (Relative frequency probability)
Let n be the number of times that an experiment is repeatedly performed
under identical conditions. Let A be an event in the sample space S, and
define nA to be the number of times in n repetitions of the experiment
that the event A occurs. Then the probability of the event A is given by
the limit of the relative frequency nA/n:
P(A) = lim_{n→∞} nA/n.
Toss a coin with S = {head, tail} several times and record the relative
frequency of heads for n = 100, 500, and 5000 tosses.
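The limiting behaviour can be illustrated by simulation; a sketch assuming a fair coin (the seed is arbitrary, chosen only for reproducibility):

```python
import random

random.seed(12345)

def relative_frequency(n):
    """Toss a fair coin n times; return the fraction of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    return heads / n

for n in (100, 500, 5000):
    print(n, relative_frequency(n))   # frequencies settle near 1/2 as n grows
```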
Subjective probability plays a crucial role in Bayesian statistics and decision
theory.
Probability
Dice...
Let S = {1, 2, ..., 6} be the sample space for rolling a fair die and
observing the number of dots facing up. The set function
P(A) = N(A)/6 for A ⊆ S
satisfies, for pairwise disjoint events A1, . . . , An,
P(∪_{i=1}^n Ai) = N(∪_{i=1}^n Ai)/6 = (Σ_{i=1}^n N(Ai))/6 = Σ_{i=1}^n P(Ai) (additivity).
For a countable sample space, consider S = {1, 2, 3, . . .} with the set
function
P(A) = Σ_{x∈A} (1/2)^x for A ⊆ S.
Then
P(S) = Σ_{x=1}^∞ (1/2)^x = 1 (geom. series),
and for pairwise disjoint events A1, . . . , An,
P(∪_{i=1}^n Ai) = Σ_{x ∈ ∪_{i=1}^n Ai} (1/2)^x = Σ_{i=1}^n Σ_{x∈Ai} (1/2)^x = Σ_{i=1}^n P(Ai) (additivity).
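Additivity of this set function can be checked numerically for finite subsets; a sketch with exact rationals (the events A1, A2, A3 are our own illustration):

```python
from fractions import Fraction

def P(A):
    """P(A) = sum over x in A of (1/2)^x, for finite subsets of S = {1, 2, ...}."""
    return sum(Fraction(1, 2 ** x) for x in A)

A1, A2, A3 = {1}, {2, 3}, {6}   # pairwise disjoint events (illustration)
union = A1 | A2 | A3

# Partial sum of the geometric series for P(S): 1 - (1/2)^50
partial = sum(Fraction(1, 2 ** x) for x in range(1, 51))
```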
Either way, the three axioms imply many properties of the probability
function. See below.
Theorem (1.2)
P(∅) = 0.
Theorem (1.3)
Let A and B be events in S such that A ⊆ B. Then P(A) ≤ P(B) and
P(B \ A) = P(B) − P(A).
Theorem (1.4)
Theorem (1.6)
Let A be an event in S. Then P(A) ∈ [0, 1].
Theorem (1.8)
Let A1, . . . , An be events in S. Then P(∩_{i=1}^n Ai) ≥ 1 − Σ_{i=1}^n P(Ai^c).
Conditional Probability
In more detail
Since B occurs,
A occurs iff A occurs concurrently with B (that is, iff A ∩ B occurs).
Hence, P(A | B) should be proportional to P(A ∩ B).
The division by P(B) ensures that P(A | B), as defined, is a
probability function (see the following theorem).
Theorem (1.10)
Given a probability space {S, Y, P} and an event B for which P(B) ≠ 0,
P(A | B) = P(A ∩ B)/P(B) defines a probability set function with
domain Y.
Example: Coins
The experiment consists of tossing two fair coins. The sample space is
S = {(H,H), (H,T), (T,H), (T,T)}. The conditional probability of the event
of obtaining two heads,
A = {(H,H)},
given that the first coin toss results in heads,
B = {(H,H), (H,T)},
is
P(A | B) = P(A ∩ B)/P(B) = (1/4)/(1/2) = 1/2 (classical def.).
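The coin example can be reproduced by enumeration; a sketch (`cond` is our helper name, not from the slides):

```python
from fractions import Fraction
from itertools import product

S = set(product("HT", repeat=2))    # {('H','H'), ('H','T'), ('T','H'), ('T','T')}

def P(A):
    return Fraction(len(A), len(S))

def cond(A, B):
    """P(A | B) = P(A ∩ B) / P(B); requires P(B) > 0."""
    return P(A & B) / P(B)

A = {("H", "H")}                    # two heads
B = {s for s in S if s[0] == "H"}   # first toss is heads
```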
The multiplication rule allows one to factorize the joint probability of the
events A and B into
- the conditional probability of event A, given event B, and
- the unconditional probability of B.
Theorem (1.11, Multiplication Rule)
Let A and B be any two events in S for which P(B) ≠ 0. Then
P(A ∩ B) = P(A | B)P(B).
Independence
This is important
Definition (Independence of events, two-event case)
Let A and B be two events in S. Then A and B are independent iff
P(A ∩ B) = P(A)P(B). If A and B are not independent, A and B are
said to be dependent events.
Independence of A and B implies
P(A | B) = P(A ∩ B)/P(B) = P(A)P(B)/P(B) = P(A), if P(B) > 0,
P(B | A) = P(B ∩ A)/P(A) = P(B)P(A)/P(A) = P(B), if P(A) > 0.
Thus the probability of event A occurring is unaffected by the occurrence
of event B, and vice versa.
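A quick check of the definition on a two-dice sample space (the events are our own illustration, not from the slides):

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two fair dice

def P(A):
    return Fraction(len(A), len(S))

A = {s for s in S if s[0] % 2 == 0}       # first die even
B = {s for s in S if s[1] == 6}           # second die shows 6: independent of A
C = {s for s in S if s[0] + s[1] == 12}   # sum equals 12: dependent on A
```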
Example: Gender
The work force of a company has the following distribution among type
and gender of workers:

                  Type of worker
Sex        Sales   Clerical   Production    Total
Male         825        675          750    2,250
Female     1,675        825          250    2,750
Total      2,500      1,500        1,000    5,000
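One natural question for such a table is whether sex and worker type are independent when a worker is drawn at random; a sketch (the helper `P` is our own):

```python
from fractions import Fraction

# Counts from the table: (sex, worker type) -> number of workers
counts = {
    ("Male", "Sales"): 825,    ("Male", "Clerical"): 675,    ("Male", "Production"): 750,
    ("Female", "Sales"): 1675, ("Female", "Clerical"): 825,  ("Female", "Production"): 250,
}
total = sum(counts.values())

def P(pred):
    """Probability that a randomly drawn worker satisfies pred(sex, wtype)."""
    return Fraction(sum(n for (s, w), n in counts.items() if pred(s, w)), total)

p_male = P(lambda s, w: s == "Male")
p_sales = P(lambda s, w: w == "Sales")
p_male_and_sales = P(lambda s, w: s == "Male" and w == "Sales")
```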
More events
Definition (Independence of events, n-event case)
Let A1, A2, . . . , An be events in the sample space S. The events
A1, A2, . . . , An are (jointly) independent iff
P(∩_{j∈J} Aj) = Π_{j∈J} P(Aj), for all subsets J ⊆ {1, 2, . . . , n}.
A counterexample
Let the sample space S consist of all permutations of the letters a, b, c
together with the three triples of each letter, that is,
S = {aaa, bbb, ccc, abc, bca, cba, acb, bac, cab}.
Furthermore, let each element of S have probability 1/9. Consider the
events
Ai = {i-th place in the triple is occupied by a}.
According to the classical probability we obtain, for all i = 1, 2, 3,
P(Ai) = 3/9 = 1/3,
and, for all i ≠ j, P(Ai ∩ Aj) = P({aaa}) = 1/9 = P(Ai)P(Aj), so the events
are pairwise independent. However, P(A1 ∩ A2 ∩ A3) = P({aaa}) = 1/9 ≠
1/27 = P(A1)P(A2)P(A3), so they are not jointly independent.
The total probability rule states that the total (unconditional) probability
of an event A can be represented by the sum of the conditional probabilities
of A given the events Bi of a partition of S, weighted by the unconditional
probabilities P(Bi):
P(A) = Σ_{i∈I} P(A | Bi)P(Bi).
A corollary
P(Bj | A) = P(A | Bj)P(Bj) / Σ_{i∈I} P(A | Bi)P(Bi), for j ∈ I.
Hence, Bayes's law provides the means for updating the probability of the
event Bj, given the signal that the event A occurs.
Quite useful
Consider, e.g., a blood test for some disease. Let A be the event that the
test result is positive and B be the event that the individual has the disease.
The test detects the disease with prob. 0.98 if the disease is, in fact,
present in the individual being tested: P(A | B) = 0.98.
The test yields a false positive result for 1 percent of the healthy
subjects: P(A | B^c) = 0.01.
Finally, 0.1 percent of the population has the disease, P(B) = 0.001.
If the test result is positive, what is the probability that a randomly
chosen person being tested actually has the disease?
The application of Bayes's rule yields
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B^c)(1 − P(B))]
         = (.98 × .001) / (.98 × .001 + .01 × .999) ≈ .089.
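The computation can be checked directly; a minimal sketch (plain floats suffice here):

```python
p_disease = 0.001           # P(B): prevalence in the population
p_pos_given_sick = 0.98     # P(A | B): detection probability
p_pos_given_healthy = 0.01  # P(A | B^c): false positive rate

# Bayes's rule: P(B | A) = P(A|B) P(B) / (P(A|B) P(B) + P(A|B^c) (1 - P(B)))
numerator = p_pos_given_sick * p_disease
denominator = numerator + p_pos_given_healthy * (1 - p_disease)
p_sick_given_pos = numerator / denominator
```

Despite the high detection rate, the posterior probability is below 9 percent, because the disease is rare and false positives dominate.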