# Probability versus Statistical Science

Capture Recapture

## Text: Rice - Mathematical Statistics and Data Analysis

Coverage: At least Chapters 8, 9 & 13 [skip Bayesian sections; Bootstrap (?)]
Exposure to some R
Evaluation: Weekly Quizzes [20%], Mid-term [30%], Final [50%]
Weekly Assignments: Textbook questions on which the quiz is based
Weekly tutorials: Q&A for assigned questions, review of material, quiz
My office hours: SS 6019

Review

## Mathematics: calculus and algebra

Special subset of mathematics:

Probability
Axioms, random variables,
Densities, distributions
Expected value, variance, skewness
Normal theory

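## Axioms: for $\Omega$ and events $A, A_1, A_2 \subseteq \Omega$

$P(\Omega) = 1$
$P(A) \ge 0 \quad \forall A \subseteq \Omega$
$A_1, A_2$ disjoint $\Rightarrow P(A_1 \cup A_2) = P(A_1) + P(A_2)$
Example: density function $f(x)$
$f(x) \ge 0 \quad \forall x \in \mathbb{R}$
$\int_{\mathbb{R}} f(u)\,du = 1$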
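
As a quick numerical illustration of the two density conditions, the sketch below checks them for the standard normal density. The choice of `dnorm` is an assumption made for this example only; the notes do not name a particular density.

```r
# Illustrative check of the two density conditions.
# Assumption: the standard normal density dnorm() stands in for f.
grid <- seq(-5, 5, by = 0.5)

# Condition 1: f(x) >= 0, checked on a grid of points.
nonneg <- all(dnorm(grid) >= 0)

# Condition 2: the density integrates to 1 over the real line (numerically).
total <- integrate(dnorm, -Inf, Inf)$value

print(nonneg)
print(total)
```

A grid check is not a proof of non-negativity; it only illustrates what the condition says.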

## Bits [0-1] transmitted over noisy channel

Each bit is flipped with probability p
To improve communication the receiver uses a majority decoder
Each bit is sent an odd number of times, say n = 5
Let X be the number of times the bit is flipped
So the bit is communicated correctly if $X \le 2$
Say we know from the physical properties of the channel that p = 0.1
Then $X \sim \text{Bin}(5, 0.1)$ and $P(X \le 2) = 0.9914$
Transmission of the bit has improved from a 90% to an over 99% success rate
Simple probability has been helpful in improving communication
No guessing: we're not answering anything that was previously unknown
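
The majority-decoder calculation can be reproduced in R. This is a minimal sketch using the values from the slide (p = 0.1, n = 5); the variable names are chosen here for illustration.

```r
p <- 0.1   # probability that a single transmission is flipped
n <- 5     # number of times the bit is sent

# Majority decoding succeeds when at most 2 of the 5 copies are flipped.
p_correct <- pbinom(2, size = n, prob = p)

# The same probability written out term by term from the binomial formula.
k <- 0:2
p_manual <- sum(choose(n, k) * p^k * (1 - p)^(n - k))

# Success rate for a single transmission versus the majority decoder.
print(c(single = 1 - p, majority = p_correct))
```

Both routes give 0.99144, which rounds to the 0.9914 quoted above.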

## Probability vs. Statistical Science

Parameter of scientific interest: $\theta$
Data $X$ from a process modelled using $\theta$
Probability:
$\theta$: Fixed and known
$X$: Random and unknown
Straightforward [no inference]

Statistical Inference:
$\theta$: Unknown (Fixed or Random?)
$X$: Observed, so fixed and known
Inference for $\theta$: Guess in a formal or disciplined way
Philosophical question: If $\theta$ is unknown, is it fixed or random?
Subjective or Objective reality; Bayesian or Frequentist
Inference required through [statistical/quantitative] reasoning/thinking
Struggle with the nature of evidence and truth in the face of uncertainty

A Wildlife Study

## For the population

Wildlife population: caribou
Population size: N
Capture caribou and tag them: T
Release the caribou
Wait
The population now holds T tagged and N − T untagged caribou

## For the recaptured caribou

Recapture n caribou
Count how many are tagged: t
The recapture holds t tagged and n − t untagged caribou
What's the distribution of t?

$t \sim \text{Hypergeometric}[N, T, n]$
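
One way to see why the tagged count is hypergeometric is to simulate the recapture step directly. The sketch below assumes illustrative sizes N = 100, T = 20, n = 10 and an arbitrary seed; the names `herd` and `n_tagged` are introduced here for the example.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
N <- 100; n_tagged <- 20; n <- 10    # illustrative sizes (assumed)

# The herd after tagging: 1 = tagged, 0 = untagged.
herd <- c(rep(1, n_tagged), rep(0, N - n_tagged))

# Each replicate recaptures n animals without replacement and counts the tags.
t_sim <- replicate(10000, sum(sample(herd, n)))

# Compare simulated relative frequencies with the hypergeometric probabilities.
sim_freq <- as.numeric(table(factor(t_sim, levels = 0:n))) / length(t_sim)
exact <- dhyper(0:n, n_tagged, N - n_tagged, n)
print(round(cbind(t = 0:n, simulated = sim_freq, exact = exact), 3))
```

The simulated column should sit close to the exact one, with differences that shrink as the number of replicates grows.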

## The HyperGeometric Distribution

$t \sim \text{Hypergeometric}[N, T, n]$

$$\Pr_N(t) = \frac{\binom{T}{t}\binom{N-T}{n-t}}{\binom{N}{n}}$$

Suppose N = 100
We tag T = 20
Recapture n = 10
Have $t \in \{0, 1, \ldots, 10\}$

    > phyper(c(2,6),20,80,10)
    [1] 0.6812201 0.9996083
    > phyper(2,20,80,10)-phyper(1,20,80,10)
    [1] 0.3181706
    > phyper(6,20,80,10)-phyper(5,20,80,10)
    [1] 0.00354136
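
The differences of `phyper` values above are point probabilities, so they can be cross-checked against `dhyper` and against the binomial-coefficient formula. In this sketch `hyper_pmf` is a hypothetical helper written for the comparison; R's hypergeometric functions take their arguments in the order (t, T, N − T, n).

```r
N <- 100; n_tagged <- 20; n <- 10

# Hypothetical helper: the hypergeometric probability from binomial coefficients.
hyper_pmf <- function(t, N, n_tagged, n) {
  choose(n_tagged, t) * choose(N - n_tagged, n - t) / choose(N, n)
}

t_vals <- c(2, 6)
by_formula <- hyper_pmf(t_vals, N, n_tagged, n)
by_dhyper  <- dhyper(t_vals, n_tagged, N - n_tagged, n)
# Point probability as a difference of cumulative probabilities.
by_diff    <- phyper(t_vals, n_tagged, N - n_tagged, n) -
              phyper(t_vals - 1, n_tagged, N - n_tagged, n)

print(rbind(by_formula, by_dhyper, by_diff))
```

All three rows should agree up to floating-point rounding.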

The Density
Plot $\Pr_N(t)$

[Figure: HyperGeometric Probability (vertical axis) against t, # tagged (horizontal axis)]

Suppose $t_{obs} = 6$
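
A plot of this kind can be sketched in base R. The code assumes the values N = 100, T = 20, n = 10 and borrows the axis labels from the slide; the plotting style (`type = "h"`) is a guess, not necessarily what the original figure used.

```r
# Assumed values: N = 100, T = 20, n = 10.
n_tagged <- 20; n_untagged <- 80; n <- 10

t_vals <- 0:n
pr <- dhyper(t_vals, n_tagged, n_untagged, n)

# Vertical spikes at each possible value of t.
plot(t_vals, pr, type = "h",
     xlab = "t # tagged", ylab = "HyperGeometric Probability")

print(round(setNames(pr, t_vals), 4))
```

Under these values an observation of 6 tagged animals falls far out in the right tail of the distribution.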

## Suppose $t_{obs} = 6$: do we really know $N$?

[Figure: four panels of HyperGeometric Probability against t, # tagged, for N = 40, 60, 80 and 100]

[Figure: four panels of HyperGeometric Probability against t, # tagged, for N = 25, 30, 35 and 40]
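
To summarise how the distribution of t shifts with N, the sketch below tabulates the expected and most probable number of tagged animals for each population size in the panels. It assumes T = 20 and n = 10 throughout and uses the standard hypergeometric mean nT/N.

```r
n_tagged <- 20; n <- 10             # assumed tagging and recapture sizes
N_panels <- c(25, 30, 35, 40, 60, 80, 100)

# Mean of the hypergeometric distribution: n * T / N.
expected_t <- n * n_tagged / N_panels

# Most probable value of t for each N, found by direct search over 0..n.
modal_t <- sapply(N_panels, function(N) {
  which.max(dhyper(0:n, n_tagged, N - n_tagged, n)) - 1
})

print(data.frame(N = N_panels,
                 expected_t = round(expected_t, 2),
                 modal_t = modal_t))
```

An observed count of 6 is typical for populations near N = 35 and unusual for N = 100.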

The Likelihood
Wait a minute!!
Why are we plotting $t$ vs $\Pr_N(t)$ for various $N$?
Let's plot $N$ vs $\Pr_N(t)$ at $t = t_{obs} = 6$

[Figure: HyperGeometric Probability (Likelihood) against N, Population Size]
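
Holding the observation fixed and varying N turns the same probability into a function of the unknown. A minimal sketch, assuming T = 20, n = 10 and a handful of illustrative population sizes; `likelihood` is a name introduced here.

```r
n_tagged <- 20; n <- 10; t_obs <- 6   # assumed design and observed count

# Probability of the observed count, viewed as a function of N.
likelihood <- function(N) dhyper(t_obs, n_tagged, N - n_tagged, n)

N_grid <- c(40, 60, 80, 100)          # illustrative candidate sizes
lik <- likelihood(N_grid)

print(signif(setNames(lik, N_grid), 3))
```

Across these four candidates the probability of seeing 6 tagged animals falls steadily as N grows.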

The Likelihood

## Repeat the plot for $N \in \{20, \ldots, 100\}$

[Figure: HyperGeometric Probability (Likelihood) against N, Population Size, for N from 20 to 100]
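
The full curve can be sketched the same way. The code assumes T = 20, n = 10 and an observed count of 6, and it reports the N at which the curve peaks purely as a summary of the plot.

```r
n_tagged <- 20; n <- 10; t_obs <- 6   # assumed design and observed count
N_grid <- 20:100

# Likelihood of each candidate N given the observed count.
# For N < 24 the observation is impossible, and dhyper returns 0.
lik <- dhyper(t_obs, n_tagged, N_grid - n_tagged, n)

plot(N_grid, lik, type = "h",
     xlab = "N Population Size",
     ylab = "HyperGeometric Probability [Likelihood]")

# The candidate N with the largest likelihood.
N_hat <- N_grid[which.max(lik)]
print(N_hat)   # 33
```

The peak sits where the recaptured fraction tagged, 6/10, comes closest to the population fraction tagged, 20/N.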

Evidence of discrimination

## 48 Files: 24 women, 24 men

Randomly assigned to 48 male supervisors

| | Promote | Hold | Total |
|---|---|---|---|
| Male | 21 | 3 | 24 |
| Female | 14 | 10 | 24 |
| Total | 35 | 13 | 48 |

## Is there any evidence of bias?

Could 21 & 14 happen by chance?

Evidence of discrimination

## Could 21 & 14 happen by chance?

Sure! In fact, so could 24 & 0
But unlikely in the absence of bias
But is 21 & 14 unlikely?
We need to quantify this
Attach probabilities to outcomes
In the absence of bias

## Assume there's no bias

Then we simply have 35 promoters
And 21 of them were given male files
What's the probability of this?
And from this probability what can we infer about the presence or absence of bias?
This is Statistical Inference

Modelling

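## Have a population of 48 assessors

Of these 35 are promoters and 13 are not
Let $X$ be the number of promoters among the 24 assessors given male files
$X \sim \text{Hypergeometric}[48, 35, 24]$
But only if there's no bias!!!
So we can now compute probabilities and, in a disciplined or formal way, consider the question: is $X = 21$ evidence of bias?

$$P(X = 21) = \frac{\binom{35}{21}\binom{13}{3}}{\binom{48}{24}} = .021$$

| x | P(X=x) |
|---|---|
| 18 | 0.24 |
| 19 | 0.16 |
| 20 | 0.07 |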
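
These probabilities can be reproduced with `dhyper`, using 35 promoters, 13 non-promoters and 24 male files. The upper-tail probability at the end is an addition made here as one possible summary of how unusual 21 is; it does not appear on the slide.

```r
# X ~ Hypergeometric[48, 35, 24] under the no-bias assumption:
# 35 promoters, 13 non-promoters, 24 assessors given male files.
x <- 18:24
pmf <- dhyper(x, 35, 13, 24)
print(round(setNames(pmf, x), 3))

# Probability of exactly 21 promotions among the male files.
p_21 <- dhyper(21, 35, 13, 24)

# Probability of 21 or more (added here): P(X >= 21) = P(X > 20).
p_tail <- phyper(20, 35, 13, 24, lower.tail = FALSE)

print(c(p_21 = p_21, p_tail = p_tail))
```

Whether a probability of this size counts as evidence of bias is exactly the inferential question the slides raise.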
