4. Hypothesis testing
Goals
We have seen the basic notions of hypothesis testing:
I Hypotheses H0/H1
I Type 1/Type 2 errors, level and power
I Test statistics and rejection region
I p-value
Clinical trials
Notation and modelling
Hypothesis testing
I Hypotheses:
  H0 : Δd = Δc vs. H1 : Δd > Δc
I We have X̄n ∼ N(Δd, σd²/n) and Ȳm ∼ N(Δc, σc²/m), independent.
I Therefore
  (X̄n − Ȳm − (Δd − Δc)) / √(σd²/n + σc²/m) ∼ N(0, 1)
Asymptotic test
I Assume that m = cn and n → ∞
I Using Slutsky's lemma, we also have
  (X̄n − Ȳm − (Δd − Δc)) / √(σ̂d²/n + σ̂c²/m) →(d) N(0, 1) as n → ∞,
where
  σ̂d² = (1/n) Σ_{i=1}^n (Xi − X̄n)²  and  σ̂c² = (1/m) Σ_{i=1}^m (Yi − Ȳm)²
I In the cholesterol example:
  (156.4 − 132.7) / √(5198.4/70 + 3867.0/50) = 1.57
I We get
  p-value = IP(Z > 1.57) = 0.0582
Small sample size
The χ² distribution
Definition
For a positive integer d, the χ² (pronounced "kai-squared") distribution with d degrees of freedom is the law of the random variable Z1² + Z2² + · · · + Zd², where Z1, . . . , Zd iid∼ N(0, 1).
Examples:
I If Z ∼ Nd(0, Id), then ‖Z‖2² ∼ χ²d.
I χ²2 = Exp(1/2).
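The second example can be checked by simulation. A minimal sketch (assuming numpy and scipy are available): build χ² samples with two degrees of freedom directly from the definition and compare them with exponential draws of mean 2, i.e. rate 1/2.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000

# chi-squared(2) built from the definition: Z1^2 + Z2^2 for iid N(0,1)
z = rng.standard_normal((n, 2))
chi2_samples = (z ** 2).sum(axis=1)

# Exp(1/2) has rate 1/2, i.e. mean 2 (numpy parametrizes by the scale = mean)
exp_samples = rng.exponential(scale=2.0, size=n)

# A two-sample KS test should find no difference between the two laws
stat, p = stats.ks_2samp(chi2_samples, exp_samples)
```

With 100,000 draws from each law the KS statistic is tiny, consistent with the two distributions being equal.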
[Figure: densities of the χ² distribution for df = 1, 2, 3, 4, 5, 10, 20.]
Properties of the χ² distribution (2)
Definition
For a positive integer d, the χ² (pronounced "kai-squared") distribution with d degrees of freedom is the law of the random variable Z1² + Z2² + · · · + Zd², where Z1, . . . , Zd iid∼ N(0, 1).
Properties: If V ∼ χ²k, then
I IE[V] = k
I var[V] = 2k
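These two moments are easy to confirm by simulation; a sketch in which k = 7 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 7          # degrees of freedom (arbitrary for the demo)
n = 200_000    # Monte Carlo sample size

# chi-squared(k) from the definition: sum of k squared standard normals
v = (rng.standard_normal((n, k)) ** 2).sum(axis=1)

print(v.mean())        # close to k
print(v.var(ddof=1))   # close to 2k
```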
Important example: the sample variance
I We often prefer the unbiased estimator of σ²:
  S̃n² = (1/(n − 1)) Σ_{i=1}^n (Xi − X̄n)²
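In numpy the two estimators differ only in the `ddof` argument; a small sketch with made-up data:

```python
import numpy as np

x = np.array([1.2, 0.7, 3.1, 2.4, 1.9])  # toy data
n = len(x)

# Biased (plug-in) estimator: divides the sum of squared deviations by n
s2_biased = ((x - x.mean()) ** 2).sum() / n
# Unbiased estimator: divides by n - 1
s2_unbiased = ((x - x.mean()) ** 2).sum() / (n - 1)

# numpy: ddof=0 (the default) is the biased version, ddof=1 the unbiased one
assert np.isclose(s2_biased, np.var(x))
assert np.isclose(s2_unbiased, np.var(x, ddof=1))
```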
Student’s T distribution
Definition
For a positive integer d, the Student’s T distribution with d degrees of freedom (denoted by td) is the law of the random variable Z / √(V/d), where Z ∼ N(0, 1), V ∼ χ²d and Z ⊥⊥ V (Z is independent of V).
Who was Student?
Student’s T test (one sample, two-sided)
I Let X1, . . . , Xn iid∼ N(µ, σ²) where both µ and σ² are unknown
I We want to test:
  H0 : µ = 0 vs. H1 : µ ≠ 0
I Test statistic:
  Tn = √n X̄n / √S̃n ∼ tn−1 under H0,
where S̃n is the unbiased variance estimator.
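A sketch of this test in Python on simulated data; scipy's `ttest_1samp` should reproduce the hand-computed statistic and two-sided p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=0.0, scale=2.0, size=15)  # small sample; H0: mu = 0 holds
n = len(x)

# Tn = sqrt(n) * Xbar / sqrt(unbiased variance estimator)
t_stat = np.sqrt(n) * x.mean() / np.sqrt(x.var(ddof=1))
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_val = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# scipy's built-in one-sample t test should agree
t_ref, p_ref = stats.ttest_1samp(x, popmean=0.0)
```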
Student’s T test (one sample, one-sided)
I We want to test:
  H0 : µ ≤ µ0 vs. H1 : µ > µ0
I Test statistic:
  Tn = √n (X̄n − µ0) / √S̃n ∼ tn−1
under H0.
I Student’s test with (non-asymptotic) level α ∈ (0, 1):
  ψα = 1I{Tn > qα},
where qα is the (1 − α)-quantile of tn−1.
Two-sample T-test
I Back to our cholesterol example. What happens for small sample sizes?
I We want to know the distribution of
  (X̄n − Ȳm − (Δd − Δc)) / √(σ̂d²/n + σ̂c²/m)
I We have approximately
  (X̄n − Ȳm − (Δd − Δc)) / √(σ̂d²/n + σ̂c²/m) ∼ tN
where
  N = (σ̂d²/n + σ̂c²/m)² / (σ̂d⁴/(n²(n − 1)) + σ̂c⁴/(m²(m − 1))) ≥ min(n, m)
(Welch–Satterthwaite formula)
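A sketch of Welch's two-sample test on simulated data. Note one assumption: the variances below are the unbiased (ddof=1) estimators, which is what scipy's `ttest_ind(..., equal_var=False)` uses in the Welch–Satterthwaite formula.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(1.0, 2.0, size=12)  # small samples with unequal variances
y = rng.normal(1.0, 3.0, size=9)

n, m = len(x), len(y)
vx, vy = x.var(ddof=1), y.var(ddof=1)  # unbiased variance estimators

# Welch-Satterthwaite degrees of freedom
N = (vx / n + vy / m) ** 2 / (vx ** 2 / (n ** 2 * (n - 1)) + vy ** 2 / (m ** 2 * (m - 1)))

t_stat = (x.mean() - y.mean()) / np.sqrt(vx / n + vy / m)
p_val = 2 * stats.t.sf(abs(t_stat), df=N)  # two-sided p-value from t_N

# scipy's Welch test (equal_var=False) follows the same recipe
t_ref, p_ref = stats.ttest_ind(x, y, equal_var=False)
```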
Non-asymptotic test
I Example: n = 70, m = 50, X̄n = 156.4, Ȳm = 132.7, σ̂d² = 5198.4, σ̂c² = 3867.0:
  (156.4 − 132.7) / √(5198.4/70 + 3867.0/50) = 1.57
  The Welch–Satterthwaite formula gives N ≈ 113.78, which we round to 113.
I We get
  p-value = IP(T113 > 1.57) = 0.0596
Discussion
A test based on the MLE
I If H0 is true, then
  √n I(θ0)^{1/2} (θ̂nMLE − θ0) →(d) Nd(0, Id) as n → ∞
Wald’s test
I Hence,
  Tn = n (θ̂nMLE − θ0)ᵀ I(θ̂nMLE) (θ̂nMLE − θ0) →(d) χ²d as n → ∞
I Wald’s test with asymptotic level α ∈ (0, 1):
  ψ = 1I{Tn > qα},
where qα is the (1 − α)-quantile of χ²d.
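A sketch of Wald's test in the simplest one-dimensional case, a Bernoulli parameter with simulated data; here the Fisher information is I(θ) = 1/(θ(1 − θ)).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 500
theta0 = 0.5
x = rng.binomial(1, theta0, size=n)  # simulate under H0: theta = 0.5

theta_hat = x.mean()                            # MLE of the Bernoulli parameter
fisher = 1.0 / (theta_hat * (1.0 - theta_hat))  # plug-in Fisher information

# Wald statistic (d = 1): n (theta_hat - theta0)^2 I(theta_hat), ~ chi2(1) under H0
T = n * (theta_hat - theta0) ** 2 * fisher

alpha = 0.05
q_alpha = stats.chi2.ppf(1 - alpha, df=1)  # (1 - alpha)-quantile of chi2(1)
reject = T > q_alpha
```

In one dimension the Wald statistic is just the square of the usual z-statistic, which is an easy sanity check.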
A test based on the log-likelihood
I Consider an i.i.d. sample X1, . . . , Xn with statistical model (E, (IPθ)θ∈Θ), where Θ ⊆ IRd (d ≥ 1).
I Suppose the null hypothesis has the form
  H0 : (θr+1, . . . , θd) = (θr+1(0), . . . , θd(0)),
for some fixed and given numbers θr+1(0), . . . , θd(0).
I Let
  θ̂n = argmax_{θ∈Θ} ℓn(θ)  (MLE)
and
  θ̂nc = argmax_{θ∈Θ0} ℓn(θ)  (“constrained MLE”)
where Θ0 = {θ ∈ Θ : (θr+1, . . . , θd) = (θr+1(0), . . . , θd(0))}.
Likelihood ratio test
Test statistic:
  Tn = 2 (ℓn(θ̂n) − ℓn(θ̂nc)).
Wilks’ Theorem
Assume H0 is true and the MLE technical conditions are satisfied. Then,
  Tn →(d) χ²d−r as n → ∞.
I Likelihood ratio test with asymptotic level α ∈ (0, 1):
  ψ = 1I{Tn > qα},
where qα is the (1 − α)-quantile of χ²d−r.
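A sketch of the likelihood ratio test in the simplest setting, a Bernoulli model whose null fixes the parameter completely (so d − r = 1; the data are simulated):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 1000
x = rng.binomial(1, 0.5, size=n)  # simulate under H0: p = 0.5

def loglik(p, x):
    # Bernoulli log-likelihood
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

p_hat = x.mean()  # unconstrained MLE
p0 = 0.5          # constrained MLE: H0 leaves nothing free

Tn = 2 * (loglik(p_hat, x) - loglik(p0, x))
# Wilks: under H0, Tn is asymptotically chi-squared with 1 degree of freedom
p_val = stats.chi2.sf(Tn, df=1)
```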
Implicit hypotheses
I Let X1, . . . , Xn be i.i.d. random variables and let θ ∈ IRd be a parameter associated with the distribution of X1 (e.g. a moment, the parameter of a statistical model, etc.)
Delta method
I Delta method:
  √n (g(θ̂n) − g(θ)) →(d) Nk(0, Γ(θ)) as n → ∞,
where Γ(θ) = ∇g(θ)ᵀ Σ(θ) ∇g(θ) ∈ IRk×k.
Wald’s test for implicit hypotheses
I Test with asymptotic level α:
  ψ = 1I{Tn > qα},
Goodness of fit
Goodness of fit tests
These are all goodness of fit (GoF) tests: we want to know if the
hypothesized distribution is a good fit for the data.
The zodiac sign of the most powerful people is....
Can your zodiac sign predict how successful you will be later in life? Fortune magazine collected the signs of 256 heads of the Fortune 500. Fyi: 256/12 = 21.33.

Sign         Count
Aries          23
Taurus         20
Gemini         18
Cancer         23
Leo            20
Virgo          19
Libra          18
Scorpio        21
Sagittarius    19
Capricorn      22
Aquarius       24
Pisces         29
The zodiac sign of the most
successful people is....
Discrete distribution
I Let E = {a1, . . . , aK} be a finite sample space and consider the family of probability distributions on E indexed by
  ΔK = { p = (p1, . . . , pK) ∈ (0, 1)K : Σ_{j=1}^K pj = 1 }.
I For p ∈ ΔK and X ∼ IPp,
  IPp[X = aj] = pj, j = 1, . . . , K.
Goodness of fit test
I Let X1, . . . , Xn iid∼ IPp, for some unknown p ∈ ΔK, and let p0 ∈ ΔK be fixed.
I We want to test:
  H0 : p = p0 vs. H1 : p ≠ p0
Multinomial likelihood
The likelihood of the model is
  Ln(X1, . . . , Xn, p) = p1^N1 p2^N2 · · · pK^NK,
where Nj = #{i = 1, . . . , n : Xi = aj}. The MLE is p̂j = Nj/n, j = 1, . . . , K.
χ² test
I If H0 is true, then √n(p̂ − p0) is asymptotically normal, and the following holds.
Theorem
  Tn = n Σ_{j=1}^K (p̂j − pj0)² / pj0 →(d) χ²K−1 as n → ∞.
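Applied to the zodiac counts from the earlier slide (a sketch; scipy's `chisquare` computes the same statistic directly from the raw counts):

```python
import numpy as np
from scipy import stats

# Fortune-500 zodiac counts from the slide (they sum to 256)
counts = np.array([23, 20, 18, 23, 20, 19, 18, 21, 19, 22, 24, 29])
n = counts.sum()
K = len(counts)
p0 = np.full(K, 1.0 / K)  # H0: every sign is equally likely

p_hat = counts / n
Tn = n * np.sum((p_hat - p0) ** 2 / p0)  # chi-squared statistic
p_val = stats.chi2.sf(Tn, df=K - 1)      # asymptotic chi2(K-1) p-value

stat_ref, p_ref = stats.chisquare(counts)  # same test from raw counts
```

The p-value here is large, so the test finds no evidence that the signs of powerful people deviate from the uniform distribution.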
CDF and empirical CDF
Let X1, . . . , Xn be i.i.d. real random variables. Recall the cdf of X1 is defined as:
  F(t) = IP[X1 ≤ t], ∀t ∈ IR.
Definition
The empirical cdf of the sample X1, . . . , Xn is defined as:
  Fn(t) = (1/n) Σ_{i=1}^n 1I{Xi ≤ t} = #{i = 1, . . . , n : Xi ≤ t} / n, ∀t ∈ IR.
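The empirical cdf is one line of numpy; a sketch with toy data:

```python
import numpy as np

def ecdf(sample, t):
    # Fn(t) = (number of observations <= t) / n
    sample = np.asarray(sample)
    return np.mean(sample <= t)

x = [2.0, 1.0, 3.0, 2.0]
print(ecdf(x, 0.5))  # 0.0: no observation is <= 0.5
print(ecdf(x, 2.0))  # 0.75: three of the four observations are <= 2
print(ecdf(x, 5.0))  # 1.0: all observations are <= 5
```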
Consistency
Asymptotic normality
Donsker’s Theorem
If F is continuous, then
  √n sup_{t∈IR} |Fn(t) − F(t)| →(d) sup_{0≤t≤1} |B(t)| as n → ∞,
where B is a Brownian bridge on [0, 1].
Goodness of fit for continuous distributions
Kolmogorov-Smirnov test
I Let Tn = sup_{t∈IR} √n |Fn(t) − F0(t)|.
I By Donsker’s theorem, if H0 is true, then Tn →(d) Z as n → ∞, where Z has a known distribution (the supremum of a Brownian bridge).
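A sketch of the KS statistic computed by hand and checked against scipy's `kstest`; since Fn is a step function, the supremum is attained by comparing F0 with Fn just before and at each order statistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.normal(size=200)  # test H0: standard normal (true here)
n = len(x)

xs = np.sort(x)
F0 = stats.norm.cdf(xs)
Fn_at = np.arange(1, n + 1) / n   # Fn at each order statistic
Fn_before = np.arange(0, n) / n   # Fn just before each order statistic

D = max(np.max(Fn_at - F0), np.max(F0 - Fn_before))  # sup |Fn - F0|
Tn = np.sqrt(n) * D                                  # the scaled statistic

D_ref, p_ref = stats.kstest(x, "norm")  # scipy reports the unscaled D
```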
Computational issues
Pivotal distribution
I Let Ui = F0(Xi) for i = 1, . . . , n.
I If H0 is true, then U1, . . . , Un i.i.d.∼ U([0, 1]),
and Tn = sup_{0≤x≤1} √n |Gn(x) − x|, where Gn is the empirical cdf of U1, . . . , Un.
Quantiles and p-values
I Estimate the (1 − α)-quantile qα(n) of Tn by taking the sample (1 − α)-quantile q̂α(n,M) of Tn1, . . . , TnM.
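Since Tn is pivotal under H0, its quantiles can be simulated from uniform samples alone; a sketch with n = 50 and M = 2000 replications:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50       # size of each simulated sample
M = 2000     # number of Monte Carlo replications
alpha = 0.05

def ks_uniform(u):
    # sqrt(n) * sup |Gn(x) - x| for a uniform sample (pivotal under H0)
    m = len(u)
    us = np.sort(u)
    up = np.arange(1, m + 1) / m   # Gn at each order statistic
    lo = np.arange(0, m) / m       # Gn just before each order statistic
    return np.sqrt(m) * max(np.max(up - us), np.max(us - lo))

# Simulate M copies of Tn under H0, then take the sample (1 - alpha)-quantile
T = np.array([ks_uniform(rng.uniform(size=n)) for _ in range(M)])
q_hat = np.quantile(T, 1 - alpha)
```

For α = 0.05 the estimate should land near the asymptotic Kolmogorov value of about 1.36.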
K-S table
[Table: Kolmogorov–Smirnov critical values dα(n) of the maximum absolute difference between the sample cdf Fn(x) and the population cdf F(x), for levels of significance α = 0.10, 0.05, 0.02, 0.01 and numbers of trials n.]
Other goodness of fit tests
I Cramér-von Mises:
  d²(Fn, F) = ∫_IR [Fn(t) − F(t)]² dt
I Anderson-Darling:
  d²(Fn, F) = ∫_IR [Fn(t) − F(t)]² / (F(t)(1 − F(t))) dt
Composite goodness of fit tests
The Kolmogorov-Smirnov statistic becomes
  sup_t √n |Fn(t) − Φµ̂,σ̂²(t)|,
where
  µ̂ = X̄n, σ̂² = Sn²,
and Φµ̂,σ̂² is the cdf of N(µ̂, σ̂²).
Kolmogorov-Lilliefors test (1)
K-L table
[Table: Kolmogorov-Lilliefors critical values, obtained by Monte Carlo calculations using 1,000 or more samples for each value of N.]
Quantile-Quantile (QQ) plots (1)
I Provide a visual way to perform GoF tests
I Not a formal test, but a quick and easy check to see if a distribution is plausible.
I Main idea: we want to check visually if the plot of Fn is close to that of F or, equivalently, if the plot of Fn−1 is close to that of F−1.
I More convenient to check if the points
  (F−1(1/n), Fn−1(1/n)), (F−1(2/n), Fn−1(2/n)), . . . , (F−1((n−1)/n), Fn−1((n−1)/n))
are near the line y = x.
I Fn is not technically invertible, but we define
  Fn−1(i/n) = X(i), the i-th smallest observation.
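The QQ points above can be computed directly; a sketch using a Gaussian F (with matplotlib one would then scatter `theo` against `emp` and overlay y = x):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n = 500
x = np.sort(rng.normal(size=n))  # order statistics X_(1) <= ... <= X_(n)

probs = np.arange(1, n) / n        # i/n for i = 1, ..., n-1
theo = stats.norm.ppf(probs)       # F^{-1}(i/n): theoretical quantiles
emp = x[: n - 1]                   # Fn^{-1}(i/n) = X_(i): empirical quantiles

# For data really drawn from F, the points (theo, emp) hug the line y = x
corr = np.corrcoef(theo, emp)[0, 1]
```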
Quantile-Quantile (QQ) plots (3)
Figure 2: QQ-plots for samples of sizes 10, 50, 100, 1000, 5000, and 10000 from a t15 distribution. The upper-left figure is for sample size 10; the lower-right is for sample size 10000.