Kolmogorov Smirnov

KOLMOGOROV-SMIRNOV TEST
An alternative goodness-of-fit test, developed by A. Kolmogorov and N.V. Smirnov, which is based on a comparison of the observed sample cumulative relative frequency distribution with the hypothetical population cumulative distribution function specified by the null hypothesis.
A fully non-parametric test for comparing two distributions.
Advantages of Kolmogorov-Smirnov
It is non-parametric and hence robust
It does not rely only on the location of the mean (as the t-test does)
It works for non-normal data (the t-test can fail if the data are too far from normal)
It is not sensitive to scaling
It is more powerful than the chi-square (χ²) test
However, it is less sensitive than the t-test if the data are indeed normal
The Kolmogorov-Smirnov Test

Formula:
$D = \max_x |F(x) - S(x)|$
where:
F is the population cumulative distribution function
S is the sample cumulative relative frequency distribution function
and D is the largest absolute difference between the cumulative observed and theoretical frequencies
If $D < D_\alpha$, we do not reject H0; if $D > D_\alpha$, we reject H0 in favor of H1.
Here $D_\alpha$ is the critical value obtained from a table at the chosen significance level (alpha), so the hypothesis regarding the distributional form is rejected when the test statistic, D, is greater than that tabulated value.
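The definition of D can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `ks_statistic` and `uniform01_cdf` and the three-point sample are hypothetical and chosen only for the example. Because S is a step function, the sketch compares F with S on both sides of each step.

```python
def ks_statistic(sample, cdf):
    """Return D = max |F(x) - S(x)| for a sample against a hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # S jumps from (i-1)/n to i/n at x, so check both sides of the step
        d = max(d, abs(f - (i - 1) / n), abs(f - i / n))
    return d


def uniform01_cdf(x):
    """CDF of a uniform distribution on [0, 1] (illustrative choice)."""
    return min(max(x, 0.0), 1.0)


# Hypothetical three-point sample, used only to show the call
d = ks_statistic([0.1, 0.4, 0.7], uniform01_cdf)
print(round(d, 3))  # 0.3
```

The value of D returned here would then be compared with the tabulated critical value for the same n and alpha.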
Mr. Bond used a computer to generate n = 20 random numbers. The numbers were supposed to be uniformly distributed between 0 and 10. The 20 sample values xi were put in order and are shown in the table. Mr. Bond is worried that there might be an error in the computer program, and he wants to test the null hypothesis that the sample of data was selected from a uniform distribution.
Let α = .05
i     xi     F (xi/10)    S (i/20)
1     0.8    .08          .05
2     1.6    .16          .10
3     1.7    .17          .15
4     1.9    .19          .20
5     2.3    .23          .25
6     4.0    .40          .30
7     4.5    .45          .35
8     4.7    .47          .40
9     5.3    .53          .45
10    5.4    .54          .50
11    6.2    .62          .55
12    6.4    .64          .60
13    6.7    .67          .65
14    6.8    .68          .70
15    7.9    .79          .75
16    8.4    .84          .80
17    9.0    .90          .85
18    9.1    .91          .90
19    9.7    .97          .95
20    9.8    .98          1.00
To test H0, we need to calculate S and F for each observed value of the random variable X. If X is uniformly distributed between 0 and 10, then F represents the area under the uniform density function between 0 and x. The hypothesized uniform density function has a height of 1/10 and ranges from 0 to 10.
The cumulative distribution function F represents the area under the density function between 0 and x. This area is x/10, so under the null hypothesis we obtain F(xi) = xi/10.
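The area calculation can be written out explicitly. The following is a short worked derivation, assuming the density is exactly 1/10 on the interval from 0 to 10 as stated for the hypothesized distribution; the dummy variable t is introduced only for the integral.

```latex
F(x) = \int_0^x \frac{1}{10}\,dt = \frac{x}{10}, \qquad 0 \le x \le 10,
\qquad\text{for example}\qquad F(4.0) = \frac{4.0}{10} = .40
```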
The sample cumulative relative frequency at the i-th ordered value is S(xi) = i/20, the proportion of the 20 sample values that are less than or equal to xi.
The data indicate that the greatest difference occurs just prior to x = 4.0, where F has already reached .40 while S is still at .25 (its value at x = 2.3).
Solution:
$D = |F - S|$
$D = |.40 - .25|$
$D = .15$
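The hand calculation can be checked with a short script. This is a sketch under the stated null hypothesis F(x) = x/10, using the 20 sample values from the example; the variable names (`xs`, `rows`, `worst`) are illustrative, and exact fractions are used so that ties are not disturbed by floating-point rounding.

```python
from fractions import Fraction

# Mr. Bond's 20 ordered sample values, kept as strings so Fraction stays exact
xs = ["0.8", "1.6", "1.7", "1.9", "2.3", "4.0", "4.5", "4.7", "5.3", "5.4",
      "6.2", "6.4", "6.7", "6.8", "7.9", "8.4", "9.0", "9.1", "9.7", "9.8"]
n = len(xs)

rows = []
for i, x in enumerate(xs, start=1):
    F = Fraction(x) / 10           # hypothesized CDF: F(x) = x/10
    S_before = Fraction(i - 1, n)  # sample CDF just before x_i
    S_at = Fraction(i, n)          # sample CDF at x_i
    gap = max(abs(F - S_before), abs(F - S_at))
    rows.append((gap, i, x))

# max() returns the first row with the largest gap when several rows tie
worst = max(rows, key=lambda r: r[0])
D = worst[0]
print(f"D = {float(D):.2f} at i = {worst[1]}, x = {worst[2]}")
# D = 0.15 at i = 6, x = 4.0
```

The same gap of .15 also appears at x = 4.5 (F = .45 against S = .30 just before it), so the maximum is attained more than once; the script reports the first occurrence.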
LEVEL OF SIGNIFICANCE FOR D = MAXIMUM | F0(X) - Sn(X) |

SAMPLE SIZE
(N)          .20        .15        .10        .05        .01
1            .900       .925       .950       .975       .995
2            .684       .726       .776       .842       .929
3            .565       .597       .642       .708       .828
4            .494       .525       .564       .624       .733
5            .446       .474       .510       .565       .669
6            .410       .436       .470       .521       .618
7            .381       .405       .438       .486       .577
8            .358       .381       .411       .457       .543
9            .339       .360       .388       .432       .514
10           .322       .342       .368       .410       .490
11           .307       .326       .352       .391       .468
12           .295       .313       .338       .375       .450
13           .284       .302       .325       .361       .433
14           .274       .292       .314       .349       .418
15           .266       .283       .304       .338       .404
16           .258       .274       .295       .328       .392
17           .250       .266       .286       .318       .381
18           .244       .259       .278       .309       .371
19           .237       .252       .272       .301       .363
20           .231       .246       .264       .294       .356
25           .210       .220       .240       .270       .320
30           .190       .200       .220       .240       .290
35           .180       .190       .210       .230       .270
OVER 35      1.07/√N    1.14/√N    1.22/√N    1.36/√N    1.63/√N
Since D (.15) is less than the critical value obtained from the table for N = 20 and α = .05 (.294), we do not reject H0.
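The comparison above, together with the large-sample formulas in the last row of the table, can be sketched as follows. The function names `large_sample_critical_value` and `decide` are hypothetical, and the n = 50 case is an invented illustration that is not part of Mr. Bond's problem.

```python
import math

# Numerators from the "over 35" row of the critical-value table
LARGE_N_COEFF = {0.20: 1.07, 0.15: 1.14, 0.10: 1.22, 0.05: 1.36, 0.01: 1.63}


def large_sample_critical_value(n, alpha):
    """Approximate critical value coefficient / sqrt(n), used only for n > 35."""
    if n <= 35:
        raise ValueError("use the tabulated value for n <= 35")
    return LARGE_N_COEFF[alpha] / math.sqrt(n)


def decide(d, critical_value):
    """Apply the decision rule: reject H0 only when D exceeds the critical value."""
    return "reject H0" if d > critical_value else "do not reject H0"


# Mr. Bond's case: n = 20, alpha = .05, so the tabulated value .294 applies
print(decide(0.15, 0.294))  # do not reject H0

# Hypothetical larger sample, n = 50, at the same alpha
print(large_sample_critical_value(50, 0.05))
```

For sample sizes of 35 or fewer the sketch deliberately raises an error, since the table gives exact entries there and the formula is listed only for larger N.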
