
Nonparametric methods (cont.)

Session XX

Problem 2
A manufacturer of toys changed the type of plastic molding machines it was using because a new one gave evidence of being more economical. As the Christmas season began, however, productivity seemed somewhat lower than last year. Because production records of the past years were readily available, the production manager decided to compare the monthly output for the 15 months when the old machines were used and the 11 months of production so far this year. Records show these output amounts with the old and new machines.
Monthly output in units

Old machines (15 months): 992, 945, 938, 1027, 892, 983, 1014, 1258, 966, 889, 972, 940, 873, 1016, 897

New machines (11 months): 965, 1054, 912, 850, 796, 911, 877, 902, 956, 900, 938

Problem 2: solution
$H_0: m_{old} = m_{new}$ vs $H_1: m_{old} > m_{new}$

$m_{old}$: average (median) output with the old machines
$m_{new}$: average (median) output with the new machines

Mann-Whitney U-test
For comparing the averages of two populations.
Combine all the data and rank them.
The smallest observation is assigned rank 1, the second smallest rank 2, and so on; the highest observation is ranked n.
Give the mid-rank in case of ties.

If the average of the first population is smaller, the rank sum of the observations from the first population will tend to be small.
Sample sizes need to be 10 or more for the normal approximation (after standardizing). Tables are available for smaller samples.
Also known as the Wilcoxon rank-sum test.

Calculation
$R_1$: total rank of the observations drawn from population 1
$R_2$: total rank of the observations drawn from population 2

Test statistic:
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = \text{no. of } (X_i, Y_j) \text{ pairs with } X_i < Y_j$$
($R_1$ = rank sum of all X observations)

Under $H_0$,
$$\mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$
and $\frac{U - \mu_U}{\sigma_U}$ has (approximately) a N(0,1) distribution.
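The identity between the rank-sum form of U and the pair count can be checked on a tiny example. The sketch below is illustrative only: the helper names (`midrank`, `u_from_ranks`, `u_by_counting`) and the demo numbers are made up for this purpose, and tied pairs are assumed to count one half, which is what mid-ranks imply.

```python
def midrank(v, pooled):
    """Mid-rank of v in the pooled sample: ties share the average rank."""
    less = sum(1 for w in pooled if w < v)
    equal = sum(1 for w in pooled if w == v)
    return less + (equal + 1) / 2


def u_from_ranks(x, y):
    """U = n1*n2 + n1*(n1+1)/2 - R1, with R1 the rank sum of the x sample."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    r1 = sum(midrank(v, pooled) for v in x)
    return n1 * n2 + n1 * (n1 + 1) / 2 - r1


def u_by_counting(x, y):
    """Number of (x_i, y_j) pairs with x_i < y_j; a tied pair counts 1/2."""
    return sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)


# Made-up demo data (not from the slides)
x_demo = [3.1, 4.7, 5.0]
y_demo = [2.0, 4.9, 6.2, 7.5]

print(u_from_ranks(x_demo, y_demo), u_by_counting(x_demo, y_demo))
```

Both functions return the same value for any pair of samples, which is why U can be read either as a shifted rank sum or as a count of "wins" of one sample over the other.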

Problem 2
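$H_0: m_{old} = m_{new}$ vs $H_1: m_{old} > m_{new}$

Test statistic:
$$U = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 15 \times 11 + \frac{11 \times 12}{2} - R_2 = 231 - R_2$$

At $\alpha = 0.1$, the C.R. is $\frac{U - \mu_U}{\sigma_U} > 1.28$, where
$$\mu_U = \frac{n_1 n_2}{2} = \frac{165}{2} = 82.5, \qquad \sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = 19.27$$

$R_2 = 115.5$, so the observed value of the T.S. is
$$\frac{(231 - 115.5) - 82.5}{19.27} = 1.71$$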

Since 1.71 > 1.28, reject $H_0$ at the 10% level and conclude that the change has reduced the average output level.
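The arithmetic of this solution can be reproduced in a few lines. This is a sketch, not the lecturer's own computation: the split of the flattened data table into 15 old-machine and 11 new-machine values is an assumption (it does reproduce the rank sum of 115.5 for the new machines), and `midrank` is a hypothetical helper.

```python
import math

# Assumed reading of the data table: 15 old-machine months, 11 new-machine months
old = [992, 945, 938, 1027, 892, 983, 1014, 1258,
       966, 889, 972, 940, 873, 1016, 897]
new = [965, 1054, 912, 850, 796, 911, 877, 902, 956, 900, 938]
pooled = old + new


def midrank(v, pooled):
    """Mid-rank of v in the pooled sample: ties share the average rank."""
    less = sum(1 for w in pooled if w < v)
    equal = sum(1 for w in pooled if w == v)
    return less + (equal + 1) / 2


n1, n2 = len(old), len(new)
r2 = sum(midrank(v, pooled) for v in new)        # rank sum of the new machines
u = n1 * n2 + n2 * (n2 + 1) / 2 - r2             # 231 - R2
mu_u = n1 * n2 / 2
sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mu_u) / sigma_u
reject = z > 1.28                                # right-tailed test at alpha = 0.1

print(f"R2={r2}, U={u}, z={z:.2f}, reject H0: {reject}")
```

The two months with output 938 (one old, one new) are tied, so each receives the mid-rank 12.5; this is where the half in 115.5 comes from.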

Understanding Equivalent calculations

$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1, \qquad U' = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$

Under $H_0$, $\mu_U = \frac{n_1 n_2}{2}$ and $\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$, and $\frac{U - \mu_U}{\sigma_U}$ has (approximately) a N(0,1) distribution.

$$R_1 + R_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2}, \qquad \text{under } H_0: \; E(R_1) = \frac{n_1 (n_1 + n_2 + 1)}{2}$$

$$U + U' = n_1 n_2, \qquad U - \frac{n_1 n_2}{2} = \frac{n_1 n_2}{2} - U'$$

The p-value, and hence the conclusion, is the same irrespective of which form of the test statistic is used, i.e. $R_1$, $R_2$, $U$ or $U'$.

But one needs to be careful about
(a) whether the test is left- or right-tailed, depending on the form;
(b) the mean to be subtracted (the S.E. is the same).
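A quick numerical check of this equivalence, using the Problem 2 figures (n1 = 15, n2 = 11, R2 = 115.5). This is a sketch: the variable names, including `u_prime` for the R2-based form of the statistic, are illustrative choices rather than notation fixed by the slides.

```python
import math

n1, n2 = 15, 11
r2 = 115.5                                   # rank sum of sample 2 (from Problem 2)
r1 = (n1 + n2) * (n1 + n2 + 1) / 2 - r2      # the rank sums add up to N(N+1)/2

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1         # form based on R1
u_prime = n1 * n2 + n2 * (n2 + 1) / 2 - r2   # form based on R2

se = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # same S.E. for all four forms

# Each statistic is standardized with its own null mean
z_r1 = (r1 - n1 * (n1 + n2 + 1) / 2) / se
z_r2 = (r2 - n2 * (n1 + n2 + 1) / 2) / se
z_u = (u - n1 * n2 / 2) / se
z_u_prime = (u_prime - n1 * n2 / 2) / se

for name, z in [("R1", z_r1), ("R2", z_r2), ("U", z_u), ("U'", z_u_prime)]:
    print(f"{name:>2}: z = {z:+.2f}")
```

All four standardized values have the same magnitude; only the sign changes, which is why the tail of the rejection region has to match the form of the statistic.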

Problem 14-34 in Aczel-Sounderpandian


Test whether the (average) current ratio is the same for the 3 industries.

Industry   Current ratios                                     Mean    SD
A          1.38  1.55  1.90  2.00  1.22  2.11  1.98  1.61     1.719   0.324
B          2.33  2.50  2.79  3.01  1.99  2.45                 2.512   0.356
C          1.06  1.37  1.09  1.65  1.44  1.11                 1.287   0.238

Kruskal-Wallis Test
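For comparing the means of more than 2 populations; an alternative to ANOVA.
Use it if the data are ordinal or if the assumptions of ANOVA are violated.
Pool all observations and rank them.
Compute the total of the ranks of the observations coming from the 1st, 2nd, 3rd, ... populations.
The null distribution is (approximately) chi-square with k-1 d.f.
The T.S. is
$$\frac{12}{n(n+1)} \sum_i \frac{R_i^2}{n_i} - 3(n+1)$$
where $R_i$ is the rank total of sample $i$, $n_i$ is its size, and $n$ is the total number of observations.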

Solving 14-34 using Kruskal-Wallis


Industry   Current ratio   Rank
A          1.38            6
A          1.55            8
A          1.90            11
A          2.00            14
A          1.22            4
A          2.11            15
A          1.98            12
A          1.61            9
B          2.33            16
B          2.50            18
B          2.79            19
B          3.01            20
B          1.99            13
B          2.45            17
C          1.06            1
C          1.37            5
C          1.09            2
C          1.65            10
C          1.44            7
C          1.11            3

Industry   Rank sum   Sample size   R^2/n
A          79         8             780.125
B          103        6             1768.167
C          28         6             130.6667
Total      210        20            2678.958

T.S. is
$$\frac{12 \times 2678.96}{20 \times 21} - 3 \times 21 = 13.54$$
p-value $= P(\chi^2_{2\,\text{df}} > 13.54) = 0.001$
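The Kruskal-Wallis calculation for problem 14-34 can be reproduced as follows. This is a sketch under stated assumptions: `midrank` is a hypothetical helper, no tie correction is applied (the data have no ties), and the p-value uses the fact that the upper-tail probability of a chi-square with 2 d.f. is exp(-x/2), so it only works for three groups.

```python
import math

# Current ratios by industry, as listed in the ranking table
groups = {
    "A": [1.38, 1.55, 1.9, 2, 1.22, 2.11, 1.98, 1.61],
    "B": [2.33, 2.5, 2.79, 3.01, 1.99, 2.45],
    "C": [1.06, 1.37, 1.09, 1.65, 1.44, 1.11],
}
pooled = [v for values in groups.values() for v in values]
n = len(pooled)


def midrank(v, pooled):
    """Mid-rank of v in the pooled sample: ties share the average rank."""
    less = sum(1 for w in pooled if w < v)
    equal = sum(1 for w in pooled if w == v)
    return less + (equal + 1) / 2


# Rank sum R_i for each industry
rank_sums = {g: sum(midrank(v, pooled) for v in values)
             for g, values in groups.items()}

# H = 12 / (n(n+1)) * sum(R_i^2 / n_i) - 3(n+1)
h = (12 / (n * (n + 1))
     * sum(rank_sums[g] ** 2 / len(groups[g]) for g in groups)
     - 3 * (n + 1))

# Chi-square with k - 1 = 2 d.f.: P(X > x) = exp(-x / 2)
p_value = math.exp(-h / 2)

print(rank_sums, round(h, 2), round(p_value, 3))
```

For more than three groups the closed-form tail probability no longer applies, and a chi-square table or a statistics library would be needed for the p-value.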

Problem 3

A sequence of small glass sculptures was inspected for shipping damage. The sequence of acceptable and damaged pieces was as follows: D,A,A,A,D,D,D,D,D,A,A,D,D,A,A,A,A,D,A,A,D,D,D,D,D

Test for the randomness of the damage to the shipment using the 0.05 significance level.

Run test: A test for randomness


Which of the following sequences appear to be random?
HTHTHTHTHTHTHTHTHTHTHTHTHTHTHTHT
HHHHHHHHHHHHHHTTTTTTTTTTTTTTTTTT
HHHHHHHHHHHTTTTTTTTTTTHHHHHHHHH

NONE! How can this be determined objectively, i.e. statistically?
Calculate the number of runs.
A run is a sequence of identical symbols/events.
Too many (or too few) runs indicate a lack of randomness.
HTHT HTHT HTHT HTHT HTHT HTHT HTHT HT
HHHHHHHHHHHHHHTTTTTTTTTTTTTTTTTT
HHHHHHHHHHHTTTTTTTTTTTHHHHHHHHH
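Counting runs is mechanical, so it is easy to sketch in code. The function name `count_runs` is an illustrative choice; it relies on `itertools.groupby`, which groups consecutive equal items.

```python
from itertools import groupby


def count_runs(sequence):
    """Number of runs, i.e. maximal blocks of identical consecutive symbols."""
    return sum(1 for _ in groupby(sequence))


alternating = "HTHTHTHTHTHTHTHTHTHTHTHTHTHTHTHT"
two_blocks = "HHHHHHHHHHHHHHTTTTTTTTTTTTTTTTTT"
three_blocks = "HHHHHHHHHHHTTTTTTTTTTTHHHHHHHHH"

for s in (alternating, two_blocks, three_blocks):
    print(f"{count_runs(s):>2} runs in a sequence of length {len(s)}")
```

The strictly alternating sequence has as many runs as symbols (too many), while the other two have only 2 and 3 runs (too few); both extremes point away from randomness.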

Run test (cont.)


How to determine too many or too few? The acceptable number of runs depends on $n_1$, $n_2$.

If $H_0$ (the sequence is 'randomly' mixed) is true, $\frac{r - \mu_r}{\sigma_r}$ has approximately a N(0,1) distribution, provided either $n_1$ or $n_2$ is moderately large ($\geq 10$), where $r$ is the observed number of runs. Small-sample distributions are available (not in text).

$$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$

Solution to Problem 3
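$n_A = 11$, $n_D = 14$,
$$\mu_r = \frac{2 \times 11 \times 14}{11 + 14} + 1 = 13.32, \qquad \sigma_r = \sqrt{\frac{2 \times 11 \times 14 \,(2 \times 11 \times 14 - 25)}{25^2 \times 24}} = 2.41$$
At $\alpha = 0.05$, the C.R. is $|Z| > 1.96$.
The observed $r = 9$, and the value of the T.S. is $\frac{9 - 13.32}{2.41} = -1.79$.
Since $|-1.79| < 1.96$, we do not reject $H_0$ at the 5% level: the data are consistent with the damage occurring randomly.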
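The whole run test for Problem 3 fits in a short script. This is a sketch using the large-sample normal approximation from the slides; the variable names are illustrative.

```python
import math
from itertools import groupby

# Inspection sequence from Problem 3 (A = acceptable, D = damaged)
sequence = "D,A,A,A,D,D,D,D,D,A,A,D,D,A,A,A,A,D,A,A,D,D,D,D,D".split(",")

n_a = sequence.count("A")
n_d = sequence.count("D")
n = n_a + n_d
r = sum(1 for _ in groupby(sequence))    # observed number of runs

# Mean and standard deviation of the number of runs under H0
mu_r = 2 * n_a * n_d / n + 1
sigma_r = math.sqrt(2 * n_a * n_d * (2 * n_a * n_d - n_a - n_d)
                    / (n ** 2 * (n - 1)))

z = (r - mu_r) / sigma_r
reject = abs(z) > 1.96                   # two-sided test at alpha = 0.05

print(f"nA={n_a}, nD={n_d}, r={r}, mu={mu_r:.2f}, "
      f"sigma={sigma_r:.2f}, z={z:.2f}, reject H0: {reject}")
```

The statistic is negative because fewer runs were observed than expected, but not by enough to fall in the two-sided rejection region.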

Data summarization and presentation
Probability
Expected value in decision making; decision trees
Random variable and its distribution
Discrete: General, Binomial, Poisson
Continuous: General, Normal, Exponential
Sampling; sampling distribution of $\bar{X}$, $p$, $S^2$
t, Chi-square, F
Confidence interval / testing hypothesis: 1 or 2 sample
ANOVA
Test for independence/homogeneity
Goodness of fit
NP (nonparametric methods)
