
Count Data. Data obtained by counting, as contrasted to data obtained by performing measurements on continuous scales.

Count data are also referred to as enumeration data.

Statistic for tests concerning differences among proportions (CHI-SQUARE STATISTIC, $\chi^2$)

$$\chi^2 = \sum \frac{(o - e)^2}{e}$$

where: o = observed frequency, e = expected frequency
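As a minimal sketch of the formula above, the statistic is the sum of (o - e)^2 / e over all cells. The function name `chi_square` and the sample frequencies are illustrative assumptions and do not come from the text.

```python
def chi_square(observed, expected):
    """Return the sum of (o - e)^2 / e over paired frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up frequencies, for illustration only.
observed = [18, 22]
expected = [20, 20]
statistic = chi_square(observed, expected)
print(statistic)
```

Each cell contributes more to the statistic the further its observed count lies from its expected count, relative to the size of the expected count.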

Contingency Table. (Test for independence)

The $\chi^2$ statistic plays an important role in many other problems where information is obtained by counting rather than measuring. The method we shall describe here applies to two kinds of problems, which differ conceptually but are analyzed the same way.

In the first kind of problem we deal with trials permitting more than two possible outcomes. For instance, the weather can get better, remain the same or get worse; an undergraduate can be a freshman, a sophomore, a junior, or a senior; and a movie may be rated G, PG, R or X.

We could say that we are dealing with multinomial (rather than binomial) trials.

Also, in the illustration of the preceding section, each worker might have been asked whether unemployment is a more serious economic problem than inflation, whether inflation is a more serious economic problem than unemployment, or whether he or she is undecided, and this might have resulted in the following table.

                Luzon   Visayas   Mindanao
Unemployment       57        53         44
Undecided          72        40         48
Inflation          71        57         58
TOTAL             200       150        150

We refer to this kind of table as a 3 x 3 table (where 3 x 3 is read "3 by 3"), because it has 3 horizontal rows and 3 vertical columns; more generally, when there are r horizontal rows and c vertical columns, we refer to the table as an r x c table. Here, as in the table analyzed in the preceding section, the column totals, representing the sample sizes, are fixed. On the other hand, the row totals depend on the responses of the persons interviewed, and, hence, on chance.

To show how an r x c table is analyzed, let us begin by illustrating the calculation of an expected cell frequency.

The expected frequency for any cell of a contingency table may be obtained by multiplying the total of the row to which it belongs by the total of the column to which it belongs and then dividing by the grand total for the entire table. Degrees of freedom, df = (r-1) (c-1).
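The rule above can be sketched in Python using the observed frequencies from the regional survey table in this section. The helper names `expected_frequencies` and `degrees_of_freedom` are assumptions made for this illustration.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """df = (r - 1)(c - 1) for an r x c table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Rows: Unemployment, Undecided, Inflation; columns: Luzon, Visayas, Mindanao.
observed = [
    [57, 53, 44],
    [72, 40, 48],
    [71, 57, 58],
]
expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 1) for e in row])
print("df =", degrees_of_freedom(observed))
```

For example, the Unemployment/Luzon cell has row total 154 and column total 200 out of a grand total of 500, so its expected frequency is (154)(200)/500 = 61.6.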

Solution to Example in Chi-Square:

1. $H_0$: For each alternative (unemployment, undecided, and inflation), the probabilities are the same for the three parts of the country.

2. $H_A$: For at least one alternative, the probabilities are not the same for the three parts of the country.

3. Test Statistics:

$\chi^2 < \chi^2_c$ : NS : Accept $H_0$

$\chi^2 > \chi^2_c$ : S : Reject $H_0$

4. Rejection Region: @ 0.01 level of significance

df = (r - 1)(c - 1) = (3 - 1)(3 - 1) = 4

$\chi^2_c = 13.277$

5. Calculation of Test Statistics:

                Luzon   Visayas   Mindanao   Total
Unemployment       57        53         44     154
Undecided          72        40         48     160
Inflation          71        57         58     186
Total             200       150        150     500
  o       e     o - e   (o - e)^2   (o - e)^2 / e
 57    61.6    -4.60      21.16        0.3435
 72    64.0     8.00      64.00        1.0000
 71    74.4    -3.40      11.56        0.1554
 53    46.2     6.80      46.24        1.0009
 40    48.0    -8.00      64.00        1.3333
 57    55.8     1.20       1.44        0.0258
 44    46.2    -2.20       4.84        0.1048
 48    48.0     0.00       0.00        0.0000
 58    55.8     2.20       4.84        0.0867

$$\chi^2 = \sum \frac{(o - e)^2}{e} = 4.0504$$

6. Conclusion

Since $\chi^2 = 4.050$ does not exceed $\chi^2_c = 13.277$, the null hypothesis is accepted; the difference between the observed and expected frequencies may well be due to chance.

In the second kind of problem where the method of this section applies, the column totals as well as the row totals depend on chance. To give an example, suppose that a sociologist wants to determine whether there is a relationship between the intelligence of boys who have gone through a special job-training program and their subsequent performance in their jobs, and that a sample of 400 cases taken from very extensive files yielded the following results:

                 Poor   Fair   Good   Total
Below Average      67     64     25     156
Average            42     76     56     174
Above Average      10     23     37      70
Total             119    163    118     400

Solution

1. $H_0$: Intelligence and on-the-job performance are independent.

2. $H_A$: Intelligence and on-the-job performance are not independent.

3. Test Statistics:

$\chi^2 < \chi^2_c$ : NS : Accept $H_0$

$\chi^2 > \chi^2_c$ : S : Reject $H_0$

4. Rejection Region: @ 0.01 Level of Significance

df = (r - 1)(c - 1) = (3 - 1)(3 - 1) = 4

$\chi^2_c = 13.277$

5. Calculation of Test Statistics:

PERFORMANCE      Poor   Fair   Good   Total
Below Average      67     64     25     156
Average            42     76     56     174
Above Average      10     23     37      70
Total             119    163    118     400

  o       e     o - e   (o - e)^2   (o - e)^2 / e
 67    46.4     20.6     424.36        9.1457
 64    63.6      0.4       0.16        0.0025
 25    46.0    -21.0     441.00        9.5869
 42    51.8     -9.8      96.04        1.8540
 76    70.9      5.1      26.01        0.3668
 56    51.3      4.7      22.09        0.4306
 10    20.8    -10.8     116.64        5.6077
 23    28.5     -5.5      30.25        1.0614
 37    20.7     16.3     265.69       12.8353

$$\chi^2 = \sum \frac{(o - e)^2}{e} = 40.8909$$

6. Conclusion:

Since $\chi^2 = 40.89$ exceeds $\chi^2_c = 13.277$, the null hypothesis is rejected: we conclude that there is a relationship between IQ and on-the-job performance.
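The six steps of this example can be sketched end to end as follows. This is an illustration under stated assumptions rather than the method of the text: the function name `chi_square_independence` is hypothetical, the observed frequencies are the ones whose rows and columns add up to the stated totals, and the critical value 13.277 is taken from the text instead of being computed. The sketch keeps unrounded expected frequencies, so its statistic comes out near 41.01 instead of the 40.89 obtained with expected values rounded to one decimal; the decision is the same.

```python
def chi_square_independence(table):
    """Return (statistic, df) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # unrounded expected frequency
            statistic += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: Below Average, Average, Above Average; columns: Poor, Fair, Good.
observed = [
    [67, 64, 25],
    [42, 76, 56],
    [10, 23, 37],
]
CRITICAL_VALUE = 13.277  # tabled value for df = 4 at the 0.01 level, from the text

statistic, df = chi_square_independence(observed)
decision = "Reject H0" if statistic > CRITICAL_VALUE else "Accept H0"
print(f"chi-square = {statistic:.2f}, df = {df}: {decision}")
```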

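Two-Fold Test (2 x 2 contingency table)

$$\chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)}, \qquad df = (r - 1)(c - 1)$$

Here, matching the worked problems, A and B are the frequencies in the first row of the table, C and D are those in the second row, and N = A + B + C + D is the grand total.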
Problem 1: Is there any significant relationship between passing the board exam and success in career?

                Fail   Pass   Total
Successful        20     40      60
Unsuccessful      25     15      40
Total             45     55     100
Solution to Two-Fold Examples:

1. $H_0$: There is no significant relationship between passing the board examination and success in career.

2. $H_A$: There is a significant relationship between passing the board examination and success in career.

3. Test Statistics:

$\chi^2 < \chi^2_c$ : NS : Accept $H_0$

$\chi^2 > \chi^2_c$ : S : Reject $H_0$

4. Rejection Region: @ 0.05 Level of Significance

df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1

$\chi^2_c = 3.841$

5. Calculation of Test Statistics

                Fail   Pass   Total
Successful        20     40      60
Unsuccessful      25     15      40
Total             45     55     100

$$\chi^2 = \frac{100\,[(20)(15) - (25)(40)]^2}{(60)(40)(45)(55)} = 8.2492$$

6. Conclusion:

Since the computed value of chi-square is greater than the critical value, the null hypothesis is rejected; there is a significant relationship between passing the board exam and success in career.
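A minimal sketch of the two-fold computation for Problem 1 follows. The function name `chi_square_2x2` and the cell labels a, b, c, d (first row a, b; second row c, d) are assumptions chosen to match the worked arithmetic; the critical value 3.841 is copied from the text.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Problem 1: rows Successful / Unsuccessful, columns Fail / Pass.
statistic = chi_square_2x2(20, 40, 25, 15)
CRITICAL_VALUE = 3.841  # df = 1 at the 0.05 level, from the text
significant = statistic > CRITICAL_VALUE
print(round(statistic, 4), significant)
```

Because the cross-product difference is squared, swapping the two rows or the two columns leaves the statistic unchanged.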

Problem 2: Is there any significant relationship between sex and effectiveness in management?

                        Sex
Effectiveness     Male   Female   Total
Effective           38       37      75
Not Effective       40       35      75
Total               78       72     150
Solution to Problem No. 2:

1. $H_0$: There is no significant relationship between sex and effectiveness in management.

2. $H_A$: There is a significant relationship between sex and effectiveness in management.

3. Test Statistics:

$\chi^2 < \chi^2_c$ : NS : Accept $H_0$

$\chi^2 > \chi^2_c$ : S : Reject $H_0$

4. Rejection Region: @ 0.05 Level of Significance

df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1

$\chi^2_c = 3.841$

5. Calculation of Test Statistics

                        SEX
Effectiveness     MALE   FEMALE   Total
Effective           38       37      75
Not Effective       40       35      75
Total               78       72     150

$$\chi^2 = \frac{150\,[(38)(35) - (37)(40)]^2}{(75)(75)(78)(72)} = 0.1068$$
6. Conclusion:

Since $\chi^2 = 0.1068$ is less than $\chi^2_c = 3.841$, the null hypothesis is accepted; there is no significant relationship between sex and effectiveness in management.
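As a cross-check, the two-fold shortcut should agree with the general sum of (o - e)^2 / e when both are applied to the same 2 x 2 table. The sketch below tries this on the Problem 2 data; both function names are hypothetical and no continuity correction is applied, in line with the calculations in these notes.

```python
def chi_square_general(table):
    """Sum of (o - e)^2 / e with e = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (o - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, o in enumerate(row)
    )


def chi_square_2x2(a, b, c, d):
    """Shortcut form for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Problem 2: rows Effective / Not Effective, columns Male / Female.
table = [[38, 37], [40, 35]]
general = chi_square_general(table)
shortcut = chi_square_2x2(38, 37, 40, 35)
print(round(general, 4), round(shortcut, 4))
```

The shortcut avoids computing the four expected frequencies, which is why it is convenient for hand calculation on 2 x 2 tables.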