
ASSOCIATION OF ATTRIBUTES
Chi-Square
MUHAMMAD USMAN
ROLL 553-07-09

Data Types
Data
    Quantitative
        Continuous
        Discrete
    Qualitative

Quantitative Data

Quantitative data usually consist of measurable characteristics called variables. For example:

The annual income of a family
The weight or height of a student
The age of a child
The price of a commodity

Qualitative Data

Qualitative data cannot be measured numerically, but the observations can be divided into classes and the number in each class can be counted. Such data consist of non-measurable characteristics. A characteristic that cannot be measured numerically (only its presence or absence can be described) is called an attribute. Qualitative data are recorded on a nominal or ordinal scale. For example:

Division by Gender (Male, Female)
Marital Status (Single, Married, Divorced, or Widowed)
Employment Status (Employed, Unemployed)

Measurement Scales
The four scales of measurement are

Nominal Scale

Ordinal or Ranking Scale

Interval Scale

Ratio Scale

Nominal Scale

It is the classification of observations into mutually exclusive qualitative categories. For example:

Students are classified as MALE or FEMALE. The numbers 1 and 2 may also be used to identify these two categories.
Rainfall may be classified as HEAVY, MODERATE or LIGHT. The numbers 1, 2 and 3 might be used to denote the three classes.

NOTE: There is no particular order for the groupings or classifications here.

Ordinal or Ranking Scale

It includes the characteristics of a nominal scale and, in addition, has the property of ordering or ranking. For example:

The performance of students is rated as EXCELLENT, GOOD, FAIR or POOR, etc. Here the numbers 1, 2, 3 and 4 are used to indicate ranks.

NOTE: The only relation that holds between any pair of categories is that of "greater than" (or "more preferred").

Interval Scale

A measurement scale possessing a constant interval size (distance) but not a true zero point is called an interval scale. For example:

Temperature measured in degrees Celsius is an outstanding example of an interval scale, because the same difference exists between 20°C and 40°C as between 5°C and 25°C. It cannot be said that a temperature of 40°C is twice as hot as a temperature of 20°C.
The ratio 40/20 has no meaning.

NOTE: The arithmetic operations of addition, subtraction, etc. are meaningful.
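The point about differences and ratios can be checked numerically. The sketch below is only an illustration: the helper names are hypothetical, and the Fahrenheit and kelvin conversions are brought in as outside facts, not taken from the slides. It shows that the 20-degree differences agree, while the "ratio" of 40°C to 20°C changes as soon as the unit changes.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit (hypothetical helper)."""
    return c * 9 / 5 + 32


def celsius_to_kelvin(c):
    """Convert a Celsius reading to kelvin, a scale with a true zero."""
    return c + 273.15


# Equal differences are meaningful on an interval scale.
diff_a = 40 - 20  # degrees Celsius
diff_b = 25 - 5   # degrees Celsius

# The ratio of two readings depends on the unit, so it carries no meaning.
ratio_celsius = 40 / 20
ratio_fahrenheit = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)

print(diff_a == diff_b)
print(ratio_celsius, round(ratio_fahrenheit, 3), round(ratio_kelvin, 3))
```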

Ratio Scale

It is a special kind of interval scale in which the scale of measurement has a true zero point as its origin.
The ratio scale is used to measure weight, volume, length, distance, money, etc., for which the zero point is meaningful.

NOTE: The zero point is meaningful for a ratio scale but not for an interval scale.

Hypothesis Tests
Qualitative Data
    Proportion
        1 pop.: Z Test
        2 pop.: Z Test
        More than 2 pop.: $\chi^2$ Test
    Independence: $\chi^2$ Test

Hypothesis for One Proportion

ASSUMPTION

Random sample selected from a binomial population
Normal approximation can be used if $n p_0 \ge 15$ and $n q_0 \ge 15$, where $q_0 = 1 - p_0$

$H_0$: $p \le p_0$ or $p = p_0$ or $p \ge p_0$
$H_1$: $p > p_0$ or $p \ne p_0$ or $p < p_0$

Z-test statistic

$$Z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$$

where

$$\hat{p} = \frac{x}{n} = \frac{\text{number of successes}}{\text{sample size}}$$
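The one-proportion Z test can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the function name `one_proportion_z` is hypothetical, and the data (60 successes in 100 trials, tested against p0 = 0.5) are invented for illustration and do not come from the slides.

```python
import math


def one_proportion_z(x, n, p0):
    """Return (p_hat, z) for the one-proportion Z test of H0: p = p0."""
    q0 = 1 - p0
    # Sample-size condition for the normal approximation.
    if n * p0 < 15 or n * q0 < 15:
        raise ValueError("normal approximation needs n*p0 >= 15 and n*q0 >= 15")
    p_hat = x / n  # number of successes / sample size
    z = (p_hat - p0) / math.sqrt(p0 * q0 / n)
    return p_hat, z


# Hypothetical data, not from the slides.
p_hat, z = one_proportion_z(x=60, n=100, p0=0.5)
print(p_hat, round(z, 3))
```

For a two-sided alternative the statistic would be compared with the standard normal critical values for the chosen significance level.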

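Hypothesis for Two Proportions

                 Research Questions
Hypothesis       No Difference          Pop 1 ≥ Pop 2          Pop 1 ≤ Pop 2
                 Any Difference         Pop 1 < Pop 2          Pop 1 > Pop 2
$H_0$            $p_1 - p_2 = 0$        $p_1 - p_2 \ge 0$      $p_1 - p_2 \le 0$
$H_a$            $p_1 - p_2 \ne 0$      $p_1 - p_2 < 0$        $p_1 - p_2 > 0$

Z-Test Statistic for Two Proportions

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}\,\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

and $\hat{q} = 1 - \hat{p}$.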
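The pooled two-proportion statistic can be sketched as follows. The function name `two_proportion_z` is hypothetical, the hypothesized difference p1 - p2 is taken to be 0, and the counts (60 of 100 versus 45 of 100) are made up for illustration only.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    q_pool = 1 - p_pool
    std_error = math.sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_error


# Hypothetical data, not from the slides.
z = two_proportion_z(x1=60, n1=100, x2=45, n2=100)
print(round(z, 3))
```

Swapping the two samples flips the sign of Z, which is why the direction of the alternative hypothesis has to be fixed before looking at the statistic.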

Chi Square Test Basic Idea
1. Compares the observed counts to the counts expected if the null hypothesis is true
2. The closer the observed counts are to the expected counts, the more consistent the data are with $H_0$

Chi Square Test for k proportions

1. Hypotheses
$H_0$: $p_1 = p_{1,0},\ p_2 = p_{2,0},\ \ldots,\ p_k = p_{k,0}$
$H_a$: At least one $p_i$ is different from above

2. Test Statistic

$$\chi^2 = \sum_{\text{all cells}} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

where $n_i$ is the observed (actual) count and $E(n_i) = n\,p_{i,0}$ is the expected count, with $p_{i,0}$ the hypothesized probability

3. Degrees of Freedom: $k - 1$, where $k$ is the number of outcomes
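The three steps for the k-proportion test translate directly into code. The sketch below assumes a hypothetical helper name (`chi_square_k_proportions`) and invented counts (30, 50, 20 against hypothesized probabilities .25, .50, .25); none of these numbers appear in the slides.

```python
def chi_square_k_proportions(observed, hypothesized):
    """Return (statistic, df, expected counts) for H0: p_i = p_i0 for all i."""
    if len(observed) != len(hypothesized):
        raise ValueError("need one hypothesized probability per outcome")
    if abs(sum(hypothesized) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")
    n = sum(observed)
    expected = [n * p for p in hypothesized]  # E(n_i) = n * p_i0
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1
    return stat, df, expected


# Hypothetical data, not from the slides.
stat, df, expected = chi_square_k_proportions([30, 50, 20], [0.25, 0.50, 0.25])
print(stat, df, expected)
```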

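Finding Critical Value

What is the critical $\chi^2$ value if $k = 3$ and $\alpha = .05$?

df = k - 1 = 2

        Upper Tail Area
DF      .995     ...     .95      ...     .05
1       ...      ...     0.004    ...     3.841
2       0.010    ...     0.103    ...     5.991

The critical value is $\chi^2 = 5.991$: reject $H_0$ if the test statistic falls in the upper tail beyond 5.991 (area $\alpha = .05$). If $n_i = E(n_i)$ in every cell, $\chi^2 = 0$.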
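The table values for 1 and 2 degrees of freedom can be reproduced with the Python standard library alone. This is a sketch under two stated assumptions that are standard facts but not part of the slides: a chi-square variable with 1 df is the square of a standard normal variable, and with 2 df its upper-tail area is exp(-x/2). The function name `chi2_critical` is hypothetical, and other df values are deliberately left unsupported.

```python
import math
from statistics import NormalDist


def chi2_critical(alpha, df):
    """Upper-tail chi-square critical value for df = 1 or df = 2 only."""
    if df == 1:
        # Chi-square(1) is Z squared, so use the two-sided normal quantile.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * z
    if df == 2:
        # For 2 df the upper-tail area is exp(-x / 2); solve for x.
        return -2 * math.log(alpha)
    raise NotImplementedError("this sketch covers df = 1 and df = 2 only")


print(round(chi2_critical(0.05, 2), 3))  # k = 3 outcomes, so df = 2
print(round(chi2_critical(0.05, 1), 3))
```

For larger df a statistical library or a printed table would be needed; the closed forms above do not generalize.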

$\chi^2$ Test of Independence Example
As a realtor, you want to determine whether house style and house location are related. At the .05 level of significance, is there evidence of a relationship?

               House Location
House Style    Urban   Rural   Total
Split-Level      63      49     112
Ranch            15      33      48
Total            78      82     160


Chi Square Test of Independence: Contingency Table

Shows the number of observations from 1 sample classified jointly on 2 qualitative variables. In the house table, the rows (House Style) are the levels of variable 1 and the columns (House Location) are the levels of variable 2.

Expected Count Example

Marginal probability (Split-Level) = $\frac{112}{160}$
Marginal probability (Urban) = $\frac{78}{160}$
Joint probability (if style and location are independent) = $\frac{112}{160} \cdot \frac{78}{160}$
Expected count (Split-Level, Urban) = $160 \cdot \frac{112}{160} \cdot \frac{78}{160} = 54.6$

Expected Count Calculation

$$E_{ij} = \frac{R_i\,C_j}{n}$$

where $R_i$ is the total of row $i$, $C_j$ is the total of column $j$, and $n$ is the sample size.

                          House Location
                 Urban                          Rural
House Style      Obs.   Exp.                    Obs.   Exp.                    Total
Split-Level       63    112·78/160 = 54.6        49    112·82/160 = 57.4        112
Ranch             15    48·78/160 = 23.4         33    48·82/160 = 24.6          48
Total             78    78                       82    82                       160

$E_{ij} \ge 5$ in all cells
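The expected counts for the house table follow mechanically from the row and column totals. A minimal sketch, with a hypothetical function name (`expected_counts`) and the observed counts copied from the slides:

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Split-Level, Ranch. Columns: Urban, Rural.
observed = [[63, 49], [15, 33]]
expected = expected_counts(observed)

for row in expected:
    print([round(e, 1) for e in row])

# Sample-size condition from the slide: every expected count at least 5.
print(all(e >= 5 for row in expected for e in row))
```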

$\chi^2$ Test of Independence Solution

                 House Location
                 Urban            Rural
House Style      Obs.   Exp.      Obs.   Exp.     Total
Split-Level       63    54.6       49    57.4      112
Ranch             15    23.4       33    24.6       48
Total             78    78         82    82        160
$$\chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E_{ij}]^2}{E_{ij}} = \frac{[n_{11} - E_{11}]^2}{E_{11}} + \frac{[n_{12} - E_{12}]^2}{E_{12}} + \cdots + \frac{[n_{22} - E_{22}]^2}{E_{22}}$$

$$= \frac{[63 - 54.6]^2}{54.6} + \frac{[49 - 57.4]^2}{57.4} + \cdots + \frac{[33 - 24.6]^2}{24.6} = 8.41$$
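The sum over all four cells, including the Ranch/Urban term hidden behind the ellipsis, can be reproduced with a short sketch. The function name `chi_square_independence` is hypothetical; the counts are those of the house table.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: Split-Level, Ranch. Columns: Urban, Rural.
house_table = [[63, 49], [15, 33]]
stat, df = chi_square_independence(house_table)
print(round(stat, 2), df)
```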

$\chi^2$ Test of Independence Solution

$H_0$: No Relationship
$H_a$: Relationship
$\alpha$ = .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s): 3.841 (reject $H_0$ if $\chi^2 > 3.841$)
Test Statistic: $\chi^2 = 8.41$
p-value = ?
Decision: Reject at $\alpha$ = .05
Conclusion: There is evidence of a relationship
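The decision rule, and the p-value the slide leaves open, can be sketched for the 1-df case with the standard library. The sketch relies on an identity that is not in the slides: for 1 degree of freedom the upper-tail chi-square area equals `erfc(sqrt(x / 2))`. The function name `chi2_p_value_df1` is hypothetical, and the identity does not hold for other df.

```python
import math


def chi2_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df."""
    # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) when df = 1.
    return math.erfc(math.sqrt(stat / 2))


ALPHA = 0.05
CRITICAL_VALUE = 3.841  # from the chi-square table, df = 1
stat = 8.41             # test statistic from the house example

p_value = chi2_p_value_df1(stat)
reject_h0 = stat > CRITICAL_VALUE

print(reject_h0, p_value < ALPHA)
```

Comparing the statistic with the critical value and comparing the p-value with the significance level are two views of the same decision, so the two printed flags always agree.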

Yates Correction for Continuity

In applying the chi-square approximation, we are required to combine the smaller expected counts (< 5) with larger ones.
But when there are only 2 classes, we cannot pool the smaller frequency into the larger one.
Frank Yates showed in 1934 that the chi-square approximation is markedly improved if we use the following formula:

$$\chi^2 = \sum_i \frac{(|o_i - e_i| - 0.5)^2}{e_i}$$

where $o_i$ is an observed frequency and $e_i$ is the corresponding expected frequency.
It should be used only when d.f. = 1 and only one $e_i$ is small.
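The arithmetic of the continuity correction can be sketched as below. The function name `yates_chi_square` is hypothetical. The house-table counts are reused purely to show the calculation; none of their expected counts is small, so the correction is not actually called for in that example.

```python
def yates_chi_square(observed, expected):
    """Chi-square with Yates' correction: sum of (|o - e| - 0.5)^2 / e."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Cell order: Split-Level/Urban, Split-Level/Rural, Ranch/Urban, Ranch/Rural.
observed = [63, 49, 15, 33]
expected = [54.6, 57.4, 23.4, 24.6]

corrected = yates_chi_square(observed, expected)
uncorrected = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(corrected, 2), round(uncorrected, 2))
```

The corrected statistic is smaller than the uncorrected one whenever every deviation exceeds 0.5 in absolute value, which makes the corrected test the more conservative of the two.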

Chi-Square Table

Coefficient of Contingency

The chi-square statistic does not tell us anything about the strength of the association.
For this purpose, Karl Pearson (1857-1936) defined a coefficient C, known as Pearson's coefficient of mean square contingency:

$$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$

where n indicates the sample size.
This coefficient measures the strength of the association or dependence between the two variables of classification of the contingency table.
C = 0 (Complete Independence)
$C = \sqrt{(k-1)/k}$ (Perfect Association), where k is the smaller of r and c (the numbers of rows and columns)
C lies between zero and $\sqrt{(k-1)/k}$
The larger the value of C, the stronger the association.
C suffers from the disadvantage that it never reaches a maximum of 1 (and, being non-negative, never a minimum of -1).
It should, therefore, not be used to compare associations among tables with different numbers of categories.
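Applied to the house example, the coefficient can be computed as in the sketch below. The function names are hypothetical; the inputs (chi-square of 8.41, n = 160, a 2 x 2 table) come from the earlier slides.

```python
import math


def contingency_coefficient(chi2, n):
    """Pearson's coefficient of mean square contingency."""
    return math.sqrt(chi2 / (chi2 + n))


def max_contingency_coefficient(r, c):
    """Upper limit of C for an r x c table, with k the smaller of r and c."""
    k = min(r, c)
    return math.sqrt((k - 1) / k)


c_value = contingency_coefficient(8.41, 160)
c_max = max_contingency_coefficient(2, 2)
print(round(c_value, 3), round(c_max, 3))
```

Because the upper limit depends on the table size, a C value is best read relative to that limit for its own table, not compared across tables of different shapes.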

Phi-Coefficient

The phi coefficient is defined as

$$\phi = \sqrt{\frac{\chi^2}{N}}$$

where $\chi^2$ is Pearson's chi-square statistic and N is the grand total of the observations.
Phi varies from -1 to 1; the square root gives its magnitude, and the sign is taken from the direction of the association in the table
0 indicates no association
1 corresponds to complete association
-1 corresponds to complete inverse association
This coefficient can be calculated only for frequency data represented in 2 x 2 tables
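For a 2 x 2 table the phi coefficient can be obtained either from the chi-square statistic or directly from the four cell counts. The sketch below shows both; the function names are hypothetical, and the signed cell-count formula (ad - bc divided by the square root of the product of the four marginal totals) is a standard identity that is not stated in the slides.

```python
import math


def phi_from_chi2(chi2, n):
    """Magnitude of phi from the chi-square statistic and grand total."""
    return math.sqrt(chi2 / n)


def phi_signed(a, b, c, d):
    """Signed phi for the 2 x 2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# House table: [[63, 49], [15, 33]], with chi-square about 8.41 and N = 160.
phi_signed_value = phi_signed(63, 49, 15, 33)
phi_magnitude = phi_from_chi2(8.41, 160)
print(round(phi_signed_value, 3), round(phi_magnitude, 3))
```

Swapping the two columns of the table reverses the sign of the cell-count version while leaving the chi-square version unchanged.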

Cramér's Coefficient of Contingency

Cramér's coefficient of contingency is defined as

$$Q = \sqrt{\frac{\chi^2}{n(k-1)}}$$

where n is the total sample size and k is the smaller of r and c
If Q = 0, the variables are completely independent
If Q = 1, there is a perfect relationship
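A minimal sketch of the calculation, using the house example (chi-square of 8.41, n = 160, a 2 x 2 table). The function name `cramers_coefficient` is hypothetical, and the 3 x 3 case at the end uses made-up totals chosen only to show the upper limit.

```python
import math


def cramers_coefficient(chi2, n, r, c):
    """Cramér's coefficient for an r x c table with sample size n."""
    k = min(r, c)  # the smaller of the numbers of rows and columns
    return math.sqrt(chi2 / (n * (k - 1)))


q_house = cramers_coefficient(8.41, 160, 2, 2)
print(round(q_house, 3))

# Hypothetical perfect association in a 3 x 3 table: chi2 = n * (k - 1).
q_perfect = cramers_coefficient(2 * 90, 90, 3, 3)
print(q_perfect)
```

For a 2 x 2 table k - 1 = 1, so the coefficient coincides with the magnitude of the phi coefficient.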



Thanks
