Chi-square Test and Correlation

Statistics 101 - Villejo

Objectives
Know the assumptions of the chi-square test for
independence.
Perform a chi-square test for independence.
Learn the concept of correlation.
Compute and interpret the coefficient of
correlation.
Perform the test of significance on the coefficient of
correlation.

Chi-square Distribution
a. Like the t-distribution, it has a single parameter called
the degrees of freedom.
b. The distribution is skewed to the right. As the degrees of
freedom increase, the distribution becomes more
symmetric.
c. Its mean is equal to its degrees of freedom. Its variance
is twice its degrees of freedom.
d. Notation: If X is a random variable that follows a chi-square
distribution with v degrees of freedom, we write
$X \sim \chi^2(v)$.
e. If $X \sim \chi^2(v)$, then $\chi^2_\alpha(v)$ satisfies the condition that
$P(X > \chi^2_\alpha(v)) = \alpha$
or
$P(X \le \chi^2_\alpha(v)) = 1 - \alpha$

Example
Suppose $X \sim \chi^2(10)$. Determine the following:
a. P(X > 18.307)
b. P(X < 20.483)
c. $\chi^2_{0.01}(10)$
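The three items in this example can be checked numerically. The sketch below takes 10 degrees of freedom, as item c suggests, and uses only the Python standard library. It relies on a closed form for the upper tail that holds only for an even number of degrees of freedom, which is an assumption of this sketch and not a general method. The function names are hypothetical.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with an EVEN number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which holds only when df is even (an assumption of this sketch).
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    if x <= 0:
        return 1.0
    half = x / 2.0
    total = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * total


def chi2_critical_value(alpha, df, lo=0.0, hi=200.0, tol=1e-9):
    """Find c with P(X > c) = alpha by bisection (the upper tail decreases in c)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


p_a = chi2_upper_tail(18.307, 10)        # item a: P(X > 18.307)
p_b = 1.0 - chi2_upper_tail(20.483, 10)  # item b: P(X < 20.483)
c_c = chi2_critical_value(0.01, 10)      # item c: critical value for alpha = 0.01

print(round(p_a, 3), round(p_b, 3), round(c_c, 3))
```

The results should agree with a chi-square table: about 0.05 for item a, about 0.975 for item b, and a critical value near 23.209 for item c.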

The chi-square test for independence is used to
determine whether two categorical variables are
related.
Example:

Remarks
An r x c contingency table has r rows and c
columns
The marginal frequencies are the row and
column totals

Procedure in testing for independence


1. State the null and alternative hypotheses:
Ho: the two variables are independent
Ha: the two variables are not independent
2. Choose the level of significance $\alpha$.

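3. The test statistic is:
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
where the $O_i$ are the observed frequencies and
the $E_i$ are the expected frequencies given by:
$E_i = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$
Decision Rule: Reject Ho if $\chi^2 > \chi^2_\alpha(v)$, where
$v = (r - 1) \times (c - 1)$.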

4. Compute the value of the test statistic.
5. Make the decision (whether the two variables are
independent or not). Express this decision in terms
of the problem.
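The procedure above can be sketched in a few lines of Python. The function name and the 2 x 3 table are hypothetical and invented for illustration only; they are not data from the slides. The sketch assumes each expected frequency is (row total x column total) / grand total, and that the critical value is read from a chi-square table rather than computed.

```python
def chi_square_independence(observed):
    """Return (statistic, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected frequency of a cell: (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Test statistic: sum over all cells of (O - E)^2 / E.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, expected


# Hypothetical 2 x 3 table, invented for illustration only (not data from the slides).
table = [[30, 20, 10],
         [20, 30, 40]]
CRITICAL_0_05_DF2 = 5.991  # chi-square table value for alpha = 0.05, v = 2

stat, df, expected = chi_square_independence(table)
reject_h0 = stat > CRITICAL_0_05_DF2
print(f"chi-square = {stat:.3f}, df = {df}, reject Ho: {reject_h0}")
```

For this made-up table the statistic exceeds the tabled critical value, so Ho (independence) would be rejected at the 0.05 level.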

The chi-square test for independence is valid only if
1. At least 80% of the cells have expected
frequencies $\ge 5$.
2. No cell has expected frequency $\le 1$.
For a 2 x 2 contingency table, we use Yates'
correction for continuity:
$\chi^2 = \sum_i \frac{(|O_i - E_i| - 0.5)^2}{E_i}$
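As a rough sketch, the validity conditions and the continuity correction can be checked in code. It assumes the usual form of Yates' correction, in which 0.5 is subtracted from each |O - E| before squaring, and it reads the two conditions as "at least 80% of cells with expected frequency of 5 or more" and "no cell with expected frequency of 1 or less". The function names and the 2 x 2 table are hypothetical.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def validity_conditions_hold(expected):
    """Check both conditions: at least 80% of cells with E >= 5, and no cell with E <= 1."""
    cells = [e for row in expected for e in row]
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return share_at_least_5 >= 0.8 and all(e > 1 for e in cells)


def yates_chi_square(observed):
    """Continuity-corrected statistic for a 2 x 2 table: sum of (|O - E| - 0.5)^2 / E."""
    if len(observed) != 2 or any(len(row) != 2 for row in observed):
        raise ValueError("Yates' correction is applied to 2 x 2 tables only")
    expected = expected_frequencies(observed)
    return sum(
        (abs(o - e) - 0.5) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table, invented for illustration only.
table = [[10, 20],
         [30, 40]]
ok = validity_conditions_hold(expected_frequencies(table))
corrected = yates_chi_square(table)
print(ok, round(corrected, 4))
```

A 2 x 2 table has v = 1 degree of freedom, so the corrected statistic would be compared with the tabled value for one degree of freedom.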

Example. People are chosen at random from
the state of Illinois until 1000 people have been
classified as to whether they are of Protestant,
Catholic, or Jewish faith, and whether or not they
worship regularly. Test at the 0.05 level of significance whether
religion is independent of pattern of worship.

Definition
The coefficient of correlation is a measure of the
strength of the linear relationship existing between
two variables, X and Y, that is independent of their
respective scales of measurement.
This is denoted by the Greek letter $\rho$ (rho), where
$-1 \le \rho \le 1$.

$\rho = 1$: perfect positive linear relationship
$\rho = -1$: perfect negative linear relationship
$\rho = 0$: no linear relationship between X and Y

Remarks
The coefficient of correlation can establish the
strength of the linear relationship between two
quantitative variables, but it cannot establish
causality!
We cannot say X causes Y or Y causes X.

Pearson product-moment coefficient of correlation:
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$
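A minimal sketch of the computation, assuming the common raw-sums form of the sample coefficient r, with numerator n(sum of xy) - (sum of x)(sum of y) and denominator the square root of [n(sum of x^2) - (sum of x)^2][n(sum of y^2) - (sum of y)^2]. The function name and the paired data are hypothetical and invented for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation via the computational (raw-sums) formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator  # division by zero if either sample is constant


# Hypothetical paired observations, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

The result always lies between -1 and 1. Values near 1 or -1 indicate a strong linear relationship, and values near 0 a weak one.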

Testing the Correlation Coefficient
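A commonly used test of significance for a sample correlation r is sketched below. It assumes the hypotheses Ho: rho = 0 against Ha: rho is not 0, and the statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The function name and the values of r and n are hypothetical, and the critical value is read from a t table rather than computed.

```python
import math


def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need n > 2 paired observations")
    if abs(r) >= 1:
        raise ValueError("statistic is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical values, invented for illustration only.
r, n = 0.6, 27
T_CRITICAL = 2.060  # two-tailed t table value for alpha = 0.05, df = 25

t = correlation_t_statistic(r, n)
reject_h0 = abs(t) > T_CRITICAL
print(round(t, 3), reject_h0)
```

With these made-up values, |t| exceeds the tabled critical value, so Ho would be rejected at the 0.05 level. Rejecting Ho indicates a linear relationship, not causality.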
