
CHI-SQUARE

To:
Sir Shahid Mahmood
By:
Abuzar Tabassum
M.Sc Zoology 3rd semester
10040814-017
University Of Gujrat

Chi-Square Test
A fundamental problem in genetics is determining
whether experimentally determined data fit
the results expected from theory (i.e. Mendel's
laws as expressed in the Punnett square).
The chi-square test is a statistical method used to determine
GOODNESS OF FIT.
Goodness of fit refers to how close the
observed data are to those predicted from a
hypothesis.

Goodness of Fit
Mendel had no way of solving this problem. Shortly after the
rediscovery of his work in 1900, Karl Pearson and R.A. Fisher
developed the chi-square test for this purpose.
The chi-square test is a goodness of fit test: it answers the
question of how well experimental data fit expectations.
We start with a theory for how the offspring will be
distributed: the null hypothesis. We will discuss the
offspring of a self-pollination of a heterozygote. The null
hypothesis is that the offspring will appear in a ratio of 3/4
dominant to 1/4 recessive.

Formula
To calculate the chi-square statistic, the following formula is used:

$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

The $\chi$ is the Greek letter chi; the $\Sigma$ is a sigma; it means to sum
the following terms for all phenotypes.
obs is the number of individuals of the given phenotype observed;
exp is the number of that phenotype expected from the null
hypothesis.

How can you tell if an observed set of
offspring counts is legitimately the result of a
given underlying simple ratio?
For example, you do a cross and see 290
purple flowers and 110 white flowers in the
offspring. This is pretty close to a 3/4 : 1/4
ratio, but how do you formally define "pretty
close"? What about 250:150?

Example
As an example, you count F2 offspring, and get 290 purple and 110 white
flowers. This is a total of 400 (290 + 110) offspring.
We expect a 3/4 : 1/4 ratio. We need to calculate the expected numbers; this
is done by multiplying the total offspring by the expected proportions. Thus
we expect 400 * 3/4 = 300 purple, and 400 * 1/4 = 100 white.
Thus, for purple, obs = 290 and exp = 300. For white, obs = 110 and exp =
100.
Now it's just a matter of plugging into the formula:

$$
\begin{aligned}
\chi^2 &= \frac{(290 - 300)^2}{300} + \frac{(110 - 100)^2}{100} \\
&= \frac{(-10)^2}{300} + \frac{(10)^2}{100} \\
&= \frac{100}{300} + \frac{100}{100} \\
&= 0.333 + 1.000 \\
&= 1.333
\end{aligned}
$$
This is our chi-square value: now we need to see what it means and how to
use it.
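
The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the formula; the function name `chi_square` is our own choice and does not come from the slides.

```python
def chi_square(observed, expected):
    """Sum (obs - exp)^2 / exp over all phenotype classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [290, 110]                        # purple, white (from the example)
total = sum(observed)                        # 400 offspring
expected = [total * 3 / 4, total * 1 / 4]    # 300 purple, 100 white

value = chi_square(observed, expected)
print(round(value, 3))   # 1.333
```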

The Critical Question


Using the example here, how can you tell if your 290 : 110 offspring
ratio really fits a 3/4 : 1/4 ratio (as expected from selfing a
heterozygote)? You can't be certain, but you can at least determine
whether your result is reasonable.

Reasonable
What is a reasonable result is subjective and arbitrary.
For most work a result is said to not differ significantly from
expectations if it could happen at least 1 time in 20. That is, if
the difference between the observed results and the expected
results is small enough that it would be seen at least 1 time in
20 over thousands of experiments, we fail to reject the null
hypothesis.
For technical reasons, we say "fail to reject" instead of
"accept".
"1 time in 20" can be written as a probability value p = 0.05,
because 1/20 = 0.05.
Another way of putting this: if your experimental results deviate from
expectation more than 95% of all similar results would, they get rejected because
you may have used an incorrect null hypothesis.

The test statistic is compared to a
theoretical probability distribution.
In order to use this distribution
properly you need to determine the
degrees of freedom.
If the probability read from
the table is greater than .05 or 5%,
then your hypothesis is not rejected and
the data are consistent with it.
The hypothesis is termed the null
hypothesis, which states that there is
no statistically significant deviation
between observed and expected data.

Degrees of Freedom
A critical factor in using the chi-square
test is the degrees of freedom.
Degrees of freedom is the number of
phenotypic possibilities in your cross
minus one.
Or
Degrees of freedom is simply the
number of classes of offspring minus 1.
For our example, there are 2 classes of
offspring: purple and white. Thus,
degrees of freedom (d.f.) = 2 -1 = 1.

Critical Chi-Square
Critical values for chi-square are found on
tables, sorted by degrees of freedom and
probability levels. Be sure to use p = 0.05.
If your calculated chi-square value is
greater than the critical value from the
table, you reject the null hypothesis.
If your chi-square value is less than the
critical value, you fail to reject the null
hypothesis (that is, the data are consistent with
your genetic theory about the expected
ratio).

Chi-Square Table

Using the Table


In our example of 290 purple to 110 white,
we calculated a chi-square value of 1.333,
with 1 degree of freedom.
Looking at the table, 1 d.f. is the first row,
and p = 0.05 is the sixth column. Here we
find the critical chi-square value, 3.841.
Since our calculated chi-square, 1.333, is
less than the critical value, 3.841, we fail
to reject the null hypothesis. Thus, an
observed ratio of 290 purple to 110 white
is a good fit to a 3/4 to 1/4 ratio.

Chi-square with more than one degree of freedom

9:3:3:1

phenotype         observed   expected proportion   expected number
round yellow      315        9/16                  312.75
round green       101        3/16                  104.25
wrinkled yellow   108        3/16                  104.25
wrinkled green    32         1/16                  34.75
total             556                              556

Finding the Expected Numbers
You are given the observed numbers, and you
determine the expected proportions from a
Punnett square.
To get the expected numbers of offspring, first
add up the observed offspring to get the total
number of offspring. In this case, 315 + 101 +
108 + 32 = 556.
Then multiply total offspring by the expected
proportion:
--expected round yellow = 9/16 * 556 = 312.75
--expected round green = 3/16 * 556 = 104.25
--expected wrinkled yellow = 3/16 * 556 = 104.25
--expected wrinkled green = 1/16 * 556 = 34.75
Note that these add up to 556, the observed total.
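
A minimal sketch of this step in Python, assuming the ratio is given as whole-number parts such as 9:3:3:1. The helper name `expected_counts` is our own; `Fraction` is used only to keep the proportions exact before converting to decimals.

```python
from fractions import Fraction

def expected_counts(observed, ratio):
    """Scale the ratio parts so they sum to the observed total."""
    total = sum(observed)
    parts = sum(ratio)
    return [float(Fraction(r, parts) * total) for r in ratio]

# round yellow, round green, wrinkled yellow, wrinkled green
observed = [315, 101, 108, 32]
expected = expected_counts(observed, [9, 3, 3, 1])
print(expected)        # [312.75, 104.25, 104.25, 34.75]
print(sum(expected))   # 556.0
```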

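Calculating the Chi-Square Value
Use the formula:

$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

$$
\begin{aligned}
\chi^2 &= \frac{(315 - 312.75)^2}{312.75} + \frac{(101 - 104.25)^2}{104.25} + \frac{(108 - 104.25)^2}{104.25} + \frac{(32 - 34.75)^2}{34.75} \\
&= 0.016 + 0.101 + 0.135 + 0.218 \\
&= 0.470
\end{aligned}
$$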

D.F. and Critical Value


Degrees of freedom is 1 less than the
number of classes of offspring. Here, 4 - 1
= 3 d.f.
For 3 d.f. and p = 0.05, the critical chi-square value is 7.815.
Since the observed chi-square (0.470) is
less than the critical value, we fail to reject
the null hypothesis. We accept Mendel's
conclusion that the observed results fit a
9/16 : 3/16 : 3/16 : 1/16 ratio.
It should be mentioned that all of Mendel's
numbers are unreasonably accurate.

Chi-Square Table

Consider another example:
For Drosophila melanogaster

Gene affecting wing shape
c+ = Normal wing
c = Curved wing

Gene affecting body color
e+ = Normal (gray)
e = ebony

Note:
The wild-type allele is designated with a + sign.
Recessive mutant alleles are designated with
lowercase letters.

The Cross:
A cross is made between two true-breeding flies
(c+c+e+e+ and ccee). The flies of the F1 generation
are then allowed to mate with each other to
produce an F2 generation.

The outcome
F1 generation
All offspring have straight wings and gray
bodies
F2 generation
193 straight wings, gray bodies
69 straight wings, ebony bodies
64 curved wings, gray bodies
26 curved wings, ebony bodies
352 total flies

Applying the chi square test


Step 1: Propose a null hypothesis
that allows us to calculate the
expected values based on Mendel's
laws.
The two traits are independently
assorting.

Step 2: Calculate the expected values of the four
phenotypes, based on the hypothesis
According to our hypothesis, there should be a
9:3:3:1 ratio in the F2 generation.

Phenotype                      Expected probability   Expected number     Observed number
straight wings, gray bodies    9/16                   9/16 X 352 = 198    193
straight wings, ebony bodies   3/16                   3/16 X 352 = 66     69
curved wings, gray bodies      3/16                   3/16 X 352 = 66     64
curved wings, ebony bodies     1/16                   1/16 X 352 = 22     26

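Step 3: Apply the chi square formula

$$
\begin{aligned}
\chi^2 &= \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \frac{(O_4 - E_4)^2}{E_4} \\
&= \frac{(193 - 198)^2}{198} + \frac{(69 - 66)^2}{66} + \frac{(64 - 66)^2}{66} + \frac{(26 - 22)^2}{22} \\
&= 0.13 + 0.14 + 0.06 + 0.73 \\
&= 1.06
\end{aligned}
$$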
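
Steps 2 and 3 can be checked with a short script. This is a sketch using the F2 counts from the cross (193, 69, 64, 26); the variable names are our own. Note that summing the unrounded terms gives about 1.05, while summing the terms after rounding each to two decimals gives the 1.06 used on the slides.

```python
phenotypes = [
    "straight wings, gray bodies",
    "straight wings, ebony bodies",
    "curved wings, gray bodies",
    "curved wings, ebony bodies",
]
observed = [193, 69, 64, 26]          # F2 counts from the cross
ratio = [9, 3, 3, 1]                  # expected under independent assortment

total = sum(observed)                 # 352 flies
expected = [total * r / sum(ratio) for r in ratio]

# Contribution of each class: (O - E)^2 / E
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
for name, o, e, t in zip(phenotypes, observed, expected, terms):
    print(f"{name:30s} O={o:4d} E={e:6.1f} term={t:.3f}")

chi_square_value = sum(terms)
print(f"chi-square = {chi_square_value:.3f}")
```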

Step 4: Interpret the chi square value


The calculated chi square value can be used to
obtain probabilities, or P values, from a chi square
table
These probabilities allow us to determine the likelihood
that the observed deviations are due to random chance
alone

Low chi square values indicate a high probability
that the observed deviations could be due to
random chance alone.
High chi square values indicate a low probability
that the observed deviations are due to random
chance alone.
If the chi square value results in a probability that
is less than 0.05 (i.e. less than 5%), it is considered
statistically significant, and the null hypothesis is rejected.

Before we can use the chi square table, we have
to determine the degrees of freedom (df).
The df is a measure of the number of
categories that are independent of each other.
If you know 3 of the 4 categories, you can
deduce the 4th.
df = n - 1
where n = total number of categories
In our experiment, there are four
phenotypes/categories.
Therefore, df = 4 - 1 = 3
Refer to the chi-square table.

With df = 3, the chi square value of 1.06 is slightly
greater than 1.005 (which corresponds to P-value
= 0.80)
P-value = 0.80 means that Chi-square values
equal to or greater than 1.005 are expected to
occur 80% of the time due to random chance
alone; that is, when the null hypothesis is true.
Therefore, it is quite probable that the deviations
between the observed and expected values in this
experiment can be explained by random sampling
error and the null hypothesis is not rejected.
What was the null hypothesis?
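
Instead of reading the P value off a table, it can be computed directly. The sketch below is one way to do that using only Python's standard library, not the method used on the slides: it starts from the exact tail probability for 1 or 2 degrees of freedom and steps up two degrees of freedom at a time. The function name `chi_square_p_value` is our own.

```python
import math

def chi_square_p_value(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        p = math.erfc(math.sqrt(half))    # exact tail probability for df = 1
        k = 1
    else:
        p = math.exp(-half)               # exact tail probability for df = 2
        k = 2
    # Step up two degrees of freedom at a time:
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        p += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return p

p = chi_square_p_value(1.06, 3)
print(f"P = {p:.3f}")
```

For a chi square value of 1.06 with df = 3 this gives a P value a little under 0.80, in line with the table lookup above.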

If your hypothesis is supported by the data,
you are claiming that mating is random and
so are segregation and independent
assortment.
If your hypothesis is not supported by the data,
you are seeing that the observed and
expected values are very far apart:
something non-random must be
occurring.

Let's look at another fruit fly cross

[Figure: wild type x black body, eyeless. F1: all wild type. F1 x F1 gives four F2 phenotype classes with counts 5610, 1896, 1881 and 622.]

Analysis of the results


Once the numbers are in, you have
to determine the cross that you were
using.
What is the expected outcome of this
cross?
9/16 wild type: 3/16 normal body
eyeless: 3/16 black body wild eyes:
1/16 black body eyeless.

Now Conduct the Analysis:

To compute the expected values, take the total number of F2 flies,
5610 + 1896 + 1881 + 622 = 10009, and multiply by each expected proportion.
For example, 10009/16 = 626 for the 1/16 class.

$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

Using the chi square formula compute
the chi square total for this cross
(the terms are computed from the unrounded expected values):

$$
\begin{aligned}
(5610 - 5630)^2 / 5630 &= .07 \\
(1881 - 1877)^2 / 1877 &= .01 \\
(1896 - 1877)^2 / 1877 &= .20 \\
(622 - 626)^2 / 626 &= .02 \\
\chi^2 &= .30
\end{aligned}
$$

How many degrees of freedom? 3
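
A quick check of this cross, as a sketch with our own variable names. It compares the chi square total obtained from the unrounded expected counts with the total obtained from the rounded counts shown on the slide (5630, 1877, 1877, 626); both come to about .30.

```python
observed = [5610, 1896, 1881, 622]    # F2 counts from the cross
ratio = [9, 3, 3, 1]                  # 9/16 : 3/16 : 3/16 : 1/16

total = sum(observed)                 # 10009 flies
exact = [total * r / 16 for r in ratio]
rounded = [round(e) for e in exact]   # 5630, 1877, 1877, 626

def chi_square(obs, exp):
    """Sum (obs - exp)^2 / exp over all phenotype classes."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

print(f"unrounded expected: {chi_square(observed, exact):.2f}")
print(f"rounded expected:   {chi_square(observed, rounded):.2f}")
```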

CHI-SQUARE DISTRIBUTION TABLE

Columns with p above 0.05 fall under "Accept Hypothesis"; columns with p of 0.05 or below fall under "Reject Hypothesis".

Degrees of   Probability (p)
Freedom      0.95    0.90   0.80   0.70   0.50   0.30    0.20    0.10    0.05    0.01    0.001
1            0.004   0.02   0.06   0.15   0.46   1.07    1.64    2.71    3.84    6.64    10.83
2            0.10    0.21   0.45   0.71   1.39   2.41    3.22    4.60    5.99    9.21    13.82
3            0.35    0.58   1.01   1.42   2.37   3.66    4.64    6.25    7.82    11.34   16.27
4            0.71    1.06   1.65   2.20   3.36   4.88    5.99    7.78    9.49    13.38   18.47
5            1.14    1.61   2.34   3.00   4.35   6.06    7.29    9.24    11.07   15.09   20.52
6            1.63    2.20   3.07   3.83   5.35   7.23    8.56    10.64   12.59   16.81   22.46
7            2.17    2.83   3.82   4.67   6.35   8.38    9.80    12.02   14.07   18.48   24.32
8            2.73    3.49   4.59   5.53   7.34   9.52    11.03   13.36   15.51   20.09   26.12
9            3.32    4.17   5.38   6.39   8.34   10.66   12.24   14.68   16.92   21.67   27.88
10           3.94    4.86   6.18   7.27   9.34   11.78   13.44   15.99   18.31   23.21   29.59

When reporting chi square data use the
following formula sentence.
With ____ degrees of freedom, my chi square
value is ____, which gives me a p value
between ____% and ____%. I therefore
____ my null
hypothesis.
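
As an illustration, the sentence can be filled in automatically from one row of the chi-square distribution table. This is a sketch only: the names `P_COLUMNS`, `ROW_DF3`, `bracket_p` and `report` are hypothetical, the row values are copied from the df = 3 row of the table in this document, and "fail to reject" is used where the slides say "accept".

```python
# Probability columns and the df = 3 row, copied from the table in this document.
P_COLUMNS = [0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]
ROW_DF3 = [0.35, 0.58, 1.01, 1.42, 2.37, 3.66, 4.64, 6.25, 7.82, 11.34, 16.27]

def bracket_p(chi_square_value, row):
    """Return (higher_p, lower_p) for the two columns that bracket the value.

    None on one side means the value lies beyond that end of the table.
    """
    higher, lower = None, None
    for p, critical in zip(P_COLUMNS, row):
        if chi_square_value >= critical:
            higher = p          # value is at or past this column
        else:
            lower = p           # first column the value has not reached
            break
    return higher, lower

def report(chi_square_value, df, row):
    """Fill in the reporting sentence for one chi square result."""
    higher, lower = bracket_p(chi_square_value, row)
    verdict = "reject" if higher is not None and higher <= 0.05 else "fail to reject"
    hi_txt = "100" if higher is None else f"{higher * 100:g}"
    lo_txt = "0" if lower is None else f"{lower * 100:g}"
    return (f"With {df} degrees of freedom, my chi square value is "
            f"{chi_square_value}, which gives me a p value between "
            f"{lo_txt}% and {hi_txt}%. I therefore {verdict} my null hypothesis.")

print(report(1.06, 3, ROW_DF3))
```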

This sentence would go in the results
section of your formal lab.
Your explanation of the significance of
this data would go in the discussion
section of the formal lab.

Looking this statistic up on the chi
square distribution table tells us
the following:
the P value read off the table
places our chi square number of .30
close to .95 or 95%.
This means that deviations at least this
large between observed and expected data
would occur about 95% of the time
through random chance alone.
We therefore fail to reject (accept) our null
hypothesis.

What is the critical value at which
we would reject the null hypothesis?
For three degrees of freedom, we reject when
our chi square value is > 7.815.
What if our chi square value was 8.0
with 4 degrees of freedom, do we
accept or reject the null hypothesis?
Accept, since 8.0 is below the critical value
of 9.49 for 4 degrees of freedom.
