
Using the Chi-Square Statistical Test

Adapted from Carter, J. G. and M. E. Carter. 1999. Appendix B: Chi-Square Test. In Investigating Biology, 3rd ed. The Benjamin/Cummings Publishing Company, Inc., Menlo Park, CA.

Chi-square is a statistical test commonly used to compare observed data with data we would expect to obtain according to a specific scientific hypothesis. Were the deviations (differences between observed and expected) the result of chance, or were they due to other factors? How much deviation can occur before the investigator must conclude that something other than chance is at work, causing the observed to differ from the expected? The chi-square test can help in making that decision. The chi-square test is always testing what scientists call the null hypothesis, which states that there is no significant difference between the expected and the observed result. The formula for calculating chi-square (χ²) is:

$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
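As a minimal sketch of the formula (the squared difference between each observed and expected count, divided by the expected count, summed over all categories), the calculation could be written in Python as below. The function name `chi_square` and the counts in the usage line are illustrative assumptions, not part of the original handout.

```python
def chi_square(observed, expected):
    """Return the sum of (o - e)^2 / e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, chosen only to show the call: 60 vs. 40 observed,
# 50 and 50 expected.
print(chi_square([60, 40], [50, 50]))  # 4.0
```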

That is, chi-square is the sum, over all possible categories, of the squared difference between observed (o) and expected (e) data (the squared deviation, d²), divided by the expected data.

For example, suppose you would like to determine if horned lizards prefer to eat particular ants. You ask the question, "Do horned lizards eat one ant species more frequently than another?" After collecting a bunch of hungry horned lizards and thousands of individuals each of two ant species, harvester ants and thatching ants, you devise the following hypotheses:

Ha: When exposed to equal numbers of harvester ants and thatching ants, horned lizards prefer one ant species over another. (The ratio of harvester ants eaten to thatching ants eaten is not 1:1, or there is a significant difference in the numbers of each ant species eaten.)

Ho: When exposed to equal numbers of harvester ants and thatching ants, horned lizards do not prefer one ant species over another. (The ratio of harvester ants eaten to thatching ants eaten is 1:1, or there is no significant difference in the numbers of each ant species eaten.)

If each lizard does not eat significantly more of one ant species than the other on average, then you fail to reject the null hypothesis because the data are consistent with a 1:1 ratio. If each lizard eats significantly more of one ant species than another on average (the ratio is not 1:1), then you may reject the null, which provides support for the research hypothesis. Remember, you always test the null hypothesis; therefore, you'd expect the ratio of harvester ants eaten to thatching ants eaten to be 1:1 if the null is true.

Let's say you find that on average each lizard eats 106 harvester ants and 73 thatching ants. To determine whether the null hypothesis is rejected or not, a χ² value is computed. To calculate χ², first determine the number expected in each category. If the ratio is 1:1 and the total number of ants observed eaten is 179 (106 harvester ants plus 73 thatching ants), then the expected numerical values should be 89.5 harvester ants and 89.5 thatching ants.
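The expected numbers follow from splitting the observed total according to the ratio stated in the null hypothesis. A short Python sketch of that step is shown here; the helper name `expected_counts` is an assumption made for illustration.

```python
def expected_counts(observed, ratio):
    """Split the observed total across categories in the given ratio."""
    total = sum(observed)   # 106 + 73 = 179 ants in the example
    parts = sum(ratio)      # a 1:1 ratio has 2 parts
    return [total * r / parts for r in ratio]


observed = [106, 73]  # harvester ants, thatching ants
print(expected_counts(observed, [1, 1]))  # [89.5, 89.5]
```

The same helper would handle other hypothesized ratios (for instance 3:1) by changing the second argument.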

Then calculate χ² using the formula shown above. See Table B.1. Note that we get a value of 6.08 for χ². But what does this number mean? Here's how to interpret the χ² value:

1. Determine degrees of freedom (df). Degrees of freedom can be calculated as the number of categories in the problem minus 1. For example, if we had four categories, then the degrees of freedom would be 3. In our example, there are two categories (harvester ants and thatching ants); therefore, there is 1 degree of freedom.

2. Determine a relative standard to serve as the basis for rejecting the hypothesis. The relative standard commonly used in biological research is p < 0.05. This threshold (the significance level) is the probability of rejecting the null hypothesis (that there is no difference between observed and expected) when the null hypothesis is true. When we pick p < 0.05, we state that there is less than a 5% chance of error of stating that there is a difference when, in fact, there is no significant difference.

3. Refer to a chi-square distribution table (Table B.2). Using the appropriate degrees of freedom, locate the value corresponding to the p value of 0.05, the error probability that you selected. If your χ² value is greater than the value corresponding to the p value of 0.05, then you reject the null hypothesis. You conclude that the observed numbers are significantly different from the expected.

In this example your calculated χ² value, 6.08, is larger than the table χ² value of 3.84 (df = 1, p = 0.05). Therefore, you may reject the null hypothesis, which provides support for the research hypothesis. In other words, when exposed to equal numbers of harvester ants and thatching ants, horned lizards eat significantly more of one species. Differences between the mean numbers of ants eaten cannot be entirely attributed to chance.
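The three interpretation steps can be sketched as a table lookup followed by a comparison. In the Python below, the critical values are copied from Table B.2 (p = 0.05); the names `CRITICAL_VALUES_05`, `degrees_of_freedom`, and `reject_null` are illustrative assumptions.

```python
# Critical chi-square values at p = 0.05, copied from Table B.2.
CRITICAL_VALUES_05 = {
    1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07,
    6: 12.59, 7: 14.07, 8: 15.51, 9: 16.92, 10: 18.31,
}


def degrees_of_freedom(n_categories):
    """Number of categories minus 1."""
    return n_categories - 1


def reject_null(chi_sq, df):
    """True if the calculated value exceeds the table value for this df."""
    return chi_sq > CRITICAL_VALUES_05[df]


df = degrees_of_freedom(2)    # harvester ants and thatching ants
print(reject_null(6.08, df))  # True: 6.08 > 3.84
```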

Step-by-Step Procedure for Testing Your Null Hypothesis and Calculating Chi-Square

1. State the research hypothesis and the null hypothesis that is being tested.

2. Gather the data by conducting the relevant experiment.

3. Determine the expected numbers for each observational class. Remember to use numbers, not percentages.

4. Calculate χ² using the formula. Complete all calculations to two decimal places.

5. Use the chi-square distribution table to determine the significance of the value.
   a. Determine the degrees of freedom, one less than the number of categories. Locate that value in the appropriate column.
   b. Locate the χ² value for your significance level (p = 0.05).
   c. Compare this χ² value (from the table) with your calculated χ².

6. State your conclusions in terms of your hypotheses.
   a. If your calculated χ² value is greater than the χ² value for your particular degrees of freedom and p value (0.05), then reject the null hypothesis of no difference between expected and observed results. You can conclude that there is a significant difference between what you observed and the results you expected if your null hypothesis is true.
   b. If your calculated χ² value is less than the χ² value for your particular degrees of freedom and p value (0.05), then fail to reject the null hypothesis of no difference between observed and expected results. You can conclude that there does not seem to be a significant difference between what you observed and the results you expected if your null hypothesis is true. You can conclude that any differences between your observed results and the expected results can be attributed to chance or sampling error.

(Note: It is incorrect to say that you "accept" the null hypothesis. Statisticians either "reject" or "fail to reject" the null hypothesis.)
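Steps 3 through 6 of the procedure could be strung together as in the following Python sketch. It is one possible arrangement under stated assumptions: the function name `run_chi_square_test` and the dictionary it returns are hypothetical, the critical values are those listed in Table B.2, and only the final χ² value is rounded to two decimal places.

```python
# Critical chi-square values at p = 0.05, copied from Table B.2.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07,
               6: 12.59, 7: 14.07, 8: 15.51, 9: 16.92, 10: 18.31}


def run_chi_square_test(observed, expected_ratio):
    """Follow steps 3 to 6 for one set of observed counts."""
    # Step 3: expected numbers (counts, not percentages).
    total = sum(observed)
    parts = sum(expected_ratio)
    expected = [total * r / parts for r in expected_ratio]
    # Step 4: chi-square, reported to two decimal places.
    chi_sq = round(
        sum((o - e) ** 2 / e for o, e in zip(observed, expected)), 2)
    # Step 5: degrees of freedom and the matching table value.
    df = len(observed) - 1
    critical = CRITICAL_05[df]
    # Step 6: the conclusion is "reject" or "fail to reject", never "accept".
    decision = "reject" if chi_sq > critical else "fail to reject"
    return {"expected": expected, "chi_square": chi_sq, "df": df,
            "critical": critical, "decision": decision}


result = run_chi_square_test([106, 73], [1, 1])
print(result["chi_square"], result["decision"])  # 6.08 reject
```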

Table B.1 Calculating Chi-Square

                       harvester ants   thatching ants
observed (o)               106              73
expected (e)                89.5            89.5
deviation (o - e)           16.5           -16.5
deviation² (d²)            272.25          272.25
d²/e                         3.04            3.04

χ² = Σ (o - e)²/e = Σ d²/e = 6.08
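Written out term by term with the lizard example's numbers (observed 106 and 73, expected 89.5 each), the sum is:

$$\chi^2 = \frac{(106 - 89.5)^2}{89.5} + \frac{(73 - 89.5)^2}{89.5} = \frac{272.25}{89.5} + \frac{272.25}{89.5} \approx 3.04 + 3.04 = 6.08$$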

Table B.2 Chi-Square Distribution Table


degrees of freedom (df)   critical value for the χ²-test (α = 0.05)
 1                         3.84
 2                         5.99
 3                         7.82
 4                         9.49
 5                        11.07
 6                        12.59
 7                        14.07
 8                        15.51
 9                        16.92
10                        18.31

Source: R. A. Fisher and F. Yates, Statistical Tables for Biological, Agricultural, and Medical Research, 6th ed., Table IV, Longman Group UK Ltd., 1974.
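As an optional sanity check, the first two entries of Table B.2 can be reproduced with Python's standard library. This sketch relies on two standard properties of the chi-square distribution that the handout itself does not state: with 1 degree of freedom it is the square of a standard normal variable, and with 2 degrees of freedom the probability of exceeding x is exp(-x/2).

```python
import math
from statistics import NormalDist

# df = 1: the upper 5% point is the square of the two-sided 5% normal cutoff.
crit_df1 = NormalDist().inv_cdf(0.975) ** 2

# df = 2: solve exp(-x / 2) = 0.05 for x.
crit_df2 = -2 * math.log(0.05)

print(round(crit_df1, 2), round(crit_df2, 2))  # 3.84 5.99
```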