of the population distribution(s) from which one's data are drawn. Parametric tests can provide
trustworthy results with distributions that are skewed and non-normal, provided the sample size is
large enough. Parametric tests can also provide trustworthy results when the groups have different
amounts of variability. Parametric tests generally have greater statistical power than their
non-parametric counterparts.
The chi-square test is one of the simplest and most widely used non-parametric tests in statistics.
The symbol χ is the Greek letter chi, and the test is also written as the χ² test. It measures the
magnitude of the discrepancy between theory and observation. In general, “The test we use to measure
the differences between what is observed and what is expected according to an assumed hypothesis is
called the chi-square test.” As a non-parametric test, it is based on frequencies and not on
parameters such as the mean and standard deviation. It is used for testing hypotheses and is not
useful for estimation. More formally, a χ² test is any statistical hypothesis test in which the
sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is
true. It is used to determine whether there is a significant difference between the expected
frequencies and the observed frequencies in one or more categories.
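The comparison of observed and expected frequencies described above is usually written as Pearson's statistic. The notation here is an assumption added for illustration and is not taken from the text: O_i is the observed frequency in category i, E_i is the expected frequency under the null hypothesis, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

A large value means the observed counts are far from what the hypothesis predicts; a value near zero means they agree closely.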
The chi-square test can be used for:
1. Goodness of Fit: This test shows how well an assumed theoretical distribution (such as the
binomial, Poisson, or normal distribution) fits the observed data. Each individual in the sample is
classified into one category on the scale of measurement. The data, called observed frequencies,
simply count how many individuals from the sample fall in each category. One way in which the
chi-square goodness-of-fit test can be used is to examine how closely a sample matches a population.
EXAMPLE:
Question: Suppose the Penn State student population is 20% PA resident and 80% non-PA resident.
If a sample of 100 students yields 16 PA residents and 84 non-PA residents, how well do the data
fit the assumed probability model of 20% PA resident and 80% non-PA resident?
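Solution: We can use the chi-square goodness-of-fit statistic to test the null hypothesis that the
data follow the assumed model. With 100 students, the expected counts are 20 PA residents and 80
non-PA residents, so:
$$\chi^2 = \frac{(16-20)^2}{20} + \frac{(84-80)^2}{80} = 0.8 + 0.2 = 1$$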
The critical value of χ² with 1 degree of freedom at the 5% level of significance is 3.84. Since
1 < 3.84, we cannot reject the null hypothesis. There is not enough evidence to conclude that the
data don't fit the assumed probability model at the 5% level of significance.
In other words, the students who were randomly selected in this example are consistent with the
probability distribution that was specified.
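The arithmetic in this example can be checked with a short Python sketch. The function and variable names are illustrative and not from the text, the critical value 3.84 is the one quoted above, and the p-value uses the standard closed form for 1 degree of freedom.

```python
import math

def chi_square_gof(observed, probs):
    """Return (statistic, expected counts) for a chi-square goodness-of-fit test."""
    n = sum(observed)
    # Expected count in each category under the assumed probability model.
    expected = [n * p for p in probs]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected

def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

observed = [16, 84]       # PA resident, non-PA resident (from the example)
probs = [0.20, 0.80]      # assumed probability model
gof_statistic, gof_expected = chi_square_gof(observed, probs)
gof_p_value = p_value_df1(gof_statistic)

CRITICAL_5PCT_DF1 = 3.84  # critical value quoted in the text
reject_null = gof_statistic > CRITICAL_5PCT_DF1

print(f"chi-square = {gof_statistic:.3f}, p = {gof_p_value:.3f}, reject H0: {reject_null}")
```

The statistic falls below the critical value, so the sketch reaches the same decision as the worked solution: the null hypothesis is not rejected.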
2. Test of Independence of Attributes: This test lets us determine whether or not two attributes are
associated. For instance, we may be interested in knowing whether a new medicine is effective in
controlling fever or not. In such a situation, we proceed with the null hypothesis that the two
attributes (viz., new medicine and control of fever) are independent, which means that the new
medicine is not effective in controlling fever.
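As a minimal sketch of how such a test is carried out, the Python below computes the expected count of each cell as (row total × column total) / grand total and sums the squared deviations. The 2×2 counts are hypothetical numbers invented for illustration; they are not data from the text, and the function name is likewise an assumption.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two attributes were independent.
            exp = row_totals[i] * col_totals[j] / n
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# HYPOTHETICAL counts (not from the text):
# rows = new medicine given / not given, columns = fever controlled / not controlled
hypothetical_counts = [
    [30, 10],
    [20, 20],
]
indep_statistic, indep_df = chi_square_independence(hypothetical_counts)
print(f"chi-square = {indep_statistic:.3f} with {indep_df} degree(s) of freedom")
```

With these made-up counts the statistic exceeds 3.84, the 5% critical value for 1 degree of freedom, so independence would be rejected; different counts could lead to the opposite conclusion.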
EXAMPLE:
Is gender independent of education level? A random sample of 395 people was surveyed and each
person was asked to report the highest education level they had obtained. The data that resulted
from the survey are summarized in the following table:
Question:- Are gender and education level dependent at 5% level of significance? In other words,
given the data collected above, is there a relationship between the gender of an individual and the
level of education that they have obtained?
[Table: gender by highest education level; columns: High School, Bachelors, Masters, Ph.D., Total]
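Since the computed statistic χ² = 8.006 is greater than the critical value 7.815 (3 degrees of
freedom, 5% level of significance), we reject the null hypothesis and conclude that gender and
education level are not independent at the 5% level of significance.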
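The decision can also be expressed as a p-value. The sketch below assumes 3 degrees of freedom, which is what a table with 2 gender rows and 4 education columns would give, (2 − 1)(4 − 1) = 3, and which matches the critical value 7.815. It uses the closed-form upper-tail probability of the chi-square distribution for that case; the function name is illustrative.

```python
import math

def chi2_upper_tail_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

p_observed = chi2_upper_tail_df3(8.006)   # statistic reported in the example
p_critical = chi2_upper_tail_df3(7.815)   # critical value reported in the example

print(f"p-value at 8.006: {p_observed:.4f}")
print(f"tail area at 7.815: {p_critical:.4f}")
```

The tail area at the critical value is about 0.05, and the p-value for 8.006 is slightly smaller, which agrees with rejecting the null hypothesis at the 5% level.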