
Welcome to the 2nd week of IST622. This week focuses on inferential statistics.

Specifically, you will learn basic concepts of inferential statistics.

1
Descriptive statistics provide basic measures of a distribution of scores, while
inferential statistics allow inferences about a larger population to be drawn from a sample.

Estimating population parameters and hypothesis testing are the two basic ways that
inferential statistics are used. Any generalization or prediction about one or
more groups (populations) based upon parts of those groups (samples) is called an
inference. Remember, the groups can be people, objects, or events. Inferential
techniques are not as familiar to most people as descriptive techniques.
The most common techniques used in instructional technology include the t-test,
analysis of variance (ANOVA), and Chi-Square.

2
The table above helps you understand how the four concepts (population, sample,
descriptive statistics, and inferential statistics) are related.

It is important to know that all population statistics are descriptive. Technically, these
statistics are called parameters because they describe certain characteristics of an
entire population. If you already know the parameters of a population, there is no
need to infer anything; thus, all parameters are descriptive. In reality, it is
very difficult, if not impossible, to obtain information on an entire population.

3
Often, researchers are searching for differences between people, objects, or events
and for explanations for the differences they find. This is often done by testing
hypotheses.

Hypothesis testing is necessary because statistical values obtained from a sample
may be close to the actual population value, but they will not be exactly the same.
This error is called sampling error. When the error is due to chance or luck, the
laws of probability can be used to assess the possible magnitude of the error.

4
An example:
Survey data indicated that 73% of the hourly workers in company A and 68% of
the hourly workers in company B preferred working four 10-hour days instead
of five 8-hour days per week. This suggests that a higher percentage of
workers in company A prefer the 4-day, 10-hour option (73% vs. 68%). Given
that both percentages were derived from samples, it is possible that the
difference obtained is due to sampling error. In other words, it is possible that
the population percentage for company A workers and the population percentage
for company B workers are identical, and the difference we found in sampling the
two groups of workers is due to chance alone. This possibility is known as the
null hypothesis.

Besides the above statement, the null hypothesis can also be stated as:
1. There is no true difference between the percentages.
2. The observed difference between the percentages was created by sampling
error.
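The notes do not report the sample sizes, but the null hypothesis above can be checked with a standard two-proportion z-test. The sketch below assumes 200 workers sampled from each company (a made-up number, purely for illustration); with those sizes, a 73% vs. 68% difference is well within the range that sampling error alone could produce.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Two-proportion z-test: is the difference p1 - p2 larger than
    sampling error alone would plausibly produce?"""
    # Pooled proportion under the null hypothesis (no true difference)
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 73% in company A vs. 68% in company B; the sample sizes (200 each)
# are hypothetical -- the notes do not give them.
z, p = two_proportion_z(0.73, 200, 0.68, 200)
print(round(z, 3), round(p, 3))  # p comes out well above .05
```

With these assumed sizes, the test would not reject the null hypothesis: the observed 5-point gap is the kind of difference chance alone can easily create.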

5
Researchers or analysts will often have their own professional judgment or
hypothesis, based either on a rigorous literature review or on experience. It is usually
inconsistent with the null hypothesis and is called the research hypothesis or alternative
hypothesis.

Returning to the example on the previous slide, a needs analyst may believe that
there will be differences in the responses of hourly workers in companies A and B on
this matter, but the analyst is not willing to speculate in advance which group of
workers will be more or less in favor of a change. In research terminology, this is
called a non-directional research hypothesis.

On the other hand, what if the analyst knows in advance that company B has more
female hourly workers with school-age children than company A? These
workers have previously expressed a desire to be home with their children after
school, and working eight hours per day accommodates this need. The analyst may
want to hypothesize that fewer hourly workers in company B will want to change to
a 4-day, 10-hour schedule. In this case, the analyst would state a directional
hypothesis. If the research hypothesis is directional, the null hypothesis in this
case is also directional; H1 and H0 are always mutually exclusive. In this second
situation, the null hypothesis would be Ho: P1 <= P2. Therefore, the null hypothesis
can be directional as well.

6
Statistical significance is the degree of risk that you are willing to take that you will
reject a null hypothesis when it is actually true. It refers to the value or measure of a
variable when it is larger or smaller than would be expected by chance alone. It
comes into play when the researcher is making inferences about population
parameters from sample statistics.

The level of significance, symbolized by the Greek letter α (alpha), has certain
conventional values associated with it, such as .01 and .05. If the level of
significance is .01, it means that on any one test of the null hypothesis, there is a
1% chance you will reject the null hypothesis when it is true. If the level of
significance is .05, there is a 5% chance you will reject it when the null is true.
A less-than-5% chance of reaching a wrong conclusion is the conventional level of
significance for most social science research. For medical tests, the level of
significance is usually much smaller, such as .005 or .001.
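A small simulation can make the meaning of the .05 level concrete. The sketch below repeatedly draws samples from a population where the null hypothesis really is true (mean 0) and counts how often a two-sided z-test at the .05 level rejects anyway; the rejection rate comes out near 5%. The sample size, seed, and number of trials are arbitrary choices for illustration.

```python
import math
import random

# When the null hypothesis is true, alpha = .05 means about 5% of tests
# will (wrongly) reject it. Here the population really is standard normal
# with mean 0, so every rejection is a Type I error.
random.seed(1)
n, trials, z_critical = 25, 20000, 1.96  # 1.96 = two-sided .05 cutoff
rejections = 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) * math.sqrt(n)  # z = mean / (sigma / sqrt(n)), sigma = 1
    if abs(z) > z_critical:
        rejections += 1
type1_rate = rejections / trials
print(type1_rate)  # close to .05
```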

7
Statistical significance is usually represented as p or p-value. If you see p<.02, it
reads as the probability of obtaining that outcome is less than 2% and is often
expressed as significant at the .05 level because the conventional values are often
expressed as .01, .05, .10 etc.

Level of significance, alpha, is what the researcher or analyst would determine. And
the p-value is often what the statistics analysis data returns. If a significance test
gives a p-value lower than the pre-determined -level, the null hypothesis is
rejected and we can conclude that the test result is statistically significant.

In Excel, you may see p-values are returned using scientific notation such as 3.12E
04. This really means 0.000312 . Therefore, pay attention to scientific notation
when you read test result in Excel.
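A quick check of the scientific-notation point, in Python (which uses the same E notation as Excel):

```python
# 3.12E-04 means 3.12 x 10^-4, i.e., 0.000312 -- a very small p-value,
# not a large one. Misreading it as 3.12 is a common mistake.
p_value = float("3.12E-04")
print(p_value)          # 0.000312
alpha = 0.05
print(p_value < alpha)  # True: reject the null hypothesis
```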

8
For example, suppose an analyst took a large sample of hourly workers from a
company in California and one in Florida. In comparing the data, the analyst finds
that the average age of California workers is 33 years and the Florida workers is 36
years. If the samples were large and representative, this 3-year age difference
would be unlikely to be due to chance or sampling error alone. It would be
statistically significant. However, it would be hard to find a reason that a 3-year
difference in age would have any practical significance for hourly workers in the
two states. Therefore, a statistically significant finding may not be practically
significant. On the other hand, if a finding is not statistically significant, it
cannot be practically significant. We often compute Effect Size to help interpret
whether a finding is practically significant or not. However, we will not have time to
cover this part of computation in this class. You should learn about Effect Size on
your own if you are interested in knowing more about it.

9
The variance of a sampling distribution depends on sample size, or more
specifically on degrees of freedom, abbreviated df. The term comes from the physical
sciences, where it refers to the number of directions in which an object is free to
move. In statistics, degrees of freedom refers to the number of scores
whose values are free to vary. Another way of thinking about the restriction principle
behind degrees of freedom is to imagine contingencies. For example, imagine you
have five numbers (a, b, c, d, and e) that must add up to a total of m; you are free to
choose the first four numbers at random, but the fifth must be chosen (you have no
choice over what number this should be) so that it makes the total equal to m. Thus,
your degrees of freedom are four.
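The five-number example can be sketched directly (the particular numbers are arbitrary):

```python
# Degrees of freedom: five numbers must add up to m. The first four can
# be anything; the fifth is then forced, so df = 5 - 1 = 4.
m = 100
free_choices = [12, 37, 5, 21]     # chosen freely -- 4 free values
forced = m - sum(free_choices)     # no choice left for the fifth number
numbers = free_choices + [forced]
print(forced, sum(numbers))        # 25 100
```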

10
Since sample size is a critical factor to consider in research studies, we always
report the df for hypothesis testing. It influences the critical value (cut-off score)
for determining whether or not to reject a hypothesis based on the test result.
The critical value is the value that a test statistic must exceed in order for the null
hypothesis to be rejected. For example, the critical value of t (with 12 degrees of
freedom, using the .05 significance level) is 2.18. This means that for the probability value
to be less than or equal to 0.05, the absolute value of the t statistic must be 2.18 or
greater. Again, we are talking about absolute values only (not considering the + or −
signs).

11
You often hear about Type I and Type II errors in testing statistical significance. As
shown in this table, they concern whether your conclusion (inference) from
sample data to the population is correctly made or not. Ideally, you want to minimize
both Type I and Type II errors, but doing so is not always easy or under your control.
You have control over the Type I error because you actually set the level of
significance. Type II errors are not as directly controlled but instead are related to
factors such as sample size. Type II errors are particularly sensitive to the number
of subjects in a sample and, as that number increases, Type II error decreases. In
other words, as the sample characteristics more closely match that of the population
(often achieved by increasing the sample size), the likelihood that you will accept a
false null hypothesis also decreases.
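This relationship between sample size and Type II error can be illustrated with the standard power formula for a two-sided z-test. The 0.3 standardized effect size below is an arbitrary choice for illustration; Type II error = 1 − power, and it shrinks as n grows:

```python
import math

def normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power(effect, n, z_crit=1.96):
    """Probability of rejecting H0 with a two-sided z-test when the true
    standardized effect size is `effect`. Type II error = 1 - power."""
    shift = effect * math.sqrt(n)  # how far the true mean sits from 0, in SEs
    return (1 - normal_cdf(z_crit - shift)) + normal_cdf(-z_crit - shift)

# A modest true effect (0.3 SD): Type II error shrinks as n grows.
for n in (10, 50, 100, 200):
    print(n, round(1 - power(0.3, n), 3))  # Type II error rate
```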

12
1. State the null hypothesis
e.g., Ho: μ1 = μ2 or Ho: μ1 >= μ2
2. Establish the significance level
e.g., α = .05
e.g., α = .01
3. Select the appropriate test statistic
4. Compute the test statistic (we use Excel to run this part)
5. Determine the value needed to reject the null (critical value), which depends on
a. the level of significance chosen (e.g., α = 0.05)
b. the degrees of freedom (based on sample size)
6. Compare the obtained value to the critical value
7. If obtained value > critical value, reject the null
8. If obtained value <= critical value, accept the null
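Steps 5 through 8 can be sketched as a small decision function, using the critical t of 2.18 (df = 12, α = .05) cited earlier; the obtained t values below are made up for illustration:

```python
# Steps 5-8 of the hypothesis-testing procedure: compare the obtained
# test statistic to the critical value from a table.
def decide(obtained, critical):
    # Compare absolute values only -- the sign of t does not matter
    # for a two-sided test.
    if abs(obtained) > critical:
        return "reject the null hypothesis"
    return "accept (fail to reject) the null hypothesis"

critical_t = 2.18  # from a t table: df = 12, alpha = .05
print(decide(2.95, critical_t))   # |2.95| > 2.18 -> reject
print(decide(-1.40, critical_t))  # |1.40| < 2.18 -> accept
```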

13
14
Chi-Square is used to test differences among sample frequency (nominal) data. The
most frequent use of the Chi-Square test is to determine whether there are statistically
significant differences between observed frequencies and expected (or hypothetical)
frequencies for two or more variables. The frequency data are typically presented in
a contingency table or cross-tabulation. The larger the observed discrepancy is in
comparison to the expected frequency, the larger the Chi-Square statistic and the
more likely the observed differences are statistically significant.

The Chi-Square test is used with categorical data represented as frequency counts or
percentages. The researcher collects frequency data indicating the number of
subjects in each category and arranges the data in a table format. The Chi-Square
technique enables the researcher to answer two types of questions. First, a
researcher can test how one distribution might differ from a predetermined
theoretical distribution. This is also called testing goodness of fit. Second, the
analyst can compare responses obtained from among or between various groups.
This is called testing for independence. Both types of situations are tested using
the same procedure.

15
16
17
If you use the Chi-Square formula to calculate, the Chi-Square value is 34.32. The
degrees of freedom are 3 (the number of categories minus one: df = n − 1, where n
is the number of categories in this case).
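Since the slide's actual car-preference counts are not reproduced in these notes, here is the same goodness-of-fit calculation with hypothetical counts (four categories, equal expected frequencies):

```python
# Chi-Square goodness-of-fit, sketched with made-up counts (the actual
# car-preference counts appear on the slide, not in these notes).
observed = [80, 50, 40, 30]                     # hypothetical frequencies
n = sum(observed)                               # 200 subjects in total
expected = [n / len(observed)] * len(observed)  # equal split: 50 per category

# Chi-square = sum of (observed - expected)^2 / expected over all categories
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                          # categories minus one
print(chi_square, df)                           # 28.0 3

# Decision: compare to the critical value from a Chi-Square table
# (7.81 for df = 3 at the .05 level, as cited in these notes).
print("significant" if chi_square >= 7.81 else "not significant")
```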

18
The calculated value of the Chi-Square must be larger than or equal to the critical
value for significance. Since 34.32 is larger than 7.81, we conclude the result is
statistically significant. Based on the finding shown at the bottom of this slide, we
report: "Because the Chi-Square value (34.32) for the observed car preference
differences is greater than the critical value (7.81, df = 3), we can conclude that the
differences between observed and expected car preferences are statistically significant at
the 0.05 level."

A Chi-Square critical value table can be found on page 537 of the Witte book or at:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm

19
Do not worry if you cannot understand all the concepts right away. Ask questions
if you have any. There are no dumb questions at all, so do not hesitate to raise your
questions. Things should become clearer when you start doing more practice in the next
two weeks. But plan to spend 3-5 minutes on each slide on your own, trying to
understand each concept and procedure.

20
