Objectives
A Good Hypothesis
What are the criteria?
Sample Statistics
X̄ (e.g., the mean) of a sample
Samples must match the characteristics of the population. Similarity results in generalizability. The type of sample impacts research quality.
Systematic Sample
Random Sample
Sample of Convenience
Volunteerism
Pitfalls of Samples
Sample Bias
Over-represents subgroups; not representative of the population.
Sampling Error
The sample group does not give an accurate picture of the population. Reduce sampling error by enlarging the sample: the larger the sample, the less the error.
Standard Error
A measure of how much sampling error is likely to occur when a sample is extracted from a population; it is the standard deviation of the values of the sampling distribution.
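As a quick illustration, the standard error of the mean can be estimated from a single sample as s/√n; a minimal sketch (the scores below are hypothetical, chosen only for illustration):

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n).

    The larger the sample, the smaller the standard error,
    which is why enlarging the sample reduces sampling error.
    """
    s = statistics.stdev(sample)          # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(len(sample))

# Hypothetical scores, for illustration only
scores = [12, 15, 11, 14, 13, 16, 10, 15]
print(round(standard_error(scores), 3))   # → 0.75
```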
An educated guess
Reflects the research problem being investigated
Determines the techniques for testing the research questions
Should be grounded in theory
The null hypothesis as a starting point:
A state of affairs accepted as true in the absence of any other information. Until a systematic difference is shown, assume that any difference observed is due to chance. The researcher's job is to eliminate chance factors and evaluate other factors that may contribute to group differences.
Null Hypotheses
Usually a statement of no relationship between the variables (an equality).
Research/Alternative Hypotheses
A statement of a relationship between the variables: an inequality. May be nondirectional (two-tailed) or directional (one-tailed); the directional form is more powerful in research results, as it splits the p-value in half.
Nondirectional: there is a difference between groups, but the direction of the difference is not specified.
Nondirectional sentence: There is a difference between at-home and pre-school program children on pre-social and pre-literacy tests.
Nondirectional symbols: Ha: μ at-home ≠ μ day-care
1. State the null hypothesis.
2. State the alternative hypothesis.
3. Select a level of significance.
4. Collect and summarize the sample data.
5. Refer to a criterion for evaluating the sample evidence.
6. Make a decision to retain/reject the null.
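The six steps can be sketched end to end in code. This is a minimal illustration using a one-sample z-test (population σ assumed known), purely to show the decision procedure; the scores, μ0 = 100, and σ = 6 are hypothetical:

```python
import math
from statistics import NormalDist

# Steps 1-2: H0: mu = 100, Ha: mu != 100 (nondirectional, two-tailed)
mu0 = 100.0
# Step 3: select a level of significance
alpha = 0.05
# Step 4: collect and summarize the sample data (hypothetical scores)
sample = [104, 98, 110, 105, 99, 107, 102, 103]
n = len(sample)
xbar = sum(sample) / n
sigma = 6.0                                  # assumed known population SD, illustration only
z = (xbar - mu0) / (sigma / math.sqrt(n))
# Step 5: criterion -- two-tailed p-value from the standard normal
p = 2 * (1 - NormalDist().cdf(abs(z)))
# Step 6: decision
decision = "reject H0" if p < alpha else "retain H0"
print(round(z, 2), round(p, 4), decision)
```

Here p comes out above 0.05, so the null is retained even though the sample mean exceeds 100.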
There are no differences in the pre-, mid-, and post-test scores of students who are enrolled in Headstart, daycare, or homecare. Symbols: μ pre = μ mid = μ post
There are no differences in the literacy scores between students who are in Headstart, daycare, or homecare. Symbols: μ Headstart = μ daycare = μ homecare
The pattern of differences among the cell μ's of pre-, mid-, and post-test literacy scores in the first column (or row) accurately describes the pattern of differences among the cell μ's in at least one other column (row). Symbols: μjk − μj′k = μjk′ − μj′k′, for all rows and columns, for all combinations of both j and j′, k and k′.
There are differences within the pre-, mid-, and post-test scores of students who are in Headstart, daycare, or homecare. Symbols: μ pre ≠ μ mid ≠ μ post, for at least one pair.
There are differences in the scores between the students in Headstart, daycare, or homecare who took the pre, mid, or post literacy skills test. Symbols: μ Headstart ≠ μ daycare ≠ μ homecare, for at least one pair.
Written: The pattern of differences among the cell μ's of pre-, mid-, and post-literacy skills scores in the first column (or first row) fails to describe accurately the pattern of differences among the cell μ's in at least one other column (row). Symbols: μjk − μj′k ≠ μjk′ − μj′k′, for at least one combination of j and j′, k and k′.
Most researchers select a small number such as 0.001, 0.01, or 0.05; the most common choice is 0.05. Otherwise known as the alpha level: p = 0.05, α = 0.05. The significance level serves as a scientific cutoff point that determines what decision will be made concerning the null hypothesis.
Mistakes can occur: a Type I error designates the mistake of rejecting H0 when the null is actually true. When the level of significance is set at 0.05, the chance of a Type I error equals 1 out of 20.
Type II Errors
Designates the mistake made if H0 is not rejected when the null is actually false.
A summary of the sample data will always lead to a single numerical value, referred to as the calculated value (e.g., r, t, or F). The computer calculates the probability of obtaining that value in the form p = ____.
Reject the null if the p-value is less than the established level of significance: a statistically significant difference was obtained, p < .05.
Retain the null if the p-value is greater than the established level of significance: H0 was tenable; the null was retained; no significant difference was found; the result was not statistically significant.
Rejection Region
Lower-tail test: H0: μ ≥ 0, H1: μ < 0. Reject H0 if Z falls in the α region in the lower tail; the statistic must be significantly below μ = 0.
Upper-tail test: H0: μ ≤ 0, H1: μ > 0. Reject H0 if Z falls in the α region in the upper tail.
t-Test: σ Unknown
Assumptions
The population is normally distributed; if not normal, it is only slightly skewed and a large sample is taken (the central limit theorem applies).
Parametric t-test procedure
t = (X̄ − μ) / (S / √n)
Degrees of Freedom
Degrees of freedom = the number in the sample minus the number of parameters that must be estimated before the test statistic can be computed. For a single-sample t-test, we must first estimate the mean before we can estimate the standard deviation. Once the mean is estimated, only n − 1 of the values are free to vary, since we know that the nth value is equal to
x_n = n·x̄ − Σ (i = 1 to n − 1) x_i
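A minimal sketch of the single-sample t statistic and its n − 1 degrees of freedom (the data and the hypothesized mean below are hypothetical):

```python
import math
import statistics

def one_sample_t(sample, mu):
    """t = (xbar - mu) / (s / sqrt(n)), with df = n - 1."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)      # estimating s "uses up" one df (the mean)
    t = (xbar - mu) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical scores, testing H0: mu = 100
t, df = one_sample_t([104, 98, 110, 105, 99, 107, 102, 103], 100)
print(round(t, 2), df)
```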
The null hypothesis for the proportion also implies we know the variance, since the variance is just P(1 − P). This is a good approximation when the sample size is large. If the sample size is small, we could use the binomial distribution to compute the exact p-value that a sample of size n would yield a sample proportion p̂, given the population proportion P. Using the normal approximation is much easier.
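A sketch comparing the exact binomial p-value with the normal approximation; the counts (14 successes in 20 trials, H0: P = 0.5) are hypothetical and used only to show the two calculations:

```python
import math
from statistics import NormalDist

def exact_binomial_p(n, k, P):
    """Exact one-tailed p-value: probability of k or more successes
    in n trials when the true proportion is P (binomial distribution)."""
    return sum(math.comb(n, i) * P**i * (1 - P)**(n - i) for i in range(k, n + 1))

def normal_approx_p(n, k, P):
    """Normal approximation: the variance of the sample proportion is P(1-P)/n."""
    p_hat = k / n
    se = math.sqrt(P * (1 - P) / n)
    z = (p_hat - P) / se
    return 1 - NormalDist().cdf(z)

# Hypothetical numbers for illustration: 14 successes out of 20, H0: P = 0.5
print(round(exact_binomial_p(20, 14, 0.5), 4))
print(round(normal_approx_p(20, 14, 0.5), 4))
```

With a sample this small the two answers differ noticeably, which is the point of the warning above.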
You want to know whether these 2 sets of measurements genuinely differ. The issue here is that you need to rule out the possibility of the results being random noise.
Involves the creation of two competing explanations for the data recorded.
Idea 1: these are pattern-less random data; any observed patterns are due to chance. This is the null hypothesis, H0. Idea 2: there is a defined pattern in the data. This is the alternative hypothesis, H1.
Without the statement of the competing hypotheses, no meaningful test can be run.
Example
You conduct a questionnaire survey of homes in the Heathrow flight path, and also of a control population of homes in south-west London. Responses to the question "How intrusive is plane noise in your daily life?" are tabulated:
Noise complaints (1 = no complaint, 5 = very unhappy):
Homes near airport: 5 4 4 3 5 4 5
Control site: 3 2 4 1 2 1
Now we assess how likely it is that this pattern could occur by chance:
This is done by performing a calculation. Don't worry yet about what the calculation entails. What matters is that the calculation gives an answer (a test statistic) whose likelihood can be looked up in tables. Thus, by means of this tool, the test statistic, we can work out an estimate of the probability that the pattern arose by chance.
The test statistic generates a probability, a number from 0 to 1, which is the probability of H0 being true. If p = 0, H0 is certainly false. (Actually this is over-simple, but a good approximation.) If p is large, say p = 0.8, H0 must be accepted as true. But how about p = 0.1, or p = 0.01?
Significance
We have to define a threshold, a boundary, and say that if p is below this threshold H0 is rejected and H1 accepted; otherwise H0 is retained. This boundary is called the significance level. By convention it is set at p = 0.05 (1:20), but you can choose any other number, as long as you specify it in the write-up of your analyses. WARNING!! This means that if you analyse 100 sets of random data, the expectation (long-term average) is that 5 will generate a significant test.
The procedure:
1. Set up H0 and H1. Decide the significance level, p = 0.05.
2. Collect the data:
Homes near airport: 5 4 4 3 5 4 5
Control site: 3 2 4 1 2 1
3. Calculate the test statistic and its probability; if p is above the threshold, accept H0, otherwise accept H1.
The Mann-Whitney U test is a non-parametric test which examines whether 2 columns of data could have come from the same population (i.e. should be the same). It generates a test statistic called U (no idea why it's U). By hand we look U up in tables; PCs give you an exact probability. It requires 2 sets of data; these need not be paired, nor normally distributed, nor need there be equal numbers in each set.
How to do it
Step 1: rank all data in ascending order, then re-code the data set, replacing the raw data with ranks.
Data (raw):
Homes near airport: 5 4 4 3 5 4 5
Control site: 3 2 4 1 2 1

Initial ranks (positions in ascending order):
Homes near airport: #13 #10 #9 #6 #12 #8 #11
Control site: #5 #4 #7 #2 #3 #1

Tied ranks (tied values share the average of their positions):
Homes near airport: 12 8.5 8.5 5.5 12 8.5 12
Control site: 5.5 3.5 8.5 1.5 3.5 1.5
Step 2: sum the ranks for each column; call these rx and ry. (Optional but a good check: rx + ry = n²/2 + n/2, or you have an error.)
Step 3: calculate
Ux = NxNy + Nx(Nx + 1)/2 − rx
Uy = NxNy + Ny(Ny + 1)/2 − ry
Step 4: take the SMALLER of these 2 values and look it up in tables. If U is LESS than the critical value, reject H0. NB: this test is unique in one feature: here, low values of the test statistic indicate significance.
In this case:
Tied ranks:
Homes near airport: 12 8.5 8.5 5.5 12 8.5 12 → rx = 67
Control site: 5.5 3.5 8.5 1.5 3.5 1.5 → ry = 24
Ux = 6*7 + 7*8/2 − 67 = 3
Uy = 6*7 + 6*7/2 − 24 = 39
The lowest U value is 3. The critical value of U(7,6) = 4 at p = 0.01. The calculated U is less than the tabulated U, so reject H0: at p = 0.01 these two sets of data differ.
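The ranking and U calculation for the airport data can be reproduced in code; a minimal stdlib sketch (not a library implementation):

```python
def tied_ranks(values):
    """Rank all values in ascending order; tied values share the average rank."""
    sorted_vals = sorted(values)
    avg_rank = {}
    for v in set(values):
        positions = [i + 1 for i, s in enumerate(sorted_vals) if s == v]
        avg_rank[v] = sum(positions) / len(positions)
    return [avg_rank[v] for v in values]

def mann_whitney_u(x, y):
    """U statistics for two independent samples; the smaller one is looked up in tables."""
    ranks = tied_ranks(x + y)        # rank the pooled data
    rx = sum(ranks[:len(x)])
    ry = sum(ranks[len(x):])
    nx, ny = len(x), len(y)
    ux = nx * ny + nx * (nx + 1) / 2 - rx
    uy = nx * ny + ny * (ny + 1) / 2 - ry
    return ux, uy

airport = [5, 4, 4, 3, 5, 4, 5]   # homes near the airport
control = [3, 2, 4, 1, 2, 1]      # control site
print(mann_whitney_u(airport, control))   # → (3.0, 39.0)
```

The smaller value, 3, is the one compared with the tabulated critical value.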
[Figure: two distributions, Distribution A vs Distribution B]
Wilcoxon
First, combine the two samples and rank-order all the observations: the smallest number has rank 1, the largest number has rank N (= the sum of n1 and n2). Then separate the samples and add up the ranks for the smaller sample. (If n1 = n2, choose either one.) Test statistic: the rank sum T for the smaller Wilcoxon sample.
Rejection region (for the sample taken from Population A being smaller than the sample from Population B): reject H0 if TA ≥ TU or TA ≤ TL.
Wilcoxon Test
Z = (TA − n1(n1 + n2 + 1)/2) / √[n1 n2 (n1 + n2 + 1) / 12]
Rejection region (one-tailed): Z > Zα, using the normal approximation (appropriate when both samples have at least 10 observations).
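The large-sample normal approximation for the rank-sum statistic can be sketched in code; the rank sum TA = 79 and the sample sizes n1 = n2 = 10 are hypothetical:

```python
import math
from statistics import NormalDist

def rank_sum_z(ta, n1, n2):
    """Normal approximation to the Wilcoxon rank-sum statistic:
    Z = (TA - mean of T under H0) / (standard deviation of T under H0)."""
    mean_t = n1 * (n1 + n2 + 1) / 2
    sd_t = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (ta - mean_t) / sd_t

# Hypothetical rank sum for illustration: TA = 79 from samples of n1 = 10, n2 = 10
z = rank_sum_z(79, 10, 10)
p = 1 - NormalDist().cdf(abs(z))   # one-tailed
print(round(z, 2))
```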
Example 1
These are small samples, and they are independent (random samples of Cajun and Creole dishes). Therefore, we have to begin with the test of equality of variances.
Test statistic: F = S1²/S2²
Rejection region: F > Fα/2 = F(6,6,.025) = 5.82 or F < (1/5.82) = .172
F = 1055833.33 / 148432.14 = 7.11
Reject equal variances, so do the Wilcoxon rank-sum test.
Rank the 14 combined observations. Rank sums: Cajun = 70, Creole = 35.
Check: 70 + 35 = 105 = (14)(15)/2
Rejection region: reject H0 if T ≥ TU = 66 or T ≤ TL = 39.
T = 70 > TU = 66; therefore, reject H0: the two populations differ.
Example 2
Test statistic: F = S1²/S2²
Rejection region: F > Fα/2 = F(7,8,.025) = 4.53 or F < (1/4.90) = .204
F = 4.316 / .46 = 9.38
Reject equal variances, so do the Wilcoxon rank-sum test.
Rank sums: 16 + 1 + 5 + 15 + 2 + 8 + 14 + 17 = 78 for the smaller sample (n1 = 8); 3 + 10 + 12 + 4 + 6.5 + 11 + 6.5 + 13 + 9 = 75 for the other sample (n2 = 9).
T = 78 < TU = 90; therefore, do not reject H0: no evidence that the mean distance in females is less than that in males.
Example 3
Test statistic: F = S1²/S2²
Rejection region: F > Fα/2 = F(5,5,.025) = 7.15 or F < (1/7.15) = .140
F = (7.563)² / (2.04)² = 57.20 / 4.16 = 13.74
Reject equal variances, so do the Wilcoxon rank-sum test.
Check: TH + TM = 78 = (12)(13)/2
Decision: reject H0.
The Kruskal-Wallis (KW) Test for Comparing Populations with Unknown Distributions
A nonparametric test for comparing population medians, developed by Kruskal and Wallis. The KW procedure tests the null hypothesis that k samples from possibly different populations actually originate from similar populations, at least as far as their central tendencies (medians) are concerned. The test assumes that the variables under consideration have underlying continuous distributions. In what follows, assume we have k samples, and that the sample size of the i-th sample is ni, i = 1, 2, ..., k.
All observations from the k samples are first combined and ranked together; the next step is to compute the sum of the ranks for each of the original samples. The KW test determines whether these sums of ranks are so different by sample that they are not likely to have all come from the same population.
It can be shown that if the k samples come from the same population, that is, if the null hypothesis is true, then the test statistic H used in the KW procedure is distributed approximately as a chi-square statistic with df = k − 1, provided that the sample sizes of the k samples are not too small (say, ni > 4 for all i). H is defined as follows:
H = [12 / (N(N + 1))] Σ (Ri² / ni) − 3(N + 1)
where k = number of samples (groups), ni = number of observations for the i-th sample or group, N = total number of observations (the sum of all the ni), and Ri = sum of ranks for group i.
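A minimal sketch of computing H by hand from pooled ranks; the three groups of scores below are hypothetical:

```python
def tied_ranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    sorted_vals = sorted(values)
    avg = {}
    for v in set(values):
        positions = [i + 1 for i, s in enumerate(sorted_vals) if s == v]
        avg[v] = sum(positions) / len(positions)
    return [avg[v] for v in values]

def kruskal_wallis_h(*groups):
    """H = 12/(N(N+1)) * sum(Ri^2 / ni) - 3(N+1), ranks taken on the pooled data."""
    pooled = [v for g in groups for v in g]
    ranks = tied_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        ri = sum(ranks[start:start + len(g)])   # rank sum for this group
        total += ri ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical scores for three groups, illustration only
a = [27, 2, 4, 18, 7, 9]
b = [20, 8, 14, 36, 21, 22]
c = [34, 31, 3, 23, 30, 6]
print(round(kruskal_wallis_h(a, b, c), 2))   # → 2.85
```

The result is compared with the chi-square critical value at df = k − 1 = 2.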
Non-parametric Tests
Parametric vs Non-parametric
Chi-Square: 1-way and 2-way
Parametric Tests
Data approximately normally distributed. Dependent variables at the interval level. Random sampling. Examples: t-tests, ANOVA.
Non-parametric Tests
Do not require these assumptions, but are less powerful: the probability of correctly rejecting the null hypothesis is lower. So use parametric tests if the data meet those requirements.
H0: the observed frequencies are not different from the expected frequencies. Research hypothesis: they are different.
χ² = Σ (fo − fe)² / fe
If our calculated value of chi-square is less than the table value, retain H0. If our calculated chi-square is greater than the table value, reject H0. As with t-tests and ANOVA, all work on the same principle for acceptance and rejection of the null hypothesis.
Recall cross-tabulations (= contingency tables) from Chapter 2. Are the differences in responses of two groups statistically significant? One-way: observed vs expected. Two-way: one set of observed frequencies vs another set.
Chi-square tests differences between frequencies (rather than scores, as in t or F tests). So the null hypothesis is that the two or more populations do not differ with respect to frequency of occurrence.
Null hypothesis: the relative frequency [or percentage] of liberals who are permissive is the same as the relative frequency of conservatives who are permissive. The categories (independent variable) are liberals and conservatives. The dependent variable being measured is permissiveness.
If we had 20 respondents in each column and each row, our expected values in this cross-tabulation would be 10 cases per cell. Note that both rows and columns are nominal data, which could not be handled by a t-test or ANOVA. Here the numbers are frequencies, not an interval variable.
But most examples do not have equal row and column totals, so it is harder to figure out the expected frequencies.
What frequencies would we see if there were no difference between groups (if the null hypothesis were true)? If 25 out of 40 respondents (62.5%) were permissive, and there were no difference between liberals and conservatives, 62.5% of each group would be permissive.
We get the expected frequencies for each cell by multiplying the row marginal total by the column marginal total and dividing the result by N. We'll put the expected values in parentheses.
The chi-square statistic from these data is (15 − 12.5)² / 12.5 plus the same quantity for all the other cells = .5 + .5 + .83 + .83 = 2.66.
df = (r − 1)(c − 1), where r = rows, c = columns, so df = (2 − 1)(2 − 1) = 1. From Table C, α = .05: chi-square = 3.84.
Compare: the calculated 2.66 is less than the table value 3.84, so we retain the null hypothesis.
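The worked chi-square can be reproduced in code. The cell counts below (15, 5, 10, 10) are inferred from the expected values 12.5 and 7.5 in the example, so treat them as an assumption; the exact sum is 2.67, and the 2.66 above comes from rounding the .833 terms to .83:

```python
def chi_square(observed):
    """Chi-square for a contingency table:
    expected cell = (row total * column total) / N, then sum (fo - fe)^2 / fe."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n   # expected frequency for this cell
            chi2 += (fo - fe) ** 2 / fe
    return chi2

# Cell counts inferred from the worked example (an assumption):
# rows = liberals, conservatives; columns = permissive, not permissive
table = [[15, 5],
         [10, 10]]
print(round(chi_square(table), 2))   # → 2.67, below the critical value 3.84 at df = 1
```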