Macatangay
BSN2-2
Outliers
An outlier is a data value which is too extreme to belong in the distribution of interest. Let’s
suppose in our example that the assembly machine ran out of a particular component,
resulting in a laptop that was assembled at a much lower weight. This is a condition that is
outside of our question of interest, and therefore we can remove that observation prior to
conducting the analysis. However, just because a value is extreme does not make it an
outlier. Let’s suppose that our laptop assembly machine occasionally produces laptops which
weigh significantly more or less than five pounds, our target value. In this case, these extreme
values are absolutely essential to the question we are asking and should not be removed.
Box-plots are useful for visualizing the variability in a sample, as well as locating any outliers.
The boxplot on the left shows a sample with no outliers. The boxplot on the right shows a
sample with one outlier.
Figure 2. Boxplots of a variable without outliers (left) and with an outlier (right).
Procedure
The procedure for a one sample t-test can be summed up in four steps. The symbols to be
used are defined below:
\(Y\) = the random sample
\(y_i\) = the \(i\)th observation in \(Y\)
\(n\) = the sample size
\(m_0\) = the hypothesized value
\(\bar{y}\) = the sample mean
\(\hat{\sigma}\) = the sample standard deviation
\(T\) = the critical value of a t-distribution with \((n-1)\) degrees of freedom
\(t\) = the t-statistic (t-test statistic) for a one sample t-test
\(p\) = the p-value (probability value) for the t-statistic
The four steps are listed below:
1. Calculate the sample mean.
\[\bar{y} = \dfrac{y_1 + y_2 + \cdots + y_n}{n}\]
2. Calculate the sample standard deviation.
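The first two steps, and the t-statistic they feed into, can be sketched in Python. This is only an illustration under stated assumptions: the five laptop weights are hypothetical values invented for the example, and the t-statistic is computed with the standard one-sample form, the sample mean minus the hypothesized value, divided by the sample standard deviation over the square root of n.

```python
import math
import statistics

# Hypothetical laptop weights in pounds (illustrative values, not from the source).
weights = [5.1, 4.9, 5.3, 5.2, 5.0]
m0 = 5.0  # hypothesized value (the five-pound target mentioned in the text)

n = len(weights)
y_bar = statistics.mean(weights)       # Step 1: sample mean
sigma_hat = statistics.stdev(weights)  # Step 2: sample standard deviation (n - 1 denominator)

# Standard one-sample t-statistic built from the two quantities above.
t_stat = (y_bar - m0) / (sigma_hat / math.sqrt(n))

print(f"mean = {y_bar:.3f}, sd = {sigma_hat:.4f}, t = {t_stat:.4f}")
```

The result would then be compared with the critical value T for n - 1 = 4 degrees of freedom.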
The Independent Samples t Test compares the means of two independent groups in order to
determine whether there is statistical evidence that the associated population means are
significantly different. The Independent Samples t Test is a parametric test.
This test is also known as:
Independent t Test
Independent Measures t Test
Independent Two-sample t Test
Student t Test
Two-Sample t Test
Uncorrelated Scores t Test
Unpaired t Test
Unrelated t Test
The variables used in this test are known as:
Dependent variable, or test variable
Independent variable, or grouping variable
A coffee chain has two locations, one in Queens and one in NYC. The coffee chain owner
wants to make sure that all lattes are consistent between the two locations.
A sample of 20 lattes is collected from the Queens store and is determined to have a sample
mean of 4.1 oz. of espresso per latte with a sample standard deviation of .12 oz.
A sample of 22 lattes is then taken from the NYC store and is determined to have a sample
mean of 4.2 oz. of espresso per latte and a standard deviation of .11 oz.
Use alpha = .05 and run an independent samples t-test to determine if there is a significant
difference between the amount of espresso between the two coffee shop lattes.
Step 1: What is Ho and Ha?
Ho: mean espresso in a latte in Queens = mean espresso in a latte in NYC
Ha: mean espresso in a latte in Queens ≠ mean espresso in a latte in NYC
Should you use t-test or z-test and why?
This is best as a t-test because the population standard deviations are unknown and must be estimated from two small samples.
Notice that this is a TWO-TAILED test. We will determine if there is a significant difference.
Our sample size of the Queens sample is n1 = 20 and the std dev s1 is .12
Our sample size of the NYC sample is n2 = 22 and the std dev s2 is .11
Step 2: Calculating the t-test statistic for an independent samples t-test
NOTE: There are three types of t-tests. There is the one sample t-test that compares a single
sample to a known population value. There is an independent samples t-test (this example)
that compares two samples to each other. There is a paired data (also called correlated data)
t-test that compares two samples from data that is related (like pretest score and post test
score).
t -test = (sample mean 1 – sample mean 2)/[ sqrt ( s1^2/n1 + s2^2/n2) ]
Recall:
sample mean1 = 4.1, s1 = .12, n1 = 20
sample mean2 = 4.2 , s2 = .11, n2 = 22
NOTE: s1^2 = (.12) ^2 = .12 * .12 = .0144
NOTE: s2^2 = (.11) ^2 = .11 * .11 = .0121
T -test = (sample mean 1 – sample mean 2)/[ sqrt ( s1^2/n1 + s2^2/n2) ]
T-test = (4.1 – 4.2) / [sqrt ((.12)^2 / 20 + (.11)^2/22) ] =
T-test = (4.1 – 4.2) / [sqrt (.0144 / 20 + .0121/22) ] =
T-test = (4.1 – 4.2) / [sqrt (.00072 + .00055)] =
T-test = (4.1 – 4.2) / [sqrt (.00127)] =
T-test = (4.1 – 4.2) / [.03564]
T-test = (– 0.1) / [.03564]
T-test = – 2.8058
NOTE: You must use the order of operations when solving any expression.
Step 3: Determine if this value is in a rejection region (reject Ho) or not (do not reject Ho)
Next, using any t-table (these tables are always on the internet) we can get the critical values
(tc) for the two tailed test.
Our degrees of freedom for this independent samples t-test are:
df = n1 + n2 – 2 = 20 + 22 – 2 = 40
Our alpha value is .05
Our test is two-tailed
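Therefore, using any t-table, the two “critical values” that represent the cut-off points for
rejection are:
tc = +/- 2.042
NOTE: 2.042 is the two-tailed .05 value in the df = 30 row of the table, which is a conservative choice here. The exact value for df = 40 is +/- 2.021, and the decision is the same with either cut-off.
This tells us that if our t-test result (which in this case is -2.8058) is either bigger than 2.042 or
less than -2.042 then we CAN reject the null because we ARE in the rejection region.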
Result:
-2.8058 < – 2.042
Therefore, we must Reject Ho
Step 4: Understanding and writing the conclusion – what does this all mean
Recall that Ho says that there is no sig diff between the lattes in Queens and the lattes in
NYC.
However, in this case, we REJECT Ho. Our t-test result WAS in the rejection region because
the value was smaller than the cut-off of -2.042. In other words, we do not agree with Ho. The
data are inconsistent with Ho at the .05 significance level.
Because we reject Ho (do not choose Ho) we then choose Ha.
Ha tells us that there IS A SIG DIFF between the amount of espresso in the Queens lattes
and the amount in the NYC lattes.
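The arithmetic in Steps 2 and 3 can be checked with a short Python sketch. The function name `t_from_summary` is a hypothetical helper introduced for this illustration; the inputs and the 2.042 cut-off are the values quoted in the example.

```python
import math

def t_from_summary(mean1, s1, n1, mean2, s2, n2):
    """t statistic from summary statistics, using sqrt(s1^2/n1 + s2^2/n2) as the standard error."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error

# Queens: mean 4.1, s 0.12, n 20; NYC: mean 4.2, s 0.11, n 22 (values from the example).
t_stat = t_from_summary(4.1, 0.12, 20, 4.2, 0.11, 22)
df = 20 + 22 - 2
t_crit = 2.042  # two-tailed cut-off quoted in the text for alpha = .05

reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.4f}, df = {df}, reject Ho: {reject_h0}")
```

Carrying full precision gives t of about -2.806, which agrees with the hand calculation up to rounding of the intermediate steps.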
2. Who do you think has a better sense of humor-women or men?
Researchers asked 10 men and 10 women in their study to categorize 30 cartoons as either
“funny” or “not funny”. Each participant received a score that represents her or his percentage of
cartoons found to be “funny”. Below are fictional data for 9 people; these fictional data have
approximately the same means as were reported in the original study (Azim, Mobbs, Jo,
Menon and Reiss, 2005).
Percentage of cartoons labelled as “funny”.
Women: 84,97,58,90
Men: 88, 90,52,97,86
How can we conduct an independent-samples t test for this scenario, using a two-tailed test and
a significance level of 0.05?
Step 1: Assumptions
Population 1: Women exposed to humorous cartoons.
Population 2: Men exposed to humorous cartoons.
The comparison distribution will be a distribution of mean differences based on the null
hypothesis.
The hypothesis test will be an independent-samples t test because we have two samples
composed of different groups of participants.
Step 2: State the null and alternative hypotheses.
Remember, hypotheses are always about populations, not about our specific samples.
Null hypothesis: On average, women categorize the same percentage of cartoons as “funny”
as men. \[H_0: \mu_1 = \mu_2\]
Alternative hypothesis: On average, women categorize a different percentage of cartoons as
“funny” as compared with men.
\[H_a: \mu_1 \neq \mu_2\]
Step 3: Determine the characteristics of the comparison distribution
We calculate the mean and the variance of the sample mean for women and men separately.
Remember that the variance of a sample mean is the sample variance divided by the sample size:
\[s^2_{\bar{X}} = \dfrac{s^2}{n}\].
For women:
Mean is
## [1] 82.25
And variance of the mean:
## [1] 72.4
For men:
Mean is
## [1] 82.6
And variance of the mean:
## [1] 61.96
Step 4: Determine the significance level
The exercise already did that for us.
\[\alpha = 0.05\]
Step 5: Calculate Test Statistic
\[t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s^2_{\bar{X}_1} + s^2_{\bar{X}_2}}} = \dfrac{(82.25 - 82.6) - 0}{\sqrt{72.39583 + 61.96}}\].
Hence, t statistic is
## [1] -0.0302
Step 6.1: Conclude (Statistical way)
Because this is a two-sided test (alternative hypothesis: not equal), p-value is going to be:
\[P (|T| > |-0.03019533|) = 2P (T < -0.03019533)\].
Moreover, the degrees of freedom are
\[df = (N_1 -1) + (N_2 - 1) = 4 - 1 + 5 -1 = 7\].
Therefore, p-value is:
## [1] 0.9768
So, p-value > 0.05. Therefore, we do not reject the null hypothesis.
Step 6.2: Conclude (English)
There is not enough (or significant) evidence to conclude that either men or women are
more likely than the opposite gender, on average, to find cartoons funny.
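The means, the variances of the sample means, and the t statistic above can be reproduced with Python's standard library. This is a minimal sketch; the helper name `mean_and_variance_of_mean` is hypothetical, and the p-value is left out because it needs a t-distribution routine that the standard library does not provide.

```python
import math
import statistics

women = [84, 97, 58, 90]
men = [88, 90, 52, 97, 86]

def mean_and_variance_of_mean(scores):
    """Return the sample mean and s^2 / n, the variance of the sample mean."""
    return statistics.mean(scores), statistics.variance(scores) / len(scores)

mean_w, var_mean_w = mean_and_variance_of_mean(women)
mean_m, var_mean_m = mean_and_variance_of_mean(men)

# Under the null hypothesis, mu_1 - mu_2 = 0, so it drops out of the numerator.
t_stat = (mean_w - mean_m) / math.sqrt(var_mean_w + var_mean_m)
df = (len(women) - 1) + (len(men) - 1)

print(f"women: mean = {mean_w}, variance of mean = {var_mean_w:.2f}")
print(f"men:   mean = {mean_m}, variance of mean = {var_mean_m:.2f}")
print(f"t = {t_stat:.4f}, df = {df}")
```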
3. Calculate an independent samples t test for the following data sets:
Data set A: 1,2,2,3,3,4,4,5,5,6
Data set B: 1,2,4,5,5,5,6,6,7,9
Step 1: Sum the two groups:
A: 1 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5 + 6 = 35
B: 1 + 2 + 4 + 5 + 5 + 5 + 6 + 6 + 7 + 9 = 50
Step 2: Square the sums from Step 1:
35^2 = 1225
50^2 = 2500
set these numbers aside for a moment.
Step 3: Calculate the means for the two groups:
A: (1 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5 + 6)/10 = 35/10 = 3.5
B: (1 + 2 + 4 + 5 + 5 + 5 + 6 + 6 + 7 + 9)/10 = 50/10 = 5
Set these numbers aside for a moment.
Step 4: Square the individual scores and then add them up:
A: 1^2 + 2^2 + 2^2 + 3^2 + 3^2 + 4^2 + 4^2 + 5^2 + 5^2 + 6^2 = 145
B: 1^2 + 2^2 + 4^2 + 5^2 + 5^2 + 5^2 + 6^2 + 6^2 + 7^2 + 9^2 = 298
Set these numbers aside for a moment.
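Step 5: Insert your numbers into the following formula and solve:
\[t = \dfrac{\mu_A - \mu_B}{\sqrt{\dfrac{\left(\Sigma A^2 - \dfrac{(\Sigma A)^2}{n_A}\right) + \left(\Sigma B^2 - \dfrac{(\Sigma B)^2}{n_B}\right)}{n_A + n_B - 2} \cdot \left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)}}\]
(ΣA)^2: Sum of data set A, squared (Step 2).
(ΣB)^2: Sum of data set B, squared (Step 2).
μA: Mean of data set A (Step 3)
μB: Mean of data set B (Step 3)
ΣA^2: Sum of the squares of data set A (Step 4)
ΣB^2: Sum of the squares of data set B (Step 4)
nA: Number of items in data set A
nB: Number of items in data set B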
Step 6: Find the Degrees of freedom (nA-1 + nB-1) = 18
Step 7: Look up your degrees of freedom (Step 6) in the t-table. If you don’t know what
your alpha level is, use 5% (0.05).
With 18 degrees of freedom at an alpha level of 0.05 (two-tailed), the critical value is 2.10.
Step 8: Compare your calculated value (Step 5) to your table value (Step 7). The calculated
value is t = (3.5 - 5)/0.885 = -1.69, and its absolute value, 1.69, is less than the cut-off of
2.10 from the table. Therefore p > .05. As the p-value is greater than the alpha level, we
cannot conclude that there is a difference between means.
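Steps 1 through 6 can be verified with a short Python sketch that uses the pooled-variance form of the independent samples t statistic, built from the same sums and sums of squares as the hand calculation. The helper name `sum_of_squared_deviations` is hypothetical.

```python
import math

a = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
b = [1, 2, 4, 5, 5, 5, 6, 6, 7, 9]

def sum_of_squared_deviations(data):
    """Sum of the squares (Step 4) minus the squared sum over n (Step 2)."""
    return sum(x * x for x in data) - sum(data) ** 2 / len(data)

n_a, n_b = len(a), len(b)
mean_a, mean_b = sum(a) / n_a, sum(b) / n_b  # Step 3

df = (n_a - 1) + (n_b - 1)                   # Step 6
pooled_variance = (sum_of_squared_deviations(a) + sum_of_squared_deviations(b)) / df
t_stat = (mean_a - mean_b) / math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))

print(f"means = {mean_a}, {mean_b}; df = {df}; t = {t_stat:.2f}")
```

The absolute value of t is below the table value of 2.10, so the null hypothesis is not rejected.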
4. A study of the effect of caffeine on muscle metabolism used eighteen male volunteers who
each underwent arm exercise tests. Nine of the men were randomly selected to take a
capsule containing pure caffeine one hour before the test. The other men received a placebo
capsule. During each exercise the subject's respiratory exchange ratio (RER) was measured.
(RER is the ratio of CO2 produced to O2 consumed and is an indicator of whether energy is
being obtained from carbohydrates or fats).
The question of interest to the experimenter was whether, on average, caffeine changes RER.
The two populations being compared are “men who have not taken caffeine” and “men who
have taken caffeine”. If caffeine has no effect on RER the two sets of data can be regarded as
having come from the same population.
The results were as follows:
RER(%)
Placebo Caffeine
105 96
119 99
100 94
97 89
96 96
101 93
94 88
95 105
98 88
SD 7.70 5.61
The means show that, on average, caffeine appears to have altered RER from about 100.6%
to 94.2%, a change of 6.4%. However, there is a great deal of variation between the data
values in both samples and considerable overlap between them. So is the difference between
the two means simply due to sampling variation, or does the data provide evidence that caffeine
does, on average, reduce RER? The p-value obtained from an independent samples t-test
answers this question.
The t-test tests the null hypothesis that the mean of the caffeine treatment equals the mean of
the placebo versus the alternative hypothesis that the mean of caffeine treatment is not equal
to the mean of the placebo treatment.
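As a rough check on the summary figures, the means, standard deviations and t statistic for the RER data can be computed in Python. This sketch is an illustration, not the Minitab output: it uses the unpooled standard error, which gives the same t as the pooled form here because the two groups have equal size, and it stops short of the p-value.

```python
import math
import statistics

placebo = [105, 119, 100, 97, 96, 101, 94, 95, 98]
caffeine = [96, 99, 94, 89, 96, 93, 88, 105, 88]

mean_p, mean_c = statistics.mean(placebo), statistics.mean(caffeine)
sd_p, sd_c = statistics.stdev(placebo), statistics.stdev(caffeine)

# Standard error of the difference between the two sample means.
se = math.sqrt(sd_p**2 / len(placebo) + sd_c**2 / len(caffeine))
t_stat = (mean_c - mean_p) / se  # caffeine minus placebo

print(f"placebo:  mean = {mean_p:.1f}, sd = {sd_p:.2f}")
print(f"caffeine: mean = {mean_c:.1f}, sd = {sd_c:.2f}")
print(f"t = {t_stat:.2f}")
```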
Computer output obtained for the RER data gives the sample means and the 95% confidence
interval for the difference between the means.
Computer output
The Independent Samples t-test in Minitab
Enter the data from both samples into one column and the group identity in a second column,
then select
Stat > Basic Statistics > 2-Sample t... to perform an independent sample t-test in Minitab
Two Sample T-Test and Confidence Interval
Two sample T for Caffeine vs Placebo
N Mean StDev SE Mean
5. Data measured on n=3,539 participants who attended the seventh examination of the
Offspring in the Framingham Heart Study are shown below.
Men Women
Characteristic n S n s
Suppose we now wish to assess whether there is a statistically significant difference in mean
systolic blood pressures between men and women using a 5% level of significance.
H0: μ1 = μ2
H1: μ1 ≠ μ2
α = 0.05
Because both samples are large (> 30), we can use the Z test statistic as opposed to t. Note
that statistical computing packages use t throughout. Before implementing the formula, we
first check whether the assumption of equality of population variances is reasonable. The
guideline suggests investigating the ratio of the sample variances, s1^2/s2^2. Suppose we call
the men group 1 and the women group 2. Again, this is arbitrary; it only needs to be noted
when interpreting the results. The ratio of the sample variances is 17.5^2/20.1^2 = 0.76, which
falls between 0.5 and 2 suggesting that the assumption of equality of population variances is
reasonable. The appropriate test statistic is
\[Z = \dfrac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}\]
This is a two-tailed test, using a Z statistic and a 5% level of significance. Reject H0 if
Z < -1.960 or if Z > 1.960.
We now substitute the sample data into the formula for the test statistic identified in Step 2.
Before substituting, we will first compute Sp, the pooled estimate of the common standard
deviation:
\[S_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\]
Notice that the pooled estimate of the common standard deviation, Sp, falls in between the
standard deviations in the comparison groups (i.e., 17.5 and 20.1). Sp is slightly closer in
value to the standard deviation in the women (20.1) as there were slightly more women in the
sample. Recall, Sp is a weighted average of the standard deviations in the comparison groups,
weighted by the respective sample sizes.
Step 5. Conclusion.
We reject H0 because 2.66 > 1.960. We have statistically significant evidence at α=0.05 to
show that there is a difference in mean systolic blood pressures between men and
women. The p-value is p < 0.010.
Here again we find that there is a statistically significant difference in mean systolic blood
pressures between men and women at p < 0.010. Notice that there is a very small difference
in the sample means (128.2-126.5 = 1.7 units), but this difference is beyond what would be
expected by chance. Is this a clinically meaningful difference? The large sample size in this
example is driving the statistical significance. A 95% confidence interval for the difference in
mean systolic blood pressures is: 1.7 ± 1.26 or (0.44, 2.96). The confidence interval provides
an assessment of the magnitude of the difference between means whereas the test of
hypothesis and p-value provide an assessment of the statistical significance of the difference.
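The variance-ratio check and the confidence interval quoted above can be reproduced in a few lines of Python. This is a sketch under stated assumptions: `pooled_sd` is a hypothetical helper that implements the usual pooled standard deviation, and it is defined but not applied to the blood pressure data because the group sample sizes are not given in this text.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of a common standard deviation, weighted by degrees of freedom."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Standard deviations of systolic blood pressure quoted in the text.
s_men, s_women = 17.5, 20.1
variance_ratio = s_men**2 / s_women**2
equal_variances_ok = 0.5 <= variance_ratio <= 2  # guideline from the text

# Difference in means and margin of error quoted in the text.
diff, margin = 1.7, 1.26
ci = (round(diff - margin, 2), round(diff + margin, 2))

print(f"variance ratio = {variance_ratio:.2f}, assumption reasonable: {equal_variances_ok}")
print(f"95% CI for the difference: {ci}")
```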
Above we performed a study to evaluate a new drug designed to lower total cholesterol. The
study involved one sample of patients, each patient took the new drug for 6 weeks and had
their cholesterol measured. As a means of evaluating the efficacy of the new drug, the mean
total cholesterol following 6 weeks of treatment was compared to the NCHS-reported mean
total cholesterol level in 2002 for all adults of 203. At the end of the example, we discussed
the appropriateness of the fixed comparator as well as an alternative study design to evaluate
the effect of the new drug involving two treatment groups, where one group receives the new
drug and the other does not. Here, we revisit the example with a concurrent or parallel control
group, which is very typical in randomized controlled trials or clinical trials.
The correlated samples t-test, also called the direct difference t-test, compares scores from
two conditions in a within-subjects design or two groups in a matched-subjects design. The
formula shown here uses values that you have already learned to compute: Means,
Variances and Standard Deviations, and a product-moment correlation.
In this example, anxiety ratings were made before and after a brief treatment session. The
ratings range from 1 (no anxiety) to 5 (extreme anxiety). Pretest and posttest ratings of
anxiety are made on 91 subjects. Listed here are the summary statistics computed from the
raw data.
Group 1 Group 2
Sample Size N = 91
Correlation r = 0.84
Use the formula below to compute the value of t.
Look up the critical value of t (df = number of pairs - 1 = 91 - 1 = 90). With an alpha of 0.05,
the critical value of t is between 1.98 and 2.00. Because the observed t (3.75) exceeds the
critical value, we reject the null hypothesis and conclude that the treatment produced a
significant reduction in anxiety.
n=15
The observed t-value is smaller than the critical t-value (1%, one-tailed). The learning
program has a positive effect. Subjects remember more words after the training.
2. Salary Wizard is an online tool that allows you to look up incomes for specific jobs for cities
in the United States. We looked up the 25th percentile for income for six jobs in two cities:
Boise, Idaho, and Los Angeles, California. The data are below.
## Jobs Boise Los Angeles
## 1 Executive chef 53047 62490
## 2 Genetics counselor 49958 58850
## 3 Grants/proposal writer 41974 49445
## 4 Librarian 44366 52263
## 5 Public schoolteacher 40470 47674
## 6 Social worker (with bachelor's degree) 36963 43542
How can we conduct a paired-samples t test to determine whether income in one of these
cities differs, on average, from income in the other? We will use a two-tailed test and a
significance level of 0.05.
Notice that the same individual (income for specific job) is measured twice (two different
cities).
Step 1: Assumptions
For the one-sample t test, we use individual scores; for the paired-samples t test, we
use difference scores. The comparison distribution is a distribution of mean difference
scores (rather than a distribution of means).
Population 1: Job types in Boise, Idaho.
Population 2: Job types in Los Angeles, California
The comparison distribution will be a distribution of mean differences.
The hypothesis test will be a paired-samples t test because we have two samples, and all
participants are in both samples.
We do not know whether the population is normally distributed, there are only 12 observations
in total, and there is not much variability in the data in our samples, so we should proceed
with caution. The data were not randomly selected, so we should be cautious when
generalizing beyond this sample of job types.
Step 2: State the null and alternative hypotheses.
Remember, hypotheses are always about populations, not about our specific samples.
Null hypothesis: Jobs in Boise pay the same, on average, as jobs in Los Angeles: \[H_0:
\mu_1 = \mu_2\]
Alternative hypothesis: Jobs in Boise pay different incomes, on average, from jobs in Los
Angeles
\[H_a: \mu_1 \neq \mu_2\]
Step 3: Determine the characteristics of the comparison distribution
With the paired-samples t test, we have a sample of difference scores. According to the null
hypothesis, there is no difference; that is, the mean difference score is 0. So the mean of the
comparison distribution is always 0, as long as the null hypothesis posits no difference.
Therefore, we calculate the difference between the scores, then we basically consider them
our sample to perform a one sample t-test.
The characteristics of the comparison distribution are what values of mean and standard
deviation we will use in our test statistic.
## Jobs Boise Los Angeles Difference
## 1 Executive chef 53047 62490 9443
## 2 Genetics counselor 49958 58850 8892
## 3 Grants/proposal writer 41974 49445 7471
## 4 Librarian 44366 52263 7897
## 5 Public schoolteacher 40470 47674 7204
## 6 Social worker (with bachelor's degree) 36963 43542 6579
We do not know the standard deviation of the true difference between the samples, so
we have to estimate it from the difference scores, along with the mean of the differences.
The mean is
## [1] 7914
And the standard error of the mean difference is:
\[s_\bar{D} = \dfrac{s}{\sqrt{n}}\]
## [1] 438.8
Step 4: Determine the significance level
The exercise already did that for us.
\[\alpha = 0.05\]
Step 5: Calculate Test Statistic
This step is identical to that for the single- sample t test.
\[t = \dfrac{\bar{d} - \mu}{s_{\bar{D}}} = \dfrac{7914.333 - 0}{438.8313}\].
Hence, t statistic is
## [1] 18.04
Step 6.1: Conclude (Statistical way)
Because this is a two-sided test (alternative hypothesis: not equal), p-value is going to be:
\[P(|T| > 18.03502) = 2P(T>18.03502)\].
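The difference scores, their mean, the standard error and the t statistic from Steps 3 and 5 can be reproduced with Python's standard library. This is a minimal sketch using the incomes from the table; the p-value is not computed because it needs a t-distribution routine that the standard library does not provide.

```python
import math
import statistics

# 25th-percentile incomes from the table, in the same job order.
boise = [53047, 49958, 41974, 44366, 40470, 36963]
los_angeles = [62490, 58850, 49445, 52263, 47674, 43542]

# Difference score for each job: Los Angeles minus Boise.
differences = [la - b for la, b in zip(los_angeles, boise)]

n = len(differences)
mean_d = statistics.mean(differences)
se_d = statistics.stdev(differences) / math.sqrt(n)  # standard error of the mean difference

# One-sample t test on the differences against a null mean of 0.
t_stat = (mean_d - 0) / se_d
df = n - 1

print(f"mean difference = {mean_d:.3f}, standard error = {se_d:.4f}")
print(f"t = {t_stat:.2f}, df = {df}")
```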
Let's begin.
1. Define Null and Alternative Hypotheses
Figure 2.
2. State Alpha
Alpha = 0.05
3. Calculate Degrees of Freedom
Figure 3.
4. State Decision Rule
Using an alpha of 0.05 with a two-tailed test with 9 degrees of freedom, we would expect our
distribution to look something like this:
Figure 4.
Use the t-table to look up a two-tailed test with 9 degrees of freedom and an alpha of 0.05.
We find a critical value of 2.2622. Thus, our decision rule for this two-tailed test is:
If t is less than -2.2622, or greater than 2.2622, reject the null hypothesis.
5. Calculate Test Statistic
The first step is for us to calculate the difference score for each pairing:
Figure 5.
Figure 6.
6. State Results
t = 3.61
Result: Reject the null hypothesis.
7. State Conclusion
The anti-hunger weight loss pill significantly affected hunger, t = 3.61, p < 0.05.
4. Calculate a paired t test by hand for the following data:
Step 6: Subtract 1 from the sample size to get the degrees of freedom. We have 11 items, so
11-1 = 10.
Step 7: Find the critical t-value in the t-table, using the degrees of freedom in Step 6. If you
don’t have a specified alpha level, use 0.05 (5%). For this sample problem, with df=10, the
two-tailed critical t-value is 2.228.
Step 8: Compare your t-table value from Step 7 (2.228) to your calculated t-value (-2.74). The
absolute value of the calculated t-value (2.74) is greater than the table value at an alpha level
of .05. The p-value is less than the alpha level: p <.05. We can reject the null hypothesis that
there is no difference between means.
Note: You can ignore the minus sign when comparing the two t-values, as ± indicates the
direction; the p-value remains the same for both directions.
5. Suppose we wish to determine if the cholesterol levels of the men in Dixon and Massey
study changed from 1952 to 1962. We will use the paired t-test.
For α = 0.05 and 19 df, the critical value of t is 2.093. Since | -6.7| > 2.093, we reject H0 and
state that we have significant evidence that the average difference in cholesterol from 1952 to
1962 is NOT 0. Specifically, there was an average decrease of 69.8 from 1952 to 1962.
d=x1-x2;
run;
var d;
run;
Hypotheses:
dchol=chol62-chol52;
run;
run;
Again, we reject H0 (because p<0.05) and state that we have significant evidence that
cholesterol levels changed from 1952 to 1962, with an average decrease of 69.8 units, with
95% confidence limits of (-91.6, -48.0).
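The reported confidence limits, critical value and t statistic are mutually consistent, which can be checked by working backwards from the interval. The Python sketch below is only an illustrative cross-check using the figures quoted in the text; the raw cholesterol data are not reproduced here.

```python
# Figures quoted in the text for the 1952 to 1962 cholesterol change.
mean_diff = -69.8
ci_lower, ci_upper = -91.6, -48.0
t_crit = 2.093  # two-tailed critical value for alpha = 0.05 and 19 df

# The margin of error is half the interval width, and equals t_crit * SE.
margin = (ci_upper - ci_lower) / 2
standard_error = margin / t_crit

# Recover the t statistic for H0: mean difference = 0.
t_stat = mean_diff / standard_error

print(f"margin = {margin:.1f}, SE = {standard_error:.2f}, t = {t_stat:.1f}")
```

The recovered value of about -6.7 matches the t statistic reported for the paired test.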
var dchol;
run;
Note that the t option produces the t statistic for testing the null hypothesis that the mean of a
variable is equal to zero, and the prt option gives the associated p-value. The clm option
produces a 95% confidence interval for the mean. In this case, where the variable is a
difference, dchol, the null hypothesis is that the mean difference is zero and the 95%
confidence interval is for the mean difference.
var dchol;
run;
A third method is to use the original data with the paired option in proc ttest:
paired x1*x2;
run;
Example:
paired chol62*chol52;
run;
Reporting results