Examining Distributions
Definition. Individuals are the objects described by a set of data.
Individuals may be people, but they may also be animals or things.
A variable is any characteristic of an individual. A variable can take
different values for different individuals.
Categorical Variables
Definition. A bar chart reflects the number of individuals falling into
different categories by plotting the categories along the x-axis and the
numbers along the y-axis. A pie chart reflects the number of individuals falling into different categories by representing the categories
as sectors of a circle with the number of individuals in the category
reflected by the area of the sector.
Example. (See TM-1.)
Drawing Histograms
Definition. A histogram groups the values of a quantitative variable into
classes and reflects the number of individuals in each class along the y-axis.
Example 1.2. Consider the data in Table 1.1 (see TM-2). We group
the data into classes of width 1 (say (4.0, 5.0], (5.0, 6.0], etc.). We
then have the data as:
Class          Count    Class          Count    Class          Count
4.1 to 5.0              9.1 to 10.0             14.1 to 15.0
5.1 to 6.0              10.1 to 11.0            15.1 to 16.0
6.1 to 7.0              11.1 to 12.0            16.1 to 17.0
7.1 to 8.0              12.1 to 13.0   10       17.1 to 18.0
8.1 to 9.0              13.1 to 14.0   12       18.1 to 19.0
The histogram representing this data is given in Figure 1.2 (see TM-3).
Interpreting Histograms
Definition. An outlier in any graph of data is an individual observation that falls outside the overall pattern of the graph.
Note. To describe the overall pattern of a distribution:
Give the center and the spread.
See if the distribution has a simple shape that you can describe in
a few words.
Definition. A distribution is symmetric if the right and left sides
of the histogram are approximately mirror images of each other. A
distribution is skewed to the right if the right side of the histogram
Stemplots
Definition. A stemplot is a way to represent quantitative data in which
each observation is separated into a stem consisting of all but the final
(rightmost) digit and a leaf, the final digit. For the data of Table 1.1
(see TM-2) we have the stemplot:
4 2
5
6
7
8 8
9
10 111568999
11 244689
12 01234556779
13 112344457779
14 11579
15 1145
16
17
18 3
Notice that a stemplot looks like a histogram turned on end.
Note. When making a stemplot, you might desire to round data off
to the last digit of interest. You might also split stems to double the
number of stems. For example, the stems 11 and 12 above could be
split in half to give:
11 244
11 689
12 01234
12 556779
Time Plots
Definition. A time plot of a variable plots each observation against
the time at which it was measured. Always mark the time scale on the
horizontal axis and the variable of interest on the vertical axis. If there
are not too many points, connecting the points by straight lines helps
show the pattern of changes over time.
Example 1.5. Here are data on the rate of deaths from cancer (deaths
per 100,000 people) in the United States over the 50-year period 1940
to 1990:
Year: 1940 1945 1950 1955 1960 1965 1970 1975 1980 1985 1990
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
or in more compact notation
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$
Example 1.6. The mean of the data in Table 1.3 (see TM-13) is 54.8
years.
Note. To compute the mean of a data set using the Sharp EL-546G,
do the following:
Put the calculator in statistics mode by pressing MODE and
3.
Press 0 to put the calculator in single-variable statistics mode
(ST0 appears in the display).
1
Q3
Maximum
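or more compactly,
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$
The standard deviation $s$ is the square root of the variance:
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}.$$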
(Some texts call these the sample variance and sample standard deviation - versus the population variance and standard deviation.)
Example 1.11. Consider the data set:
1792 1666 1362 1614 1460 1867 1439
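The mean is $\bar{x} = 1600$. We can calculate the variance as:
Observations    Deviations              Squared Deviations
$x_i$           $x_i - \bar{x}$         $(x_i - \bar{x})^2$
1792            1792 - 1600 = 192       192^2 = 36,864
1666            1666 - 1600 = 66        66^2 = 4,356
1362            1362 - 1600 = -238      (-238)^2 = 56,644
1614            1614 - 1600 = 14        14^2 = 196
1460            1460 - 1600 = -140      (-140)^2 = 19,600
1867            1867 - 1600 = 267       267^2 = 71,289
1439            1439 - 1600 = -161      (-161)^2 = 25,921
                sum = 0                 sum = 214,870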
So the variance is
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{6}(214{,}870) = 35{,}811.67.$$
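The hand calculation in Example 1.11 can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the notes.

```python
from statistics import mean, variance, stdev

# Data set from Example 1.11.
observations = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

x_bar = mean(observations)                      # sample mean
deviations = [x - x_bar for x in observations]  # x_i minus the mean
squared = [d ** 2 for d in deviations]          # squared deviations

# Sample variance: sum of squared deviations divided by n - 1.
s2_by_hand = sum(squared) / (len(observations) - 1)

print(x_bar, sum(deviations), sum(squared))
print(round(s2_by_hand, 2))
print(round(variance(observations), 2))   # library version of the same formula
print(round(stdev(observations), 2))      # standard deviation s
```

The library function `variance` divides by n - 1, so it should agree with the by-hand value.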
Note. A VERY common class of density curves is the normal distributions. These curves are symmetric, single-peaked, and bell-shaped. All
normal distributions have the same shape and are determined solely by
their mean $\mu$ and standard deviation $\sigma$. Figure 1.19 (see TM-21) gives
two examples of normal distributions. The points at which the curves
change concavity are located a distance $\sigma$ on either side of $\mu$. We will
use the area under these curves to represent a percentage of observations. (These areas correspond to integrals, for those of you with some
experience with calculus.)
Note. In the normal distribution with mean $\mu$ and standard deviation
$\sigma$:
68% of the observations fall within $\sigma$ of the mean $\mu$.
95% of the observations fall within $2\sigma$ of $\mu$.
99.7% of the observations fall within $3\sigma$ of $\mu$.
This is called the 68-95-99.7 Rule. See Figure 1.20 (and TM-22).
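The three percentages in the rule are rounded areas under the normal curve. As a rough check, assuming Python's standard `statistics.NormalDist` class, they can be computed directly; the helper name `area_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Percent of observations within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
```

Because every normal distribution becomes the standard normal after standardizing, the same areas apply for any mean and standard deviation.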
Notation. We abbreviate the normal distribution with mean $\mu$ and
standard deviation $\sigma$ as $N(\mu, \sigma)$.
$$z = \frac{x - \mu}{\sigma}.$$
distribution.
$$z = \frac{68 - 64.5}{2.5} = 1.4$$
So we want to find the area to the LEFT of 1.4 in the standard normal
distribution (the question says "less than"). See Figure 1.22 (and TM-24). We'll find this area after one more comment.
Note. Table A is a table of areas under the standard normal curve.
The table entry for each value z is the area under the curve to the left
of z. Table A is reproduced also on TM-139 and TM-140.
Solution to Example 1.15 (continued). We now see that we want
the entry in Table A that corresponds to z = 1.4. This entry is 0.9192.
Therefore 91.92% of the population of young women are less than 68
inches tall.
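The Table A lookup can be reproduced in software. The sketch below assumes the heights of young women follow the N(64.5, 2.5) distribution (the model used for this population elsewhere in these notes) and uses the standard-library `NormalDist` class in place of the table.

```python
from statistics import NormalDist

# Assumed population model: heights of young women ~ N(64.5, 2.5), in inches.
heights = NormalDist(mu=64.5, sigma=2.5)

z = (68 - heights.mean) / heights.stdev   # standardized value of 68 inches
area_left = NormalDist().cdf(z)           # area to the left of z under N(0, 1)

print(round(z, 2))
print(round(area_left, 4))
```

Calling `heights.cdf(68)` directly gives the same area without standardizing first.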
$$z = \frac{240 - 170}{30} = 2.33$$
We want the area to the RIGHT of z = 2.33 in N(0, 1) (the question
says "more than"). Well, the area to the left of z = 2.33 is (Table A or
the calculator) .9901. Since the total area under a normal distribution
is 1, the desired area is 1 − .9901 = .0099. So 0.99% of such boys have
more than 240 mg/dl of cholesterol. See Figure 1.23 (and TM-25).
Note. We can also calculate area to the RIGHT of a z-score using
the calculator:
Put the calculator in statistics mode by pressing MODE and
3.
Press 0 to put the calculator in single-variable statistics mode
(ST0 appears in the display).
Press the 2ndF key, then the R(t) key (the 3 key; "R(" appears), type in the z value, and hit = .
See page 44 of the calculator owner's manual for more details.
Example 1.18. In the above example, what percent of 14-year-old
boys have blood cholesterol between 170 and 240 mg/dl?
Example 1.19. Scores on the SAT for verbal ability follow the N (430, 100)
distribution. How high must a student score in order to place in the
top 10% of all students taking the SAT?
Solution. We want the area to the LEFT of our z value to be 1 − .1 = .9
(we are interested in the complement of this area... the problem says
"top 10%"). From Table A, we have z = 1.28. Now converting this
back to an SAT score, we solve
$$\frac{x - 430}{100} = 1.28$$
and get x = 558.
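Working backward from an area to a score is an inverse normal calculation. A minimal sketch, assuming the standard-library `NormalDist.inv_cdf` method stands in for reading Table A in reverse:

```python
from statistics import NormalDist

sat_verbal = NormalDist(mu=430, sigma=100)   # N(430, 100) from Example 1.19

z = NormalDist().inv_cdf(0.90)       # z with area 0.90 to its left
cutoff = sat_verbal.inv_cdf(0.90)    # same point on the SAT scale: 430 + 100 z

print(round(z, 2))
print(round(cutoff))
```

The unrounded z is slightly larger than the table value 1.28, but the cutoff still rounds to the same score.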
2.1 Scatterplots
Definition. A scatterplot shows the relationship between two quantitative variables measured on the same individuals. The values of one
variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears
as the point in the plot fixed by the values of both variables for that
individual. Always plot the explanatory variable, if there is one, on the
horizontal axis (the x-axis) of a scatterplot.
Interpreting Scatterplots
Definition. Two variables are positively associated when above-average
values of one tend to accompany above-average values of the other
and below-average values also tend to occur together. Two variables
are negatively associated when above-average values of one accompany
below-average values of the other, and vice versa.
Example 2.4. Figure 2.1 (see TM-33) gives a scatterplot of the median
SAT math score in each state against the percent of that state's high
school seniors who take the SAT. Notice that there are two clusters of
points (the reason is that the left cluster contains those
states that primarily use the ACT exam; therefore, fewer of their
students take the SAT). Notice that the two variables in this plot are
negatively associated.
Note. The form of the data in Figure 2.1 (TM-33) is the two highly
visible clusters. The direction of the relationship between the data is
the negative association. The strength of the relationship is weak.
Definition. If the points of a scatterplot lie roughly along a straight
line, the relationship is said to be linear.
Example 2.5. The data in Table 2.2 and Figure 2.2 (see TM-34)
follow a linear relationship.
2.2 Correlation
Definition. The correlation measures the strength and direction of
the linear relationship between two quantitative variables. Correlation
is usually written as r. Suppose that we have data on variables x and
y for n individuals. The values for the first individual are $x_1$ and $y_1$,
the values for the second individual are $x_2$ and $y_2$, and so on. The
means and standard deviations of the two variables are $\bar{x}$ and $s_x$ for
the x-values, and $\bar{y}$ and $s_y$ for the y-values. The correlation r between
x and y is
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right).$$
Exercise 2.17. Consider the following measurements from the transition species Archaeopteryx (an evolutionary link between dinosaurs and
birds) of femur and humerus bones (in mm):
Femur    38 56 59 64 74
Humerus  41 63 70 72 84
(See Exercise 2.11, page 108.) Let x represent femur length and y
represent humerus length. Calculate r.
Solution. From the calculator, we have:
$\bar{x} = 58.2$, $s_x = 13.2$, $\bar{y} = 66$, $s_y = 15.9$.
We then have:
$x_i$   $(x_i - \bar{x})/s_x$   $y_i$   $(y_i - \bar{y})/s_y$   product
38      -1.53                   41      -1.57                   2.40
56      -0.17                   63      -0.19                   0.03
59      0.06                    70      0.25                    0.02
64      0.44                    72      0.38                    0.17
74      1.20                    84      1.13                    1.36
                                                                sum = 3.98
Therefore
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right) = \frac{1}{4}(3.98) = 0.995.$$
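The same calculation can be sketched in code, following the definition of r as the sum of products of standardized values divided by n − 1. The function name `correlation` is illustrative. Because this version does not round the intermediate values, it gives roughly 0.994 rather than the hand-computed 0.995.

```python
from statistics import mean, stdev

femur = [38, 56, 59, 64, 74]      # x, in mm
humerus = [41, 63, 70, 72, 84]    # y, in mm

def correlation(xs, ys):
    """Sum of products of standardized values, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

r = correlation(femur, humerus)
print(round(mean(femur), 1), round(stdev(femur), 1))
print(round(mean(humerus), 1), round(stdev(humerus), 1))
print(round(r, 3))
```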
Note. Fortunately, these manipulations are built into the Sharp EL-546G. Do the following:
Put the calculator in statistics mode by pressing MODE and
3.
Press 1 to put the calculator in two-variable statistics mode (ST1
appears in the display).
Press 2ndF and CA to clear the statistics memory.
Enter the data by putting in an x value, pressing (x, y) (the STO
key), putting in a y value, and pressing DATA (the M+ key).
Press RCL and r (the button).
See pages 45 and 48 of the calculator owner's manual for more details.
You will note that you can also get $\bar{x}$, $\bar{y}$, $s_x$, and $s_y$ using the RCL
button.
Femur    38 56 59 64 74
Humerus  41 63 70 72 84
We have seen that from the calculator, we have:
$\bar{x} = 58.2$, $s_x = 13.2$, $\bar{y} = 66$, $s_y = 15.9$.
The slope is
$$b = r \frac{s_y}{s_x} = .995 \times \frac{15.9}{13.2} = 1.2$$
and intercept
$$a = \bar{y} - b\bar{x} = 66 - 1.2 \times 58.2 = -3.8.$$
So the least-squares regression line is $\hat{y} = -3.8 + 1.2x$.
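As a check on the slope and intercept, here is a small sketch that fits the least-squares line to the Archaeopteryx data without rounding intermediate values. It uses the algebraically equivalent form of the slope, b = S_xy / S_xx, which equals r s_y / s_x; the variable names are illustrative.

```python
from statistics import mean

femur = [38, 56, 59, 64, 74]      # explanatory variable x, in mm
humerus = [41, 63, 70, 72, 84]    # response variable y, in mm

x_bar, y_bar = mean(femur), mean(humerus)

# Sums of cross-products and squares of deviations from the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(femur, humerus))
s_xx = sum((x - x_bar) ** 2 for x in femur)

b = s_xy / s_xx            # slope, equal to r * s_y / s_x
a = y_bar - b * x_bar      # intercept

print(round(b, 3), round(a, 2))
```

With unrounded inputs the slope still rounds to 1.2, while the intercept comes out closer to −3.66 than to −3.8.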
Note. Fortunately, these manipulations are built into the Sharp EL-546G. Do the following:
Put the calculator in statistics mode by pressing MODE and
3.
Press 1 to put the calculator in two-variable statistics mode (ST1
appears in the display).
Press 2ndF and CA to clear the statistics memory.
Enter the data by putting in an x value, pressing (x, y) (the STO
key), putting in a y value, and pressing DATA (the M+ key).
For the intercept, a, press RCL and a (the ( button).
For the slope, b, press RCL and b (the ) button).
See pages 45 and 47 of the calculator owner's manual for more details.
You will note that you can also get $\bar{x}$, $\bar{y}$, $s_x$, and $s_y$ using the RCL
button.
Note. If you test the least-squares regression on the Archaeopteryx
data, you will notice that there is some roundoff error in the numbers
presented above (you should get a = −3.65... from the calculator).
Facts about Least-Squares Regression
Example 2.11. Figure 2.11 (see TM-43) is a scatterplot of data that
played a central role in the discovery that the universe is expanding.
They are the distances from Earth of 24 spiral galaxies and the speed
at which these galaxies are moving away from us, reported by the astronomer Edwin Hubble in 1929. There is a positive linear relationship, r = .7842, so that the more distant galaxies are moving away
more rapidly. Astronomers believe that there is in fact a strong linear
relationship. The two lines on the plot are two least-squares regression
lines. The regression line of velocity on distance is solid. The regression
line of distance on velocity is dashed. Regression of velocity on distance
and regression of distance on velocity give different lines. In the regression setting, you must know clearly which variable is explanatory.
Note. The square of the correlation, $r^2$, is the fraction of the variation
in the values of y that is explained by the least-squares regression of y
on x. The idea is that when there is a linear relationship, some of the
Influential Observations
Definition. An outlier is an observation that lies outside the overall
pattern of the other observations in a scatterplot. An observation can
be an outlier in the x direction, in the y direction, or both. An observation is influential if removing it would markedly change the position
of the regression line. Points that are outliers in the x direction are
often influential.
Note. See Figure 2.15 (and TM-48) for an example of an influential
data point in the Gesell data (namely, Child 18).
Lurking Variables
Definition. A lurking variable is a variable that has an important effect
on the relationship among the variables in a study but is not included
among the variables studied. A lurking variable can falsely suggest a
strong relationship between x and y, or it can hide a relationship that
is really there.
Example 2.15. The National Halothane Study was a major investigation of the safety of the anesthetic used in surgery. Records of
over 850,000 operations performed in 34 major hospitals showed the
following death rates for four common anesthetics:
Anesthetic
rate of patients. Anesthetic C appears dangerous. But there are obvious lurking variables: the age and condition of the patient and the
seriousness of the surgery. In fact, anesthetic C was more often used in
serious operations on older patients in poor condition. The death rate
would be higher among these patients no matter what anesthetic they
received. After measuring the lurking variables and adjusting for their
effect, the apparent relationship between anesthetic and death rate is
very much weaker.
Marginal Distributions
Note. The distributions of education alone and age alone in Table 2.5
(and TM-51) are called marginal distributions because they appear at
the right and bottom margins of the two-way table.
Describing Relationships
Note. We can describe relationships among categorical variables by
calculating appropriate percents from the counts given.
Example 2.21. From Table 2.5 (and TM-51), what percent of people
aged 25 to 34 have completed 4 years of college? Well, there are a
total of 42,905 people who are aged 25 to 34, and of those 10,168 have
1
1-3 years
4 years
13.9
40.8
21.6
23.7
Simpson's Paradox
Example 2.23. Two hospitals A and B provide the following initial
data:
             Hospital A   Hospital B
Died               63           16
Survived         2037          784
Total            2100          800
Good Condition
             Hospital A   Hospital B
Died                6            8
Survived          594          592
Total             600          600
Poor Condition
             Hospital A   Hospital B
Died               57            8
Survived         1443          192
Total            1500          200
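The reversal in these tables is easiest to see by computing death rates. The sketch below assumes that the "Died" counts not legible in the tables equal Total minus Survived (6 and 8 for patients in good condition, and 8 for Hospital B patients in poor condition); the dictionary layout and function name are illustrative.

```python
# (died, total) counts from the tables in Example 2.23.
# Counts of 6, 8, and 8 are assumed from Total minus Survived.
counts = {
    "good": {"A": (6, 600), "B": (8, 600)},
    "poor": {"A": (57, 1500), "B": (8, 200)},
}

def death_rate(died, total):
    """Percent of patients who died."""
    return 100 * died / total

rates = {}
for condition, hospitals in counts.items():
    for name, (died, total) in hospitals.items():
        rates[(condition, name)] = death_rate(died, total)
        print(condition, name, round(rates[(condition, name)], 2))

# Combine the two conditions to get the overall rate for each hospital.
overall = {}
for name in ("A", "B"):
    died = sum(counts[c][name][0] for c in counts)
    total = sum(counts[c][name][1] for c in counts)
    overall[name] = death_rate(died, total)
    print("overall", name, round(overall[name], 2))
```

Hospital A has the lower death rate within each condition group but the higher rate overall, because it treats far more patients in poor condition.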
separate SRS in each stratum and combine these SRSs to form the full
sample.
Note. Another common means of restricting random selection is to
choose the sample in stages. This is usual practice for national samples
of households or people. For example, government data on employment
and unemployment are gathered by the Current Population Survey,
which conducts interviews in about 60,000 households each month. It
is not practical to maintain a list of all U.S. households from which
to select an SRS. Moreover, the cost of sending interviewers to the
widely scattered households in an SRS would be too high. The Current
Population Survey therefore uses a multistage sample design. The final
sample consists of clusters of nearby households. Most opinion polls
and other national samples are also multistage.
attitude suggests that some answers are more desirable than others
will get these answers more often. The wording of questions is the most
important influence on the answers given to a sample survey.
Example 3.7(a). When Levi Strauss & Co. asked students to choose
the most popular clothing item from a list, 90% chose Levi's 501 jeans
- but they were the only jeans listed.
Example 3.7(b). A survey paid for by makers of disposable diapers
found that 84% of the sample opposed banning disposable diapers. Here
is the actual question:
It is estimated that disposable diapers account for less
than 2% of the trash in today's landfills. In contrast,
beverage containers, third-class mail and yard wastes
are estimated to account for about 21% of the trash
in landfills. Given this, in your opinion, would it be
fair to ban disposable diapers?
This question gives information on only one side of an issue, then asks
an opinion. That's a sure way to bias the responses. A different question that described how long disposable diapers take to decay and how
many tons they contribute to landfills each year would draw a quite
different response.
Comparative Experiments
Example 3.10. Ulcers in the upper intestine are unfortunately common in modern society. Gastric freezing is a clever treatment for
ulcers. The patient swallows a deflated balloon with tubes attached,
then a refrigerated solution is pumped through the balloon for an hour.
The idea is that cooling the stomach will reduce its production of acid
and so relieve ulcers. An experiment reported in the Journal of the
American Medical Association showed that gastric freezing did reduce
acid production and relieve ulcer pain. The treatment was safe and easy
and was widely used for several years. The gastric freezing experiment
was poorly designed. The patients' response may have been due to the
placebo effect. A placebo is a dummy treatment that can have no physical effect. Many patients respond favorably to any treatment, even a
placebo, presumably because of trust in the doctor and expectations
of a cure. This response to a dummy treatment is the placebo effect.
A second experiment, done several years later, divided ulcer patients
into two groups. One group was treated by gastric freezing as before.
is a matched pairs design in which each subject compares the two colas.
Because responses may depend on which cola is tasted first, the order of
tasting should be chosen at random for each subject. When more than
half the Coke drinkers chose Pepsi, Coke claimed that the experiment
was biased. The Pepsi glasses were marked "M" and Coke glasses were
marked "Q." Aha, said Coke, this just shows that people like the letter
M better than the letter Q. A careful experiment would in fact take
care to avoid any distinction other than the actual treatments.
Sampling Variability
Definition. The fact that the value of a statistic varies in repeated
random sampling is called sampling variability.
Note. To see what would happen if we take many samples:
Take a large number of samples from the same population.
Calculate the sample proportion $\hat{p}$ for each sample.
Make a histogram of the values of $\hat{p}$.
Examine the distribution displayed in the histogram for overall
pattern, center and spread, and outliers or other deviations.
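The four steps above can be sketched as a small simulation. Everything specific here is an assumption made for illustration: a population proportion of 0.35, samples of size 1500, 1000 repetitions, and a fixed random seed so the run is repeatable.

```python
import random
from statistics import mean, stdev

random.seed(1)            # fixed seed so the run is repeatable

p = 0.35                  # hypothetical population proportion
n = 1500                  # size of each simple random sample
num_samples = 1000        # how many samples to draw

def sample_proportion(p, n):
    """Proportion of successes in one simulated sample of size n."""
    successes = sum(random.random() < p for _ in range(n))
    return successes / n

p_hats = [sample_proportion(p, n) for _ in range(num_samples)]

# Crude text histogram of the p-hat values in bins of width 0.01.
bins = {}
for value in p_hats:
    left = int(value * 100) / 100   # left edge of the bin, e.g. 0.34
    bins[left] = bins.get(left, 0) + 1
for left in sorted(bins):
    print(f"{left:.2f} {'*' * (bins[left] // 10)}")

# Center and spread of the simulated sampling distribution.
print(round(mean(p_hats), 3), round(stdev(p_hats), 4))
```

The histogram should look roughly symmetric and centered near the population proportion, with a spread of about one percentage point.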
Definition. Using random digits from a table or computer software to
imitate chance behavior is called simulation.
Example. Flip a coin twice. The possible outcomes (called the sample
space) are: HH, HT, TH, TT. The probability of getting at least one H
is 3/4. The probability of getting no H is 1/4.
Definition. A discrete random variable X has a finite number of possible values. The probability distribution of X lists the values and their
probabilities:
Value of X    $x_1$  $x_2$  $x_3$  ...  $x_k$
Probability   $p_1$  $p_2$  $p_3$  ...  $p_k$
Example 4.9. A household is a group of people living together, regardless of their relationship to each other. Many sample surveys such
as the Current Population Survey select a random sample of households. Choose a household at random, and let the random variable X
be the number of people living there. Here is the distribution of X.
Household size
Probability
The probability that a randomly chosen household has more than two
members is
P (X > 2) = P (X = 3) + P (X = 4) + P (X = 5) + P (X = 6) + P (X = 7)
= .171 + .154 + .067 + .022 + .014 = .428
P (A) =
Example 4.10. Roll two dice and record the pips (dots) on each of the
two up-faces. Figure 4.8 (see TM-65) shows the 36 possible outcomes.
If the dice are carefully made, all 36 outcomes are equally likely. So
each has probability 1/36. Gamblers are often interested in the sum
of the pips on the up faces. What is the probability of rolling a 5?
The event "roll a 5" contains the four outcomes: (1,4), (2,3), (3,2),
(4,1). The probability is therefore 4/36 = 1/9 = 0.111. What about
the probability of rolling a 7? In Figure 4.8 (TM-65) you will find
six outcomes for which the sum of the pips is 7. The probability is
6/36 = 1/6 = 0.167.
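Counting outcomes like this is easy to automate. A minimal sketch that enumerates the 36 equally likely outcomes and counts those with a given sum; the function name `prob_sum` is illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

def prob_sum(target):
    """Probability that the pips on the two up-faces add to target."""
    favorable = [o for o in outcomes if sum(o) == target]
    return Fraction(len(favorable), len(outcomes))

print(prob_sum(5), round(float(prob_sum(5)), 3))
print(prob_sum(7), round(float(prob_sum(7)), 3))
```

Using `Fraction` keeps the answers exact, so 4/36 is reported in lowest terms as 1/9.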
$$\mu = \sum_{i=1}^{n} x_i p_i.$$
$$\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 p_i.$$
$$\mu = \int x P(x)\,dx$$
$$\sigma^2 = \int (x - \mu)^2 P(x)\,dx,$$
where the integrals are taken over all possible values of X. The standard
deviation is the square root of the variance.
$$\sqrt{\frac{p(1 - p)}{n}}.$$
Note. As a rule of thumb, use the recipe for the standard deviation of
$\hat{p}$ only when the population is at least 10 times as large as the sample.
Example 4.14. You ask an SRS of 1500 first-year college students
whether they applied for admission to any other college. There are
over 1.7 million first-year college students, so the rule of thumb is easily satisfied. In fact, 35% of all first-year students applied to colleges
besides the one they are attending. What is the probability that your
sample will give a result within 2 percentage points of this true value?
We have an SRS of n = 1500 drawn from a population in which the
proportion p = .35 applied to other colleges. The sample proportion $\hat{p}$
has mean 0.35 and standard deviation
$$\sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.35)(.65)}{1500}} = .0123.$$
We want the probability that $\hat{p}$ falls between 0.33 and 0.37 (within
2 percentage points, or 0.02, of 0.35). This is a normal distribution
calculation. Standardize $\hat{p}$ by subtracting its mean 0.35 and dividing
by its standard deviation 0.0123. That produces a new statistic that has
the standard normal distribution. It is usual to call such a statistic Z:
$$Z = \frac{\hat{p} - .35}{.0123}.$$
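Then draw a picture of the areas under the standard normal curve
$$P(.33 \le \hat{p} \le .37) = P\left(\frac{.33 - .35}{.0123} \le \frac{\hat{p} - .35}{.0123} \le \frac{.37 - .35}{.0123}\right)$$
$$= P(-1.63 \le Z \le 1.63) = .9484 - .0516 = .8968.$$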
We see that almost 90% of all samples will give a result within 2 percentage points of the truth about the population.
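The calculation in Example 4.14 can be reproduced without Table A. This sketch assumes the normal approximation to the sampling distribution of the sample proportion and uses the standard-library `NormalDist` class; because it does not round z to 1.63, the result differs slightly from .8968 in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

p = 0.35      # population proportion from Example 4.14
n = 1500      # sample size

sd_p_hat = sqrt(p * (1 - p) / n)          # standard deviation of p-hat
sampling_dist = NormalDist(mu=p, sigma=sd_p_hat)

# Probability that p-hat lands between 0.33 and 0.37.
prob = sampling_dist.cdf(0.37) - sampling_dist.cdf(0.33)

print(round(sd_p_hat, 4))
print(round(prob, 3))
```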
Using the Normal Approximation for $\hat{p}$
Note. As a second rule of thumb, we will use the normal approximation
to the sampling distribution of $\hat{p}$ for values of n and p that satisfy
$np \ge 10$ and $n(1 - p) \ge 10$.
Example 4.15. One way of checking the effect of undercoverage, nonresponse, and other sources of error in a sample survey is to compare
the sample with known facts about the population. About 11% of
American adults are black. The proportion $\hat{p}$ of blacks in an SRS of
1500 adults should therefore be close to 11%. It is unlikely to be exactly 11% because of sampling variability. If a national sample contains
only 9.2% blacks, should we suspect that the sampling procedure is
somehow underrepresenting blacks? We will find the probability that
a sample contains no more than 9.2% blacks when the population is
11% black. First, check our rule of thumb for using the normal approximation to the sampling distribution of $\hat{p}$: np = (1500)(.11) = 165 and
n(1 − p) = (1500)(.89) = 1335. Both are much larger than 10, so the
$$\sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.11)(.89)}{1500}} = .00808.$$
$$P(\hat{p} \le .092) = P\left(\frac{\hat{p} - .11}{.00808} \le \frac{.092 - .11}{.00808}\right) = P(Z \le -2.23) = .0129.$$
Only 1.29% of all samples would have so few blacks. Because it is
unlikely that a sample would include so few blacks, we have good reason
to suspect that the sampling procedure underrepresents blacks.
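The same steps (check the rule of thumb, compute the standard deviation, standardize, look up the area) can be sketched in code. The variable names are illustrative, and `NormalDist` from the standard library replaces the table lookup.

```python
from math import sqrt
from statistics import NormalDist

p = 0.11          # population proportion from Example 4.15
n = 1500          # sample size
observed = 0.092  # sample proportion we want to assess

# Rule of thumb: both np and n(1 - p) should be at least 10.
rule_ok = n * p >= 10 and n * (1 - p) >= 10

sd_p_hat = sqrt(p * (1 - p) / n)   # standard deviation of p-hat
z = (observed - p) / sd_p_hat      # standardized value of 0.092
prob = NormalDist().cdf(z)         # P(p-hat <= 0.092)

print(rule_ok, round(sd_p_hat, 5), round(z, 2), round(prob, 4))
```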
Sample Counts
Note. Sometimes we are interested in the count of special individuals
in a sample rather than the proportion of such individuals. To deal
with these problems, just restate them in terms of proportions.
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$
for k = 0, 1, 2, . . . , n.
Note. The binomial function (also called the combinations function,
denoted $_nC_r$) is built into the Sharp EL-546G. To calculate $\binom{n}{r}$ do
the following:
Enter n.
Press nCr (nCr appears).
Enter r.
Press = .
This function works in any mode. See page 23 of the calculator owner's
manual for more details.
Definition. If X has the binomial distribution with n observations
and probability p of success on each observation, the possible values of
X are 0, 1, 2, . . . , n. If k is any one of these values,
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}.$$
$$P(X = 2) = \binom{5}{2} (.25)^2 (1 - .25)^3 = .26.$$
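The binomial probability formula translates directly into code. A minimal sketch using `math.comb` from the Python standard library for the binomial coefficient; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial count with n observations and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# The case worked above: n = 5, p = .25, k = 2.
print(comb(5, 2))
print(round(binomial_pmf(2, 5, 0.25), 4))

# The probabilities over k = 0, 1, ..., n should add to 1.
total = sum(binomial_pmf(k, 5, 0.25) for k in range(6))
print(round(total, 10))
```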
$$\sqrt{np(1 - p)}$$
$$\sqrt{np(1 - p)} =$$
is $\sigma/\sqrt{n}$.
Note. The behavior of $\bar{x}$ in repeated samples is much like that of the
sample proportion $\hat{p}$:
You should only use the recipe $\sigma/\sqrt{n}$ for the standard deviation of
$\bar{x}$ when the population is at least 10 times as large as the sample.
Note. Notice that these facts about the mean and standard
deviation of $\bar{x}$ are true no matter what the shape of the population distribution is.
Example 4.24. The height of young women varies approximately
according to the N(64.5, 2.5) distribution. This is a population distribution with $\mu = 64.5$ and $\sigma = 2.5$. If we choose one young woman at
random, the heights we get in repeated choices follow this distribution.
That is, the distribution of the population is also the distribution of
one observation chosen at random. So we can think of the population
distribution as a distribution of probabilities, just like a sampling distribution. Now measure the height of an SRS of 10 young women. The
sampling distribution of their sample mean height $\bar{x}$ will have mean
Note. The fact that averages of several observations are less variable
than individual observations is important in many settings.
standard deviation $\sigma/\sqrt{n}$.
Theorem (Central Limit Theorem). Draw an SRS of size n from
any population whatsoever with mean $\mu$ and finite standard deviation
$\sigma$. When n is large, the sampling distribution of the sample mean $\bar{x}$ is
approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
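A simulation makes the theorem concrete. The sketch below draws repeated samples from an exponential population with mean 1 and standard deviation 1, a strongly right-skewed population; the sample size of 25, the 2000 repetitions, and the random seed are arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2)       # fixed seed so the run is repeatable

n = 25               # observations per sample (illustrative choice)
num_samples = 2000   # number of simulated samples (illustrative choice)

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1 and sd 1."""
    return mean([random.expovariate(1.0) for _ in range(n)])

x_bars = [sample_mean(n) for _ in range(num_samples)]

center = mean(x_bars)
spread = stdev(x_bars)
predicted_spread = 1 / sqrt(n)     # sigma / sqrt(n) with sigma = 1

# Share of sample means within two predicted standard deviations of 1.
within_two = sum(abs(x - 1) <= 2 * predicted_spread for x in x_bars) / num_samples

print(round(center, 3), round(spread, 3), round(predicted_spread, 3))
print(round(within_two, 3))
```

Even though single observations are far from normal, the share of sample means within two standard deviations of the population mean should be close to the 95% predicted by the normal curve.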
Example 4.25. Figure 4.19 (and TM-74) shows the central limit theorem in action for a very nonnormal population. Figure 4.19(a) displays
the density curve for the distribution of the population. The distribution is strongly right skewed, and the most probable outcomes are
near 0 at one end of the range of possible values. The mean of this
distribution is 1 and its standard deviation is also 1. This particular
distribution is called an exponential distribution from the shape of its
density curve. Exponential distributions are used to describe the lifetime in service of electronic components and the time required to serve
a customer or repair a machine. Figures 4.19(b), (c), and (d) are the
the value $1/\sqrt{n}$. The density curve for 10 observations is still somewhat skewed to the right but already resembles a normal curve with
Note. The Law of Large Numbers states: Draw observations at random from any population with finite mean $\mu$. As the number of observations drawn increases, the mean $\bar{x}$ of the observed values gets closer
and closer to $\mu$.
Example. Four points, which are circled in Figure 4.21 (see TM-78),
lie above the upper control limit of the control chart. The 99.7 part
of the 68-95-99.7 rule says that the probability is only 0.003 that a
particular point would fall outside the control limits if $\mu$ and $\sigma$ remain
at their target values.
A process in control is predictable. We can predict both the quantity and the quality of items produced.
When a process is in control we can easily see the effects of attempts to improve the process, which are not hidden by the unpredictable variation that characterizes lack of statistical control.
Statistical Confidence
Definition. A confidence interval is of the form
estimate ± margin of error.
The margin of error shows how accurate we believe our guess is, based
on the variability of the estimate.
Example 5.2. The NAEP survey includes a short test of quantitative
skills, covering mainly basic arithmetic and the ability to apply it to
realistic problems. Scores on the test range from 0 to 500. For example,
a person who scores 233 can add the amounts of two checks appearing
on a bank deposit slip; someone scoring 325 can determine the price
of a meal from a menu; a person scoring 375 can transform a price in
cents per ounce into dollars per pound. In a recent year, 840 men 21
to 25 years of age were in the NAEP sample. Their mean quantitative
score was $\bar{x} = 272$. These 840 men are an SRS from the population of
all young men. On the basis of this sample, what can we say about the
mean score $\mu$ in the population of all 9.5 million young men of these
ages?
Solution. The standard deviation of $\bar{x}$ is $\sigma/\sqrt{n} = 60/\sqrt{840} = 2.1$. Figure 5.1 (and TM-80) gives the sampling distribution for $\bar{x}$. If we want
a 95% confidence interval for $\mu$, we should go two standard deviations
from the sample mean (recall the 68-95-99.7 rule). Since $\bar{x} = 272$, and
the standard deviation of $\bar{x}$ is 2.1, we set the margin of error equal
to 2 × 2.1 = 4.2 and so the confidence interval is from 272 − 4.2 = 267.8
to 272 + 4.2 = 276.2. Therefore we can say that we are 95% confident
that the population mean lies between 267.8 and 276.2.
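The arithmetic of this solution can be sketched in a few lines. The script uses the "two standard deviations" shortcut from the 68-95-99.7 rule, as the solution does; because it does not round the standard deviation to 2.1 first, its endpoints differ from 267.8 and 276.2 by less than a tenth of a point.

```python
from math import sqrt

x_bar = 272    # sample mean from Example 5.2
sigma = 60     # population standard deviation used in the solution
n = 840        # sample size

sd_x_bar = sigma / sqrt(n)    # standard deviation of the sample mean
margin = 2 * sd_x_bar         # two standard deviations (68-95-99.7 rule)

low, high = x_bar - margin, x_bar + margin
print(round(sd_x_bar, 1), round(margin, 1))
print(round(low, 1), round(high, 1))
```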
Confidence Intervals
Note. Any confidence interval has two parts: an interval computed
from the data and a confidence level giving the probability that the
method produces an interval that covers the parameter.
Definition. A level C confidence interval for a parameter is an interval
computed from sample data by a method that has probability C of
producing an interval containing the true value of the parameter.
Confidence level C    Tail area    z*
90%                   .05          1.645
95%                   .025         1.960
99%                   .005         2.576
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}.$$
Here $z^*$ is the upper (1 − C)/2 critical value for the standard normal
distribution, found in Table C (TM-142). This interval is exact when the
population distribution is normal and is approximately correct for large n
in other cases.
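The critical values and the interval recipe can be sketched as two small functions. The names `critical_value` and `z_interval` are made up for this illustration, and the standard-library `NormalDist.inv_cdf` method stands in for Table C.

```python
from math import sqrt
from statistics import NormalDist

def critical_value(confidence):
    """Upper (1 - C)/2 critical value z* of the standard normal distribution."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

def z_interval(x_bar, sigma, n, confidence):
    """Interval x_bar +/- z* sigma / sqrt(n), for known sigma."""
    margin = critical_value(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Critical values for the three most common confidence levels.
for c in (0.90, 0.95, 0.99):
    print(c, round(critical_value(c), 3))
```

A call such as `z_interval(272, 60, 840, 0.95)` returns the lower and upper endpoints as a pair.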
Example 5.4. A manufacturer of pharmaceutical products analyzes
a specimen from each batch of a product to verify the concentration
of the active ingredient. The chemical analysis is not perfectly precise.
Repeated measurements on the same specimen give slightly different
results. The results of repeated measurements follow a normal distribution quite closely. The analysis procedure has no bias, so the mean
of the population of all measurements is the true concentration in the
specimen. The standard deviation of this distribution is known to be
$\sigma = .0068$ grams per liter. The laboratory analyzes each specimen three
times and reports the mean result. Three analyses of one specimen give
concentrations
0.8403  0.8363  0.8447.
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = .8404 \pm 2.576 \frac{.0068}{\sqrt{3}} = .8404 \pm .0101 = (.8303, .8505).$$
We are 99% confident that the true concentration lies between 0.8303
and 0.8505 grams per liter.
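A short script can confirm the interval in Example 5.4. It is a sketch only: the critical value is computed with the standard-library `NormalDist` class rather than read from Table C, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

measurements = [0.8403, 0.8363, 0.8447]   # grams per liter
sigma = 0.0068                            # known standard deviation
confidence = 0.99

x_bar = mean(measurements)
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 2.576
margin = z_star * sigma / sqrt(len(measurements))

low, high = x_bar - margin, x_bar + margin
print(round(x_bar, 4), round(margin, 4))
print(round(low, 4), round(high, 4))
```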
$$n = \left(\frac{z^* \sigma}{m}\right)^2 = \left(\frac{1.96 \times .0068}{.005}\right)^2 = 7.1.$$
Because 7 measurements will give a slightly larger margin of error than
desired, and 8 measurements a slightly smaller margin of error, the lab
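The sample-size calculation above can be sketched as follows, using the values from the formula (critical value 1.96, standard deviation .0068, desired margin of error .005). Rounding up to the next whole number is the usual convention and is assumed here; the helper name `margin` is illustrative.

```python
from math import ceil, sqrt

z_star = 1.96     # critical value for 95% confidence
sigma = 0.0068    # known standard deviation, grams per liter
m = 0.005         # desired margin of error, grams per liter

n_exact = (z_star * sigma / m) ** 2
n_needed = ceil(n_exact)          # round up to the next whole measurement

def margin(n):
    """Margin of error achieved with n measurements."""
    return z_star * sigma / sqrt(n)

print(round(n_exact, 1), n_needed)
print(round(margin(7), 5), round(margin(8), 5))
```

Comparing the margins for 7 and 8 measurements shows why the sample size must be rounded up rather than to the nearest integer.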
Some Cautions
Note. Some warnings:
The data must be an SRS from the population.
The formula is not correct for probability sampling designs more
complex than an SRS.
There is no correct method for inference from data haphazardly
collected with bias of unknown size.
Because $\bar{x}$ is strongly influenced by a few extreme observations,
outliers can have a large effect on the confidence interval.
If the sample size is small and the population is not normal, the
true confidence level will be different from the value C used in
computing the interval.
You must know the standard deviation $\sigma$ of the population.
0.4 2.2
Most are positive. That is, most tasters found a loss of sweetness. But
the losses are small, and two tasters (the negative scores) thought the
cola gained sweetness. Are these data good evidence that the cola lost
sweetness in storage?
That's not a large loss. Ten different tasters would almost surely give
a different result. Maybe it's just chance that produced this result. A
test of significance asks: Does the sample result $\bar{x} = 1.02$ reflect a real
loss of sweetness? OR Could we easily get the outcome $\bar{x} = 1.02$ just
by chance?
Note. Next, state the null hypothesis. The null hypothesis says that
there is no effect or no change in the population. If the null hypothesis
is true, the sample result is just chance at work. Here, the null
hypothesis says that the cola does not lose sweetness (no change). We
can write that in terms of the mean sweetness loss $\mu$ in the population
as $H_0: \mu = 0$. We write $H_0$, read "H-nought," to indicate the null
hypothesis. The effect we suspect is true, the alternative to "no effect"
or "no change," is described by the alternative hypothesis. We suspect
that the cola does lose sweetness. In terms of the mean sweetness loss
$\mu$, the alternative hypothesis is $H_a: \mu > 0$.
Note. The reasoning of a significance test goes like this.
Suppose for the sake of argument that the null hypothesis is true,
that on the average there is no loss of sweetness.
Is the sample outcome $\bar{x} = 1.02$ surprisingly large under that supposition? If it is, that's evidence against $H_0$ and in favor of $H_a$.
To answer the question, we use our knowledge of how the sample mean
$\bar{x}$ would vary in repeated samples if $H_0$ really were true. That's the
sampling distribution of $\bar{x}$ once again.
Note. From long experience we also know that the standard deviation for all individual tasters is $\sigma = 1$. (It is not realistic to suppose
that we know the population standard deviation $\sigma$. We will eliminate
this assumption in the next chapter.) The sampling distribution of $\bar{x}$
from 10 tasters is then normal with mean $\mu = 0$ and standard deviation
$\sigma/\sqrt{10} = .316$. To judge whether an observed $\bar{x}$ is surprising, find
the probability under the normal curve in Figure 5.8 (TM-86) to the
right of the observed $\bar{x}$. This probability is called the P value. It is
the probability of a result at least as far out as the result we actually
got. The lower this probability, the more surprising our result, and the
stronger the evidence against the null hypothesis.
Note. Notice:
For one new cola, our 10 tasters gave $\bar{x} = .3$. Figure 5.9 (and
TM-87) shows the P value for this outcome. It is the probability
to the right of 0.3. This probability is about 0.17, that is, 17%.
Our cola showed a larger sweetness loss, $\bar{x} = 1.02$. The probability
of a result this large or larger is only 0.0006.
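The two P values quoted above can be reproduced with a short Python sketch. It assumes, as the notes do, that $\sigma = 1$ is known and that $\bar{x}$ from 10 tasters is normal under $H_0: \mu = 0$. The function name `upper_tail_p_value` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p_value(x_bar, mu0, sigma, n):
    """P value for H0: mu = mu0 against Ha: mu > mu0 with known sigma."""
    # Sampling distribution of the sample mean when H0 is true.
    sampling_dist = NormalDist(mu=mu0, sigma=sigma / sqrt(n))
    # Area under the curve to the right of the observed sample mean.
    return 1 - sampling_dist.cdf(x_bar)


p_small_loss = upper_tail_p_value(0.3, mu0=0, sigma=1, n=10)
p_large_loss = upper_tail_p_value(1.02, mu0=0, sigma=1, n=10)
print(f"{p_small_loss:.2f} {p_large_loss:.4f}")
```

Both values should match the text to the precision quoted there: roughly 0.17 for $\bar{x} = .3$ and roughly 0.0006 for $\bar{x} = 1.02$.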
Note. Small P values are evidence against $H_0$, because they say that
the observed result is unlikely to occur just by chance. Large P values
fail to give evidence against $H_0$. A P value of 0.05 is used as a common
rule of thumb. A result with a small P value, say less than 0.05, is
called statistically significant. That's just a way of saying that chance
alone would rarely produce so extreme a result.
Outline of a Test
Note. Here is the reasoning of a significance test in outline form:
1. Describe the effect you are searching for in terms of a population
shows the P value as an area under a normal curve. Figure 5.10 (and
TM-88) is the picture for this example. Then standardize $\bar{x}$ to get a
standard normal $Z$ and use Table A (see TM-139, TM-140):
$$P(\bar{x} \ge .3) = P\left(\frac{\bar{x} - 0}{.316} \ge \frac{.3 - 0}{.316}\right) = P(Z \ge .95) = 1 - .8289 = .1711$$
Note. We can compare the P value with a fixed value that we regard
as decisive. This amounts to announcing in advance how much evidence
against $H_0$ we will insist on. The decisive value of P is called the
significance level. We write it as $\alpha$, the Greek letter alpha. If we
choose $\alpha = .05$, we are requiring that the data give evidence against
$H_0$ so strong that it would happen no more than 5% of the time when
$H_0$ is true.
Definition. If the P value is as small or smaller than $\alpha$, we say that
the data are statistically significant at level $\alpha$.
Tests for a Population Mean
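Note. We have an SRS of size $n$ drawn from a normal population
with unknown mean $\mu$. We want to test the hypothesis that $\mu$ has
a specified value. Call the specified value $\mu_0$. The null hypothesis
is $H_0: \mu = \mu_0$. The test is based on the sample mean $\bar{x}$. Because
normal calculations require standardized variables, we will use as our
test statistic the standardized sample mean
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$$
In terms of a standard normal variable $Z$, the P value for a test of $H_0$ against
$H_a: \mu > \mu_0$ is $P(Z \ge z)$
$H_a: \mu < \mu_0$ is $P(Z \le z)$
$H_a: \mu \ne \mu_0$ is $2P(Z \ge |z|)$.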
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{126.07 - 128}{15/\sqrt{72}} = -1.09.$$
Conclusion: More than 27% of the time, an SRS of size 72 from the
general male population would have a mean blood pressure at least as
far from 128 as that of the executive sample. The observed $\bar{x} = 126.07$
is therefore not good evidence that executives differ from other men.
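The "more than 27%" figure is the two-sided P value, twice the upper-tail area beyond $|z|$. A minimal sketch with the numbers from this example ($\bar{x} = 126.07$, $\mu_0 = 128$, $\sigma = 15$, $n = 72$) follows. The name `two_sided_z_test` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(x_bar, mu0, sigma, n):
    """Return (z, P value) for H0: mu = mu0 against Ha: mu != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Both tails count as evidence, so double the area beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_sided_z_test(126.07, mu0=128, sigma=15, n=72)
print(f"z = {z:.2f}, P = {p_value:.3f}")
```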
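$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$$
Reject $H_0$ against
$H_a: \mu > \mu_0$ if $z \ge z^*$
$H_a: \mu < \mu_0$ if $z \le -z^*$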
$$z = \frac{.8404 - .86}{.0068/\sqrt{3}} = -4.99.$$
Step 3: Significance. Because the alternative is two-sided, we compare $|z| = 4.99$ with the $\alpha/2 = .005$ critical value from Table C (and
TM-142). This critical value is $z^* = 2.576$. Figure 5.15 (and TM-93) illustrates the values of $z$ that are statistically significant. Because
$|z| > 2.576$, we reject the null hypothesis and conclude (at the 1%
significance level) that the concentration is not as claimed.
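The fixed-level decision in Step 3 can be sketched as follows. The claimed concentration .86, the mean .8404, $\sigma = .0068$, and $n = 3$ come from the example. The function name `fixed_level_z_test` and its return shape are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def fixed_level_z_test(x_bar, mu0, sigma, n, alpha):
    """Two-sided fixed-level z test: return (z, z*, reject H0?)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Two-sided test: put alpha/2 in each tail to find the critical value.
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_star, abs(z) >= z_star


z, z_star, reject = fixed_level_z_test(0.8404, mu0=0.86, sigma=0.0068,
                                       n=3, alpha=0.01)
print(f"z = {z:.2f}, z* = {z_star:.3f}, reject H0: {reject}")
```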
Note. The P value is the smallest level $\alpha$ at which the data are
significant. Knowing the P value allows us to assess significance at
any level.
Error Probabilities
Example 5.19. The mean diameter of a type of bearing is supposed to
be 2.000 centimeters (cm). The bearing diameters vary normally with
standard deviation $\sigma = .010$ cm. When a lot of the bearings arrives,
the consumer takes an SRS of 5 bearings from the lot and measures
their diameters. The consumer rejects the bearings if the sample mean
diameter is significantly different from 2 at the 5% level. This is a test
of the hypotheses:
$H_0: \mu = 2$
$H_a: \mu \ne 2$.
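The test statistic is
$$z = \frac{\bar{x} - 2}{.01/\sqrt{5}}.$$
The test accepts $H_0$ when
$$-1.96 \le \frac{\bar{x} - 2}{.01/\sqrt{5}} \le 1.96,$$
or, solving for $\bar{x}$, when $1.9912 \le \bar{x} \le 2.0088$.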
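Step 2. Find the probability of accepting $H_0$ assuming that the alternative is true. Take $\mu = 2.015$ and standardize to find the probability:
$$\begin{aligned}
P(\text{Type II error}) &= P(1.9912 \le \bar{x} \le 2.0088) \\
&= P\left(\frac{1.9912 - 2.015}{.01/\sqrt{5}} \le \frac{\bar{x} - 2.015}{.01/\sqrt{5}} \le \frac{2.0088 - 2.015}{.01/\sqrt{5}}\right) \\
&= P(-5.32 \le Z \le -1.39) = .0823.
\end{aligned}$$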
Power
Definition. The probability that a fixed level $\alpha$ significance test will
reject $H_0$ when a particular alternative value of the parameter is true
is called the power of the test against that alternative. The power of
a test against any alternative is 1 minus the probability of a Type II
error for the alternative.
Example. The power of the test performed in the previous example
is $1 - .0823 = .9177$.
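The Type II error and power calculation for the bearing example can be sketched in Python. This assumes the two-sided z test with known $\sigma$ described above. The function name `type_ii_error_two_sided` is hypothetical. Because the sketch carries full precision instead of rounding to table values, its results differ from .0823 and .9177 in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_two_sided(mu0, mu_alt, sigma, n, alpha):
    """P(accept H0: mu = mu0) when the true mean is mu_alt."""
    se = sigma / sqrt(n)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 1: acceptance region for the sample mean under H0.
    low, high = mu0 - z_star * se, mu0 + z_star * se
    # Step 2: probability of landing in that region under the alternative.
    alt_dist = NormalDist(mu=mu_alt, sigma=se)
    return alt_dist.cdf(high) - alt_dist.cdf(low)


beta = type_ii_error_two_sided(mu0=2.0, mu_alt=2.015, sigma=0.010,
                               n=5, alpha=0.05)
power = 1 - beta
print(f"Type II error = {beta:.3f}, power = {power:.3f}")
```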
Different Views of Statistical Tests
Note. The way of thinking about statistical tests called testing hypotheses involves:
1. State $H_0$ and $H_a$ just as in a test of significance. In particular, we
are seeking evidence against $H_0$.
2. Think of the problem as a decision problem, so that the probabilities
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
has the t distribution with $n - 1$ degrees of freedom.
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
where $t^*$ is the upper $(1 - C)/2$ critical value for the $t(n - 1)$ distribution.
This interval is exact when the population distribution is normal and is
approximately correct for large $n$ in other cases. To test the hypothesis
$H_0: \mu = \mu_0$ based on an SRS of size $n$, compute the one-sample t
statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$$
Example 6.3. The National Endowment for the Humanities sponsors summer institutes to improve the skills of high school language
teachers. One institute hosted 20 French teachers for four weeks. At
the beginning of the period, the teachers took the Modern Language
Association's listening test of understanding of spoken French. After
four weeks of immersion in French in and out of class, they took the
listening test again. (The actual spoken French in the two tests was
different, so that simply taking the first test should not improve the
score on the second test.) Table 6.1 (and TM-101) gives the pretest
and posttest scores. The maximum possible score on the test is 36. To
analyze these data, subtract the pretest score from the posttest score
to obtain the improvement for each teacher. These 20 differences form
a single sample. They appear in the "Gain" column in Table 6.1 (TM-101). The first teacher, for example, improved from 32 to 34, so the
gain is $34 - 32 = 2$.
Step 1: Hypothesis. To assess whether the institute significantly
improved the teachers' comprehension of spoken French, we test
$H_0: \mu = 0$
$H_a: \mu > 0$.
Here $\mu$ is the mean improvement that would be achieved if the entire
population of French teachers attended a summer institute. The null
hypothesis says that no improvement occurs, and $H_a$ says that posttest
scores are higher on the average.
Step 2: Test Statistic. The 20 differences have $\bar{x} = 2.5$ and $s = 2.893$. The one-sample t statistic is
$$t = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{2.5 - 0}{2.893/\sqrt{20}} = 3.86.$$
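The t statistic above depends only on the summary values $\bar{x} = 2.5$, $s = 2.893$, and $n = 20$, so it can be checked without the raw scores from Table 6.1. The sketch below is a minimal illustration. The name `one_sample_t` is an assumption. Python's standard library has no t distribution, so the P value would still come from a t table or a statistics package.

```python
from math import sqrt


def one_sample_t(x_bar, s, n, mu0=0.0):
    """Return (t statistic, degrees of freedom) from summary statistics."""
    # s / sqrt(n) estimates the standard deviation of the sample mean.
    standard_error = s / sqrt(n)
    return (x_bar - mu0) / standard_error, n - 1


t, df = one_sample_t(x_bar=2.5, s=2.893, n=20, mu0=0.0)
print(f"t = {t:.2f} with {df} degrees of freedom")
```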
Robustness of t Procedures
Definition. A confidence interval or significance test is called robust if
the confidence level or P value does not change very much when the
assumptions of the procedure are violated.
Note. Use the t procedures when:
Except in the case of small samples, the assumption that the data
are an SRS from the population of interest is more important than
the assumption that the population distribution is normal.
Sample size less than 15. Use t procedures if the data are close to
normal. If the data are clearly nonnormal or if outliers are present,
do not use t.
Sample size at least 15. The t procedures can be used except in
the presence of outliers or strong skewness.
Large Samples. The t procedures can be used even for clearly
skewed distributions when the sample is large, roughly $n \ge 40$.
Example 6.4. Consider several of the data sets we graphed in Chapter
1. Figure 6.6 (and TM-103) shows the histograms.
Figure 6.6(a) is a histogram of the percent of each state's residents
who are over 65 years of age. We have data on the entire population
of 50 states, so formal inference makes no sense.
Figure 6.6(b) shows the time of the first lightning strike each day
in a mountain region in Colorado. The data contain more than 70
observations that have a symmetric distribution. You can use the
t procedures to draw conclusions about the mean time of a day's
Note. Standardize $\hat{p}$ by subtracting its mean and dividing by its standard deviation. The result is a z statistic:
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}.$$
$$\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$
$$\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
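$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
In terms of a standard normal variable $Z$, the P value for a test of $H_0: p = p_0$ against
$H_a: p > p_0$ is $P(Z \ge z)$
$H_a: p < p_0$ is $P(Z \le z)$
$H_a: p \ne p_0$ is $2P(Z \ge |z|)$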
of a head, which is the proportion of all tosses that give a head. The
tosses we actually make are an SRS from this population. The French
naturalist Count Buffon (1707 - 1788) tossed a coin 4040 times. He got
2048 heads. The sample proportion of heads is
$$\hat{p} = \frac{2048}{4040} = .5069.$$
That's a bit more than one-half. Is this evidence that Buffon's coin was
not balanced? This is a job for a significance test.
Step 1: Hypotheses. The null hypothesis says that the coin is balanced ($p = .5$). The alternative hypothesis is two-sided, because we did
not suspect before seeing the data that the coin favored either heads or
tails. We therefore test the hypotheses
$H_0: p = .5$
$H_a: p \ne .5$.
The null hypothesis gives the value $p_0 = .5$.
Step 2: Test Statistic. The z test statistic is
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} = \frac{.5069 - .5}{\sqrt{\dfrac{(.5)(.5)}{4040}}} = .88.$$
would happen 38% of the time when a balanced coin is tossed 4040
times. Buffon's result doesn't show that his coin is unbalanced.
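The z statistic and the 38% figure for Buffon's tosses can be reproduced with a short sketch. It uses the large-sample normal approximation described in these notes. The name `one_proportion_z_test` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Return (p-hat, z, two-sided P value) for H0: p = p0."""
    p_hat = successes / n
    # The standard deviation of p-hat is computed under H0, so it uses p0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


p_hat, z, p_value = one_proportion_z_test(successes=2048, n=4040, p0=0.5)
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, P = {p_value:.2f}")
```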
Note. In Example 7.6, we failed to find good evidence against $H_0: p = .5$. We cannot conclude that $H_0$ is true, that is, that the coin is perfectly
balanced. No doubt $p$ is not exactly 0.5. The test of significance only
shows that the results of Buffon's 4040 tosses can't distinguish this
coin from one that is perfectly balanced. To see what values of $p$ are
consistent with the sample results, use a confidence interval.
Example 7.7. The 95% confidence interval for the probability $p$ that
Buffon's coin gives a head is
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = .5069 \pm 1.960 \sqrt{\frac{(.5069)(.4931)}{4040}} = (.4915, .5223).$$
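A minimal Python sketch of this interval, under the same large-sample assumption, is given here. The function name `proportion_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Large-sample z interval (low, high) for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The standard error uses p-hat because the true p is unknown.
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_confidence_interval(successes=2048, n=4040,
                                           confidence=0.95)
print(f"({low:.4f}, {high:.4f})")
```

The interval contains .5, which agrees with the test's failure to reject $H_0: p = .5$.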
$$n = \left(\frac{z^*}{m}\right)^2 p^*(1 - p^*)$$
Example 7.8. Gloria Chavez and Ronald Flynn are candidates for
mayor in a large city. You are planning a sample survey to determine
what percent of the voters plan to vote for Chavez. This is a population
proportion $p$. You will contact an SRS of registered voters in the city.
You want to estimate $p$ with 95% confidence and a margin of error
no greater than 3%, or 0.03. How large a sample do you need? The
winner's share in all but the most lopsided elections is between 30%
and 70% of the vote. So use the guess $p^* = .5$. The sample size you
need is
$$n = \left(\frac{1.96}{.03}\right)^2 (.5)(1 - .5) = 1067.1.$$
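You should round the result up to $n = 1068$. If you want a 2.5% margin
of error, we have (after rounding)
$$n = \left(\frac{1.96}{.025}\right)^2 (.5)(1 - .5) = 1537,$$
and for a 2% margin of error,
$$n = \left(\frac{1.96}{.02}\right)^2 (.5)(1 - .5) = 2401.$$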
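The sample sizes for margins of error of 3%, 2.5%, and 2% can be computed with one small helper. This is a sketch under the example's assumptions (95% confidence, guess $p^* = .5$). The name `sample_size_for_proportion` is made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p_guess=0.5):
    """Smallest n whose large-sample margin of error is at most `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    # Always round up so the achieved margin does not exceed the target.
    return ceil(n_exact)


for m in (0.03, 0.025, 0.02):
    print(m, sample_size_for_proportion(m, confidence=0.95))
```

Using `p_guess=0.5` is the conservative choice, because $p^*(1 - p^*)$ is largest at .5 and so gives the largest required sample.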