CHAPTER 6
The Normal Distribution
6-1

What Is Normal?
Medical researchers have determined so-called normal intervals for a person’s blood pressure, cholesterol, triglycerides, and the like. For example, the normal range of systolic blood pressure is 110 to 140. The normal interval for a person’s triglycerides is from 30 to 200 milligrams per deciliter (mg/dl). By measuring these variables, a physician can determine if a patient’s vital statistics are within the normal interval or if some type of treatment is needed to correct a condition and avoid future illnesses. The question then is, how does one determine the so-called normal intervals?
In this chapter, you will learn how researchers determine normal intervals for specific medical tests by using the normal distribution. You will see how the same methods are used to determine the lifetimes of batteries, the strength of ropes, and many other traits.
6-2
© Copyright McGraw-Hill 2004

Objectives
• Identify distributions as symmetrical or skewed.
• Identify the properties of the normal distribution.
• Find the area under the standard normal distribution, given various z values.
• Find the probabilities for a normally distributed variable by transforming it into a standard normal variable.
6-3

Objectives (cont’d.)
• Find specific data values for given percentages using the standard normal distribution.
• Use the central limit theorem to solve problems involving sample means for large samples.
• Use the normal approximation to compute probabilities for a binomial variable.
6-4
Introduction
• The graphical form of the probability distribution of a continuous random variable x is a smooth curve that might appear as shown in the figure below. This curve, a function of x, is denoted by the symbol f(x) and is variously called a probability density function, a frequency distribution, or a probability distribution.
6-5

Introduction
• The areas under a probability distribution correspond to probabilities for x. For example, the area A beneath the curve between the two points a and b, as shown below, is the probability that x assumes a value between a and b (a<x<b). Because a single point has zero area, including the endpoints does not change the probability: P(a≤x≤b) = P(a<x<b) = A.
6-6

The Uniform Distribution
• Continuous random variables that appear to have equally likely outcomes over their range of possible values possess a uniform probability distribution, sometimes referred to as the randomness distribution.
$$f(x) = \frac{1}{d - c} \quad (c \le x \le d)$$
$$\mu = \frac{c + d}{2}$$
$$\sigma = \frac{d - c}{\sqrt{12}}$$
6-7

The Normal Distribution
• Many discrete or continuous variables have distributions that are bell-shaped and are called approximately normally distributed variables.
• A normal distribution is also known as the bell curve or the Gaussian distribution.
6-8
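As a rough illustration of the uniform-distribution formulas above, the following Python sketch evaluates the density, mean, and standard deviation for given endpoints. The function name `uniform_summary` and the endpoints 0 and 12 are illustrative assumptions, not part of the original slides.

```python
import math


def uniform_summary(c, d):
    """Return (density, mean, standard deviation) of a uniform distribution on [c, d]."""
    if d <= c:
        raise ValueError("d must be greater than c")
    density = 1.0 / (d - c)             # f(x) = 1 / (d - c) for c <= x <= d
    mean = (c + d) / 2.0                # mu = (c + d) / 2
    std_dev = (d - c) / math.sqrt(12)   # sigma = (d - c) / sqrt(12)
    return density, mean, std_dev


# Hypothetical endpoints chosen only for illustration.
density, mean, std_dev = uniform_summary(0.0, 12.0)
print(f"f(x) = {density:.4f}, mean = {mean:.1f}, sd = {std_dev:.3f}")
```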
An Example
If a researcher selects a random sample of 100 adult women, measures their heights, and constructs a histogram, the researcher gets a graph similar to the one shown in (a). As the researcher increases the sample size and decreases the width of classes, the histogram will look like the ones shown in (b) and (c). Finally, if it were possible to measure exactly the heights of all adult women on Earth, the histogram would approach the normal distribution.
6-9

Normal and Skewed Distributions
• The normal distribution is a continuous, bell-shaped distribution of a variable.
• If the data values are evenly distributed about the mean, the distribution is said to be symmetrical.
• If the majority of the data values fall to the left or right of the mean, the distribution is said to be skewed.
6-10

Left Skewed Distributions
• When the majority of the data values fall to the right of the mean, the distribution is said to be negatively or left skewed. The mean is to the left of the median, and the mean and the median are to the left of the mode.
6-11

Right Skewed Distributions
• When the majority of the data values fall to the left of the mean, the distribution is said to be positively or right skewed. The mean falls to the right of the median, and both the mean and the median fall to the right of the mode.
6-12
Equation for a Normal Distribution
• The mathematical equation for the normal distribution is:
$$y = \frac{e^{-(X - \mu)^2 / (2\sigma^2)}}{\sigma\sqrt{2\pi}}$$
where
e ≈ 2.718
π ≈ 3.14
µ = population mean
σ = population standard deviation
6-13

Some Examples
The above figure shows two normal distributions with the same mean but different standard deviations. The larger the standard deviation, the more dispersed, or spread out, the distribution is.
6-14
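The equation for the normal curve can be evaluated directly. The sketch below is only an illustration; the function name `normal_pdf` and the parameter values are assumptions made here. It computes the height of the curve at the mean for two standard deviations and shows that the curve with the larger standard deviation is lower and therefore more spread out.

```python
import math


def normal_pdf(x, mu, sigma):
    """Height y of the normal curve with mean mu and standard deviation sigma at x."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


# Same mean, different standard deviations (illustrative values only).
narrow_peak = normal_pdf(0.0, mu=0.0, sigma=1.0)
wide_peak = normal_pdf(0.0, mu=0.0, sigma=2.0)
print(f"peak with sigma=1: {narrow_peak:.4f}, peak with sigma=2: {wide_peak:.4f}")
```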

Some Examples (cont’d)
The above figure shows two normal distributions with the same standard deviation but located at different positions on the x axis, i.e., with different means.
6-15

Some Examples (cont’d)
The above figure shows two normal distributions with different means and different standard deviations.
6-16
Properties of the Normal Distribution
• The shape and position of the normal distribution curve depend on two parameters, the mean and the standard deviation.
• Each normally distributed variable has its own normal distribution curve, which depends on the values of the variable’s mean and standard deviation.
6-17

Normal Distribution Properties
• The normal distribution curve is bell-shaped.
• The mean, median, and mode are equal and located at the center of the distribution.
• The normal distribution curve is unimodal (i.e., it has only one mode).
• The curve is symmetrical about the mean, which is equivalent to saying that its shape is the same on both sides of a vertical line passing through the center.
6-18

Normal Distribution Properties (cont’d.)
• The curve is continuous, i.e., there are no gaps or holes. For each value of X, there is a corresponding value of Y.
• The curve never touches the x axis. Theoretically, no matter how far in either direction the curve extends, it never meets the x axis, but it gets increasingly closer.
6-19

Normal Distribution Properties (cont’d.)
• The total area under the normal distribution curve is equal to 1.00 or 100%.
• The area under the normal curve that lies within one standard deviation of the mean is approximately 0.68, or 68%; within two standard deviations, about 0.95, or 95%; and within three standard deviations, about 0.997, or 99.7%.
6-20
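The 68%, 95%, and 99.7% figures can be checked numerically. The following sketch uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is an assumption made here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

Because every normal curve can be rescaled to the standard one, the same areas hold for any mean and standard deviation.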
Visualisation of the Last Property
[Figure: areas under the normal curve within one, two, and three standard deviations of the mean]
6-21

Standard Normal Distribution
• Since each normally distributed variable has its own mean and standard deviation, the shape and location of these curves will vary. In practical applications, one would have to have a table of areas under the curve for each variable. To simplify this, statisticians use the standard normal distribution.
• The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1:
$$y = \frac{e^{-X^2 / 2}}{\sqrt{2\pi}}$$
6-22

Visualisation of Standard Normal Distribution
The area under the curve is more important than the frequencies because we are talking about the continuous case and the area corresponds to the probability!
6-23

z Values
• All normally distributed variables can be transformed into the standard normally distributed variable by using z values.
• The z value is the number of standard deviations that a particular X value is away from the mean. The formula for finding the z value is:
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad \text{or} \quad z = \frac{X - \mu}{\sigma}$$
6-24
Table of Standard Normal Distribution
[Table: areas under the standard normal distribution between 0 and z]
6-25

How to Use the Table?
The table on the previous slide gives the area between 0 and any z value to the right of 0. To find such an area, one need only look up the z value in the table. In order to find the area between 0 and, say, 0.54, first one needs to find 0.5 in the left column and then 0.04 in the top row. The value where the column and row meet in the table is the answer, 0.2054, or 20.54%.
6-26
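Where no printed table is at hand, the same lookup can be approximated in software. This is a minimal sketch, assuming Python's standard `statistics.NormalDist`; the helper name `area_0_to_z` is hypothetical. The table entry is the cumulative area to the left of z minus the 0.5 that lies to the left of 0.

```python
from statistics import NormalDist


def area_0_to_z(z):
    """Area under the standard normal curve between 0 and a z value (either sign)."""
    return abs(NormalDist().cdf(z) - 0.5)


# Row 0.5, column 0.04 of the table corresponds to z = 0.54.
print(f"{area_0_to_z(0.54):.4f}")
```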

Type 1: Area Between 0 and z
• To find the area between 0 and any z value: Look up the z value in the table.
[Figure: normal curve with 0 and z marked]
6-27

Type 2: Area in Any Tail
• Look up the z value to get the area.
• Subtract the area from 0.5000.
[Figure: normal curve with –z and 0 marked]
6-28
Type 3: Area Between Two z Values
• Look up both z values to get the areas.
• Subtract the smaller area from the larger area.
[Figure: normal curve with 0, z1, and z2 marked]
6-29

Type 4: Area Between z Values—Opposite Sides
• Look up both z values to get the areas.
• Add the areas.
[Figure: normal curve with –z1, 0, and z2 marked]
6-30

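Type 5: Area To the Left of Any z Value
(Here z lies to the right of the mean, so the area includes the entire left half of the curve.)
• Look up the z value to get the area.
• Add 0.5000 to the area.
[Figure: normal curve with 0 and z marked]
6-31

Type 6: Area To the Right of Any z Value
(Here z lies to the left of the mean, so the area includes the entire right half of the curve.)
• Look up the z value in the table to get the area.
• Add 0.5000 to the area.
[Figure: normal curve with –z and 0 marked]
6-32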
Type 7: Area in Both Tails
[Figure: area in both tails of the normal curve]
6-33

Some Examples
As stated earlier, the area under the curve corresponds to a probability. In order to compute the probability of the variable falling between any two values (continuous case!), one only needs to figure out the area between them. For probabilities, a special notation is used. For example, if the problem is to find the probability of any z value between 0 and 2.32, this probability is written as P(0< z <2.32).

Find the probability for each.
a. P(0< z <2.32)
b. P(z <1.65)
c. P(z >1.91)
6-34
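The three probabilities in exercises a to c can be checked against the table with a short sketch. It assumes Python's `statistics.NormalDist` as a stand-in for the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

z_dist = NormalDist()  # standard normal: mean 0, standard deviation 1

p_a = z_dist.cdf(2.32) - z_dist.cdf(0.0)   # a. area between 0 and 2.32
p_b = z_dist.cdf(1.65)                     # b. area to the left of 1.65
p_c = 1.0 - z_dist.cdf(1.91)               # c. area in the right tail beyond 1.91
print(f"a. {p_a:.4f}  b. {p_b:.4f}  c. {p_c:.4f}")
```

With the table, the same answers come from looking up 2.32 directly (a), adding 0.5 to the area for 1.65 (b), and subtracting the area for 1.91 from 0.5 (c).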

Some Examples (cont’d)
Find the z value such that the area under the normal distribution curve between 0 and the z value is 0.2123.
Solution:
Find the area in the table. Then read the correct z value in the left column as 0.5 and in the top row as 0.06, and add these two values to get 0.56.
6-35

Applications of the Normal Distribution
The standard normal distribution curve can be used to solve a wide variety of practical problems. The only requirement is that the variable be normally or approximately normally distributed.
There are several mathematical tests to determine whether a variable is normally distributed. However, the discussion of the details is beyond the scope of this subject. For all the problems in this chapter, one can assume that the variable is normally or approximately normally distributed.
Now we show the wide use of the normal distribution with a few examples.
6-36
Some Examples
1. The mean number of hours an American worker spends on the computer is 3.1 hours per workday. Assume the standard deviation is 0.5 hour. Find the percentage of workers who spend less than 3.5 hours on the computer. Assume the variable is normally distributed. (Source: USA Today)

Solution:
STEP 1: Draw the figure and represent the area as shown below.
6-37

Some Examples (cont’d)
STEP 2: Find the z value corresponding to 3.5.
z = (X - µ)/σ = (3.5 - 3.1)/0.5 = 0.80
Hence, 3.5 is 0.8 standard deviation above the mean of 3.1, as shown for the z distribution below.
STEP 3: Find the area using the Table of Normal Distribution.
The area between 0 and 0.8 is 0.2881. Since the area under the curve to the left of z = 0.8 is desired, add 0.5 to 0.2881 (0.5 + 0.2881 = 0.7881). Therefore, 78.81% of the workers spend less than 3.5 hours per workday on the computer.
6-38
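The steps of Example 1 can be reproduced as a quick check. This is a sketch using Python's `statistics.NormalDist` in place of the table; the variable names are assumptions.

```python
from statistics import NormalDist

mu, sigma, x = 3.1, 0.5, 3.5                 # values given in Example 1

z = (x - mu) / sigma                         # STEP 2: z = (X - mu) / sigma
share_below = NormalDist(mu, sigma).cdf(x)   # STEP 3: area to the left of X
print(f"z = {z:.2f}, P(X < {x}) = {share_below:.4f}")
```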

Some Examples (cont’d)
2. Each month, an American household generates an average of 28 pounds of newspaper for garbage or recycling. Assume the standard deviation is 2 pounds. If a household is selected at random, find the probability of its generating
a. Between 27 and 31 pounds per month.
b. More than 30.2 pounds per month.
Assume the variable is approximately normally distributed.

Solution a:
STEP 1: Find the two z values.
z1 = (X - µ)/σ = (27-28)/2 = -0.5
z2 = (X - µ)/σ = (31-28)/2 = 1.5
6-39

Some Examples (cont’d)
STEP 2: Find the appropriate area, using the table. The area between z = 0 and z = -0.5 is 0.1915. The area between z = 0 and z = 1.5 is 0.4332. Add 0.1915 and 0.4332 (0.1915 + 0.4332 = 0.6247). Thus the total area is 62.47%.

Solution b:
STEP 1: Find the z value.
z = (X - µ)/σ = (30.2-28)/2 = 1.1
6-40
Some Examples (cont’d)
STEP 2: Find the appropriate area, using the table. The area between z = 0 and z = 1.1 is 0.3643. Since the desired area is in the right tail, subtract 0.3643 from 0.5:
0.5 - 0.3643 = 0.1357
Hence, the probability that a randomly selected household will accumulate more than 30.2 pounds of newspaper is 0.1357, or 13.57%.
[Figure: z distribution with 0 and 1.1 marked]
6-41

Some Examples (cont’d)
3. To qualify for a police academy, candidates must score in the top 10% on a general ability test. The test has a mean of 200 and a standard deviation of 20. Find the lowest possible score to qualify. Assume the test scores are normally distributed.

Solution:
Since the test scores are normally distributed, the test value (X) that cuts off the upper 10% of the area under the normal distribution curve is desired. This area is shown below.
6-42
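Both parts of Example 2 can be cross-checked with a few lines. The sketch below assumes Python's `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

newspaper = NormalDist(mu=28, sigma=2)   # pounds per month, from Example 2

# a. Area between 27 and 31 pounds (z from -0.5 to 1.5).
p_between = newspaper.cdf(31) - newspaper.cdf(27)

# b. Right-tail area beyond 30.2 pounds (z = 1.1).
p_more = 1.0 - newspaper.cdf(30.2)

print(f"a. {p_between:.4f}  b. {p_more:.4f}")
```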

Some Examples (cont’d)
STEP 1: Subtract 0.1 from 0.5 to get the area under the normal distribution between 200 and X: 0.5 - 0.1 = 0.4.
STEP 2: Find the z value that corresponds to an area of 0.4 by looking up 0.4 in the area portion of the table. If the specific value cannot be found, use the closest value, in this case 0.3997. The corresponding z value is 1.28. (If the area falls exactly halfway between two z values, use the larger of the two z values. For example, the area 0.4500 falls halfway between 0.4495 and 0.4505. In this case use 1.65 rather than 1.64 for the z value.)
STEP 3: Substitute in the formula z = (X - µ)/σ and solve for X.
1.28 = (X - 200)/20, so X = 200 + (1.28)(20) = 225.6, which rounds to 226.
A score of 226 should be used as the cutoff.
6-43

Distribution of Sample Means
• A sampling distribution of sample means is a distribution obtained by using the means computed from random samples of a specific size taken from a population.
• Sampling error is the difference between the sample measure and the corresponding population measure due to the fact that the sample is not a perfect representation of the population.
6-44
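Example 3 runs the table lookup in reverse: an area is known and the data value is wanted. A minimal sketch, assuming the `inv_cdf` method of Python's `statistics.NormalDist` as a substitute for searching the table; the variable names are illustrative.

```python
import math
from statistics import NormalDist

scores = NormalDist(mu=200, sigma=20)   # test scores from Example 3

# The top 10% begins where 90% of the area lies to the left.
cutoff = scores.inv_cdf(0.90)
lowest_score = math.ceil(cutoff)        # whole-number score at or above the cutoff
print(f"cutoff = {cutoff:.2f}, lowest qualifying score = {lowest_score}")
```

The software value differs slightly from the hand calculation because the table rounds z to two decimal places.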
Properties of Distribution of Sample Means
• The mean of the sample means will be the same as the population mean.
• The standard deviation of the sample means will be smaller than the standard deviation of the population, and will be equal to the population standard deviation divided by the square root of the sample size.
6-45

An Example
The following example illustrates the above two properties. Suppose a professor gave an 8-point quiz to a small class of four students. The results of the quiz were 2, 6, 4, and 8. For the sake of discussion, assume that the four students constitute the population. The mean of the population is
µ = (2+6+4+8)/4 = 5
The standard deviation of the population is
$$\sigma = \sqrt{\frac{(2-5)^2 + (6-5)^2 + (4-5)^2 + (8-5)^2}{4}} = 2.236$$
The graph of the original distribution is shown below. This is called a uniform distribution.
6-46

An Example (cont’d)
Now, if all samples of size 2 are taken with replacement and the mean of each sample is found, the distribution is shown in (a).

(a)
Sample   Mean     Sample   Mean
2, 2     2        6, 2     4
2, 4     3        6, 4     5
2, 6     4        6, 6     6
2, 8     5        6, 8     7
4, 2     3        8, 2     5
4, 4     4        8, 4     6
4, 6     5        8, 6     7
4, 8     6        8, 8     8

A frequency distribution of sample means is in (b).

(b)
Sample Mean   Frequency
2             1
3             2
4             3
5             4
6             3
7             2
8             1
6-47

An Example (cont’d)
For the data from the example under discussion, the figure below gives the graph of the sample means. The histogram appears to be approximately normal.
The mean of the sample means, denoted by $\mu_{\bar{X}}$, is
$$\mu_{\bar{X}} = \frac{2 \cdot 1 + 3 \cdot 2 + \dots + 8 \cdot 1}{16} = \frac{80}{16} = 5$$
6-48
An Example (cont’d)
which is the same as the population mean. Hence
$$\mu_{\bar{X}} = \mu$$
The standard deviation of sample means, denoted by $\sigma_{\bar{X}}$, is
$$\sigma_{\bar{X}} = \sqrt{\frac{(2-5)^2 \cdot 1 + (3-5)^2 \cdot 2 + \dots + (8-5)^2 \cdot 1}{16}} = 1.581$$
which is the same as the population standard deviation divided by the square root of 2:
$$\sigma_{\bar{X}} = \frac{2.236}{\sqrt{2}} = 1.581$$
(Note: Rounding rules were not used here in order to show that the answers coincide.)
6-49

The Central Limit Theorem
• As the sample size n increases, the shape of the distribution of the sample means taken with replacement from a population with mean µ and standard deviation σ will approach a normal distribution.
6-50
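The quiz example is small enough to enumerate completely. The following sketch lists all 16 samples of size 2 drawn with replacement and confirms both properties of the sample means; it uses only the Python standard library, and the variable names are assumptions.

```python
import math
from itertools import product
from statistics import fmean, pstdev

population = [2, 6, 4, 8]   # quiz scores treated as the whole population

# All ordered samples of size 2 taken with replacement, and their means.
sample_means = [fmean(sample) for sample in product(population, repeat=2)]

mu = fmean(population)                  # population mean
sigma = pstdev(population)              # population standard deviation
mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)

print(len(sample_means), mu, mean_of_means)
print(f"{sigma:.3f} {sd_of_means:.3f} {sigma / math.sqrt(2):.3f}")
```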

Central Limit Theorem (cont’d.)
• If all possible samples of size n are taken with replacement from the same population, the mean of the sample means equals the population mean, or: $\mu_{\bar{X}} = \mu$.
• The standard deviation of the sample means equals $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ and is called the standard error of the mean.
6-51

Central Limit Theorem (cont’d.)
• The central limit theorem can be used to answer questions about sample means in the same manner that the normal distribution can be used to answer questions about individual values.
• A new formula must be used for the z values:
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
6-52
Central Limit Theorem (cont’d.)
If a large number of samples of a given size are selected from a normally distributed population, or if a large number of samples of a given size that is greater than or equal to 30 are selected from a population which is not normally distributed, and the sample means are computed, then the distribution of sample means will look like the one shown below.
6-53

Central Limit Theorem (cont’d.)
It is important to remember two things when you use the central limit theorem:
1. When the original variable is normally distributed, the distribution of the sample means will be normally distributed, for any sample size n.
2. When the distribution of the original variable might not be normal, a sample size of 30 or more is needed to use the normal distribution to approximate the distribution of the sample means. The larger the sample, the better the approximation will be.
6-54

An Example
A. C. Nielsen reported that children between the ages of 2 and 5 watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children between the ages of 2 and 5 are randomly selected, find the probability that the mean of the number of hours they watch television will be greater than 26.3 hours.

Solution:
Since the variable is approximately normally distributed, the distribution of sample means will be approximately normal, with a mean of 25. The standard deviation of the sample means is
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} = 0.671$$
6-55

An Example (cont’d)
The distribution of the means is shown below, with the appropriate area shaded.
The z value is
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{26.3 - 25}{3 / \sqrt{20}} = \frac{1.3}{0.671} = 1.94$$
The area between 0 and 1.94 is 0.4738. Since the desired area is in the tail, subtract 0.4738 from 0.5. Hence 0.5 - 0.4738 = 0.0262, or 2.62%.
6-56
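The television example can be reproduced as a check on the arithmetic. This is a sketch assuming Python's `statistics.NormalDist`; the variable names are illustrative. Because software does not round z to two decimals, the result agrees with the table-based 0.0262 only to about three decimal places.

```python
import math
from statistics import NormalDist

mu, sigma, n, x_bar = 25, 3, 20, 26.3    # values given in the example

std_error = sigma / math.sqrt(n)         # standard error of the mean
z = (x_bar - mu) / std_error             # z value for the sample mean
p_greater = 1.0 - NormalDist().cdf(z)    # right-tail area beyond z

print(f"standard error = {std_error:.3f}, z = {z:.2f}, P = {p_greater:.4f}")
```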
Finite Population Correction Factor
• The formula for standard error of the mean is accurate when the samples are drawn with replacement or are drawn without replacement from a very large or infinite population.
• A correction factor is necessary for computing the standard error of the mean for samples drawn without replacement from a finite population.
6-57

Finite Population Correction Factor
• The correction factor is computed using the following formula:
$$\sqrt{\frac{N - n}{N - 1}}$$
where N is the population size and n is the sample size.
6-58

Correction Factor Applied to Standard Error
• The standard error of the mean must be multiplied by the correction factor to adjust it for large samples taken from a small population.
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N - n}{N - 1}}$$
6-59

Correction Factor Applied to z Value
• The standard error for the mean must be adjusted when it is included in the formula for calculating the z values.
$$z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \cdot \sqrt{\dfrac{N - n}{N - 1}}}$$
6-60
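To see how much the correction factor changes the standard error, the formulas can be sketched as two small functions. The function names and the numbers in the usage line (σ = 3, n = 20, N = 100) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math


def corrected_standard_error(sigma, n, population_size):
    """Standard error of the mean with the finite population correction factor."""
    if not 1 <= n < population_size:
        raise ValueError("need 1 <= n < population_size")
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return (sigma / math.sqrt(n)) * correction


def corrected_z(x_bar, mu, sigma, n, population_size):
    """z value for a sample mean, using the corrected standard error."""
    return (x_bar - mu) / corrected_standard_error(sigma, n, population_size)


# Hypothetical numbers: sampling 20 items without replacement from 100.
uncorrected = 3 / math.sqrt(20)
corrected = corrected_standard_error(3, 20, 100)
print(f"uncorrected = {uncorrected:.4f}, corrected = {corrected:.4f}")
```

When N is very large relative to n, the factor is close to 1 and the correction has almost no effect, which matches the first slide's remark about very large or infinite populations.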
Normal Approximation
The normal distribution is often used to solve problems that involve the binomial distribution since, when n is large (say, 100), the calculations are too difficult to do by hand using the binomial distribution.
As a rule of thumb, statisticians generally agree that the normal distribution should be used only when n*p and n*q are both greater than or equal to 5. (Note: q=1-p.) For example, if p is 0.3 and n is 10, then n*p = 3, which is less than 5, so the normal distribution should not be used as an approximation. On the other hand, if p = 0.5 and n = 10, then n*p = 5 and n*q = 5, so the normal distribution can be used as an approximation. See the figure in the next slide.
6-61

Normal Approximation (cont’d)
[Figure referred to on the previous slide]
6-62
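The rule of thumb is easy to express as a check. A minimal sketch follows; the function name `normal_approximation_ok` is an assumption made here.

```python
def normal_approximation_ok(n, p):
    """Rule of thumb: both n*p and n*q must be at least 5, where q = 1 - p."""
    q = 1.0 - p
    return n * p >= 5 and n * q >= 5


print(normal_approximation_ok(10, 0.3))   # n*p = 3, so the rule is not met
print(normal_approximation_ok(10, 0.5))   # n*p = n*q = 5, so the rule is met
```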

A Correction for Continuity

Binomial (when finding)   Normal (use)
P(X = a)                  P(a − 0.5 < X < a + 0.5)
P(X ≥ a)                  P(X > a − 0.5)
P(X > a)                  P(X > a + 0.5)
P(X ≤ a)                  P(X < a + 0.5)
P(X < a)                  P(X < a − 0.5)
6-63

Procedure for Normal Approximation
• Step 1 Check to see whether the normal approximation can be used.
• Step 2 Find the mean µ and the standard deviation σ.
• Step 3 Write the problem in probability notation, using X.
6-64
Procedure for Normal Approximation (cont’d.)
• Step 4 Rewrite the problem using the continuity correction factor, and show the corresponding area under the normal distribution.
• Step 5 Find the corresponding z values.
• Step 6 Find the solution.
The formulas for the mean and standard deviation of the binomial distribution are necessary for calculations. They are
$$\mu = np \quad \text{and} \quad \sigma = \sqrt{npq}$$
6-65

An Example
A magazine reported that 6% of American drivers read the newspaper while driving. If 300 drivers are selected at random, find the probability that exactly 25 say they read the newspaper while driving. (Source: USA TODAY)

Solution:
Here, p = 0.06, q = 0.94, and n = 300.
STEP 1: Check to see whether the normal approximation can be used.
np = (300)(0.06) = 18 nq = (300)(0.94) = 282
Since both results are greater than 5, the normal distribution can be used.
6-66

An Example (cont’d)
STEP 2: Find the mean and standard deviation.
$$\mu = np = (300)(0.06) = 18$$
$$\sigma = \sqrt{npq} = \sqrt{(300)(0.06)(0.94)} = 4.11$$
STEP 3: Write the problem in probability notation: P(X = 25).
STEP 4: Rewrite the problem by using the continuity correction factor: P(24.5<X <25.5).
6-67

An Example (cont’d)
STEP 5: Find the corresponding z values. Since 25 represents any value between 24.5 and 25.5, find both z values.
$$z_1 = \frac{25.5 - 18}{4.11} = 1.82$$
$$z_2 = \frac{24.5 - 18}{4.11} = 1.58$$
STEP 6: Find the solution. Find the corresponding areas in the table: the area for z = 1.82 is 0.4656, and the area for z = 1.58 is 0.4429. Subtract the areas to get the approximate value:
0.4656 – 0.4429 = 0.0227
Hence, the probability that exactly 25 people read the newspaper while driving is 2.27%.
6-68
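The six steps of the newspaper example can be sketched in code, with the exact binomial probability computed alongside for comparison. This is only an illustration using the Python standard library; the variable names are assumptions, and the exact value is a cross-check that the slides do not report.

```python
import math
from statistics import NormalDist

n, p, k = 300, 0.06, 25
q = 1.0 - p

# STEP 1: rule-of-thumb check.
assert n * p >= 5 and n * q >= 5

# STEP 2: mean and standard deviation of the binomial variable.
mu = n * p
sigma = math.sqrt(n * p * q)

# STEPS 3-6: P(X = 25) becomes P(24.5 < X < 25.5) under the normal curve.
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

# Exact binomial probability, shown only for comparison.
exact = math.comb(n, k) * p**k * q**(n - k)

print(f"normal approximation = {approx:.4f}, exact binomial = {exact:.4f}")
```

The approximation differs slightly from the hand-computed 0.0227 because the table rounds each z value to two decimals.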
Summary
• The normal distribution can be used to describe a variety of variables, such as heights, weights, and temperatures.
• The normal distribution is bell-shaped, unimodal, symmetric, and continuous; its mean, median, and mode are equal.
• Mathematicians use the standard normal distribution, which has a mean of 0 and a standard deviation of 1.
6-69

Summary (cont’d.)
• The normal distribution can be used to describe a sampling distribution of sample means.
• These samples must be of the same size and randomly selected with replacement from the population.
• The central limit theorem states that as the size of the samples increases, the distribution of sample means will be approximately normal.
6-70

Summary (cont’d.)
• The normal distribution can be used to approximate other distributions, such as the binomial distribution.
• For the normal distribution to be used as an approximation to the binomial distribution, the conditions np ≥ 5 and nq ≥ 5 must be met.
• A correction for continuity may be used for more accurate results.
6-71

Conclusions
• The normal distribution can be used to approximate other distributions to simplify the data analysis for a variety of applications.
6-72
