
Introduction to Continuous Probability Distributions

In this lesson, we will discuss two continuous probability distributions: the uniform
distribution and the normal distribution. The normal distribution is particularly
important because many of the methods used in statistics are based on this distribution.
Uniform Distribution
We learned that a continuous random variable has a set of possible values that is an
interval on the number line. It is not possible to assign a probability to each point in
the interval. Instead, the probability distribution of a continuous random variable X is
specified by a mathematical function f(x) called the probability density function or
just density function. The graph of a density function is a smooth curve. A probability
density function (pdf) must satisfy two conditions: (1) f(x) ≥ 0 for all real values of x
and (2) the total area under the density curve is equal to 1. The graphs of three density
functions are shown in Figure 11.1.
The probability that X lies in any particular interval is shown by the area under the
density curve and above the interval. The following three events are frequently
encountered: (1) X < a, the event that the random variable X assumes a value less than
a; (2) a < X < b, the event that the random variable X assumes a value between a and
b; and (3) X > b, the event that the random variable X is greater than b. We say that
we are interested in the lower tail probability for (1) and the upper tail probability
when using (3). The areas associated with each of these are shown in Figure 11.2.

Notice that the probability that a < X < b may be computed using tail probabilities:
P(a < X < b) = P(X < b) − P(X < a).
If the random variable X is equally likely to assume any value in an interval (a, b),
then X is a uniform random variable. The pdf is flat and is above the x-axis between a
and b, and it is 0 outside of the interval. The height of the curve must be such that the
area under the density and above the x-axis is 1. Because this region is a rectangle, the
area is the height times the width of the interval, which is b − a. Thus, the height must
be 1/(b − a); that is, the pdf of a uniform random variable has the form

f(x) = 1/(b − a), for a < x < b
     = 0, otherwise.
A graph of the pdf is shown in Figure 11.3.
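The rectangle argument above can be sketched in a few lines of Python. This is only an illustration: the function names uniform_pdf and uniform_prob and the interval (2, 6) are assumptions chosen for the example, not part of the lesson.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of a uniform random variable on (a, b): 1/(b - a) inside, 0 outside."""
    if b <= a:
        raise ValueError("b must be greater than a")
    return 1.0 / (b - a) if a < x < b else 0.0


def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo < X < hi): the height 1/(b - a) times the width of the overlap with (a, b)."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)


# Hypothetical interval (2, 6): the height is 1/4 and the total area is 1.
a, b = 2.0, 6.0
height = uniform_pdf(3.0, a, b)
total_area = height * (b - a)
print(height, total_area, uniform_prob(3.0, 5.0, a, b))  # 0.25 1.0 0.5
```

Because the density is flat, every probability reduces to a width divided by b − a, which is why no table is needed for the uniform distribution.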

Example
A group of volcanologists (people who study volcanoes) has been monitoring a
volcano's seismicity, or the frequency and distribution of underlying earthquakes.
Based on these readings, they believe that the volcano will erupt within the next 24
hours, but the eruption is equally likely to occur any time within that period. What is
the probability that it will erupt within the next eight hours?

Solution
Define X = the time until the eruption of the volcano. X has positive probability over
the interval (0,24) because the volcano will erupt during that time interval. Because
the length of the interval is 24 − 0 = 24, the height of the density curve must be 1/24 for
the area under the density and above the x-axis to be one. That is, the pdf is

f(x) = 1/24, for 0 < x < 24
     = 0, otherwise.
The probability that the volcano will erupt within the next eight hours is equal to the
area under the curve and above the interval (0,8) as shown in Figure 11.4. This area is
8 × (1/24) = 1/3.
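As a rough check on the volcano example, the sketch below computes the rectangle area directly and compares it with a simple simulation. The seed, the sample size, and the variable names are arbitrary assumptions made for illustration.

```python
import random

a, b = 0.0, 24.0               # eruption time is uniform on (0, 24) hours
height = 1.0 / (b - a)         # height of the density
exact = height * (8.0 - 0.0)   # area of the rectangle above (0, 8)

# Simulation: fraction of simulated eruption times that fall below 8 hours.
rng = random.Random(0)         # fixed seed (arbitrary) so the run is repeatable
n = 100_000
hits = sum(1 for _ in range(n) if rng.uniform(a, b) < 8.0)
estimate = hits / n

print(f"exact = {exact:.4f}, simulated = {estimate:.4f}")
```

The simulated fraction should land close to the exact area, with the gap shrinking as n grows.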

In the previous example, notice that the area is the same whether we have P(0 < X <
8) or P(0 ≤ X < 8) or P(0 < X ≤ 8) or P(0 ≤ X ≤ 8). Unlike for discrete random variables,
the probability for a continuous random variable is the same whether or not the
inequality is strict. This also implies that, for continuous random variables,
the probability that the random variable equals a specific value is 0.

Normal Distribution
Normal Probability Distributions
Normal probability distributions are continuous probability distributions that are bell
shaped and symmetric. They are also known as Gaussian distributions or bell-shaped
curves.
The normal distribution is perhaps the most widely used probability distribution,
largely because it provides a reasonable approximation to the distribution of many
random variables. It also plays a central role in many of the statistical methods that
will be discussed in later lessons. The bell shape and symmetry of the distribution are
displayed in Figure 11.5.

The normal distribution has two parameters: the mean μ and the standard deviation σ.
The notation X ~ N(μ, σ) means that "X is normally distributed with a mean of μ and
a standard deviation of σ". The distribution is symmetric about the mean. The mean,
median, and mode are all equal. The mean is often referred to as the location
parameter because it determines where the distribution is centered. The standard
deviation determines the spread of the distribution. The effect of the mean and
standard deviation on the normal distribution is displayed in Figure 11.6.

For any normal distribution, about 68% of the observations are within one standard
deviation of the mean. About 95% and 99.7% of the observations are, respectively,
within two and three standard deviations of the mean.
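The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module. The helper name within and the two (mean, standard deviation) pairs are hypothetical choices, used only to show that the result does not depend on the parameters.

```python
from statistics import NormalDist


def within(k: float, mu: float, sigma: float) -> float:
    """Area under the N(mu, sigma) curve within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Two hypothetical normal distributions give the same three areas.
for mu, sigma in ((0.0, 1.0), (100.0, 15.0)):
    print(mu, sigma, [round(within(k, mu, sigma), 4) for k in (1, 2, 3)])
# each line ends with [0.6827, 0.9545, 0.9973]
```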

It is important to remember that, although the location and spread may change, the
area under the curve and above the x-axis is always 1. Unfortunately, the probabilities
associated with intervals cannot be computed easily as with the uniform distribution.
To overcome this difficulty, we rely on a table of areas for a reference normal
distribution called the standard normal distribution. The standard normal distribution
is the normal distribution with μ = 0 and σ = 1. It is customary to use the letter z to
represent a standard normal random variable.
We will first learn to compute probabilities for a standard normal random variable and
then learn how to find them for any normal random variable. We will also want to be
able to determine extreme values of z, such as the value that only 5% of the population
exceeds or the value that 1% of the population is less than. To find either probabilities
or extreme values, we need a table of standard normal curve areas, or we need a
calculator or computer that can be used to find these values. Here, we will restrict
ourselves to the use of tables. The standard normal table used here, Table 11.1,
tabulates the probability of observing a value less than or equal to z (see Figure 11.7).
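To see where the entries of such a table come from, the sketch below rebuilds one row with Python's standard statistics.NormalDist. The helper table_entry and the choice of the row for z = 1.3 are assumptions made for illustration. A printed table is laid out the same way: the row gives z to one decimal place and the column gives the second decimal place.

```python
from statistics import NormalDist

std_normal = NormalDist(0.0, 1.0)  # mean 0, standard deviation 1


def table_entry(row: float, col: float) -> float:
    """Area to the left of z = row + col, rounded to four places like a printed table."""
    return round(std_normal.cdf(row + col), 4)


# One row of the table: z = 1.30, 1.31, ..., 1.39
row = 1.3
entries = [table_entry(row, c / 100) for c in range(10)]
print(row, entries)
```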

Graphs are extremely useful tools to help us understand what values we are searching
for. We will do this for each problem we work.

Examples of Continuous Probability Distributions


Below are nine examples of continuous probability distribution problems and
solutions.
Example 1
Find P(z < 1.32).
Solution 1
Using the standard normal table, we find the row with 1.3 in the z column and move
along that row to the 0.02 column to find 0.9066. Thus, P(z < 1.32) = 0.9066. Figure
11.8 shows the graphic image of this.

Example 2
Find P(z > 1.32).
Solution 2
From the table, we find P(z < 1.32) as we did in the previous example. Using some of
the ideas of probability we learned earlier, we have P(z > 1.32) = 1 − P(z ≤ 1.32) = 1 −
0.9066 = 0.0934. See Figure 11.9.

Example 3
Find P(z < −0.5).
Solution 3
There are no negative z-values in the table, so we cannot look this up directly. Instead,
we use the symmetry of the normal distribution to find the probability (see Figure
11.10). That is,
P(z < −0.5) = P(z > 0.5)
= 1 − P(z < 0.5)
= 1 − 0.6915
= 0.3085

Example 4
Find P(−1.45 < z < 0.76).
Solution 4
Figure 11.11 shows the solution.

First, we notice that P(−1.45 < z < 0.76) = P(z < 0.76) − P(z < −1.45). Now P(z <
0.76) can be found directly from the table to be 0.7764. Using the symmetry of the
normal distribution again, P(z < −1.45) = P(z > 1.45) = 1 − P(z ≤ 1.45) = 1 − 0.9265 =
0.0735. Finally, P(−1.45 < z < 0.76) = P(z < 0.76) − P(z < −1.45) = 0.7764 − 0.0735 =
0.7029.
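The four table lookups in Examples 1 to 4 can be cross-checked with Python's standard statistics.NormalDist. This is a sketch for verification only, and the variable names are assumptions. Because the table rounds each entry to four places before subtracting, a direct computation can differ from the table answer in the last digit.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

p_lower = z.cdf(1.32)                          # Example 1: P(z < 1.32)
p_upper = 1 - z.cdf(1.32)                      # Example 2: P(z > 1.32)
p_negative = 1 - z.cdf(0.5)                    # Example 3: P(z < -0.5) by symmetry
p_between = z.cdf(0.76) - (1 - z.cdf(1.45))    # Example 4: P(-1.45 < z < 0.76)

# Symmetry check: the lower tail at -0.5 equals the upper tail at 0.5.
symmetry_gap = abs(z.cdf(-0.5) - p_negative)

for name, p in [("Ex 1", p_lower), ("Ex 2", p_upper),
                ("Ex 3", p_negative), ("Ex 4", p_between)]:
    print(name, round(p, 4))
```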
Example 5
Find the value z* such that P(z < z*) = 0.75.
Solution 5

This is different from the other problems we have considered. Instead of finding a
probability, we are looking for a z-value. However, the same table will allow us to
solve the problem. The difference is that we will look in the table for a probability and
then find the z-value associated with the probability. Looking in the body of the table,
we find the values 0.7486 and 0.7517, which are the closest to the 0.75 of interest. By
looking at the corresponding row and column headings, we find that P(z < 0.67) =
0.7486 and P(z < 0.68) = 0.7517. Because 0.7486 is closer to 0.75 than 0.7517, we
take z* = 0.67. (Note: We could interpolate to find a more precise value of z*, but we
will not go through this process here.) See Figure 11.12.
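For readers curious about the interpolation mentioned in the note, one common approach is linear interpolation between the two neighboring table entries. The sketch below assumes that approach, and the function name interpolate_z is hypothetical. The lesson itself simply rounds to the closer entry.

```python
def interpolate_z(p: float, z_lo: float, p_lo: float, z_hi: float, p_hi: float) -> float:
    """Linearly interpolate a z-value between two adjacent table entries."""
    return z_lo + (z_hi - z_lo) * (p - p_lo) / (p_hi - p_lo)


# Table entries from the solution: P(z < 0.67) = 0.7486 and P(z < 0.68) = 0.7517.
z_star = interpolate_z(0.75, 0.67, 0.7486, 0.68, 0.7517)
print(round(z_star, 4))  # 0.6745
```

The interpolated value sits a little under halfway between 0.67 and 0.68, which is consistent with 0.7486 being the closer entry.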

Example 6
Find the value z* such that P(z > z*) = 0.05.
Solution 6
We need to have the probabilities in the form P(z < z*) to use the table. However, P(z
> z*) = 1 − P(z ≤ z*). We can rewrite this as P(z ≤ z*) = 1 − P(z > z*) = 1 − 0.05 =
0.95. That is, if 5% of the population values are greater than z*, then 95% of the
population values must be less than or equal to z*. Thus, we look for 0.95 in the body
of the table and find 0.9495 and 0.9505 corresponding to z = 1.64 and z = 1.65,
respectively. Because 0.95 is exactly halfway between 0.9495 and 0.9505, we have z*
= 1.645. (This is the only time we don't just round to the closest value.) See Figure
11.13.


Example 7
Find the value z* such that P(z < z*) = 0.01.
Solution 7
Because the standard normal is symmetric about its mean 0, we know P(z < 0) = 0.5, so
z* must be less than 0. Also, because we have only positive values of z
in the table, we cannot look for 0.01 directly in the table. However, again because of
symmetry, we know that, if P(z < z*) = 0.01, then P(z > −z*) = 0.01. To use the table,
we must find P(z ≤ −z*) = 1 − P(z > −z*) = 1 − 0.01 = 0.99. Looking in the body of the
table, we find 0.9898 and 0.9901, corresponding to z = 2.32 and z = 2.33,
respectively, to be the closest to 0.99. Because 0.9901 is the closer of the two to 0.99,
we find −z* = 2.33, that is, z* = −2.33. See Figure 11.14.
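The reverse lookups in Examples 5 to 7 correspond to evaluating the inverse of the standard normal cumulative distribution. As a hedged cross-check, the sketch below uses the inv_cdf method of Python's standard statistics.NormalDist. The variable names are assumptions.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

z5 = z.inv_cdf(0.75)        # Example 5: P(z < z*) = 0.75
z6 = z.inv_cdf(1 - 0.05)    # Example 6: P(z > z*) = 0.05, so P(z <= z*) = 0.95
z7 = z.inv_cdf(0.01)        # Example 7: P(z < z*) = 0.01, a negative value

# Symmetry used in Example 7: the 1% point is the negative of the 99% point.
z7_by_symmetry = -z.inv_cdf(0.99)

print(round(z5, 3), round(z6, 3), round(z7, 3))  # 0.674 1.645 -2.326
```

The table answers 0.67, 1.645, and −2.33 agree with these values to the precision the table allows.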

Few normal random variables actually have a standard normal distribution. However,
any normal random variable can be transformed to a standard normal, and any
standard normal random variable can be transformed to a normal random variable
with any mean μ and standard deviation σ. Specifically, if X ~ N(μ, σ), then
z = (X − μ)/σ ~ N(0, 1). Further, if z ~ N(0,1), then X = μ + σz ~ N(μ, σ). Using these
relationships, we can find probabilities and extreme values for any normal random
variable using the z-table. When doing this, it is important to do all calculations
carefully.
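The two transformations can be written as a pair of small functions. In the sketch below, the names standardize and unstandardize and the values μ = 50, σ = 4, x = 58 are hypothetical. The final comparison uses Python's standard statistics.NormalDist to illustrate that the probability is unchanged by the transformation.

```python
from statistics import NormalDist


def standardize(x: float, mu: float, sigma: float) -> float:
    """z = (x - mu) / sigma: convert a value of X ~ N(mu, sigma) to a z-value."""
    return (x - mu) / sigma


def unstandardize(z: float, mu: float, sigma: float) -> float:
    """x = mu + sigma * z: convert a z-value back to the scale of X."""
    return mu + sigma * z


mu, sigma, x = 50.0, 4.0, 58.0            # hypothetical values for illustration
z_value = standardize(x, mu, sigma)        # 2.0
x_back = unstandardize(z_value, mu, sigma) # 58.0

# P(X < x) on the original scale equals P(z < z_value) on the standard scale.
p_direct = NormalDist(mu, sigma).cdf(x)
p_via_z = NormalDist().cdf(z_value)
print(z_value, x_back, round(p_direct, 4), round(p_via_z, 4))
```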
Example 8
Let X ~ N(10,5). Find P(X < 15).
Solution 8

P(X < 15) = P((X − μ)/σ < (15 − 10)/5)
= P(z < 1)
= 0.8413.
Notice that inside the parentheses, we had to transform both the X and the 15 to avoid
changing the inequality. When working with X, we used symbols, and we used
numbers when working with 15. However, we used the numbers that were associated
with each symbol. Once we have the expression in terms of z, then the problem is
equivalent to the earlier ones we worked. See Figure 11.15.

Example 9
Let X ~ N(5,2). Find X* such that P(X > X*) = 0.05.
Solution 9


First, we find z* such that P(z > z*) = 0.05. From our earlier work, we know that z* =
1.645. Then X* = μ + σz* = 5 + 2(1.645) = 8.29.
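Examples 8 and 9 can also be checked without standardizing by hand. The sketch below assumes, as the worked solutions do, that the second argument of N(·,·) is the standard deviation, and it uses Python's standard statistics.NormalDist. The variable names are assumptions.

```python
from statistics import NormalDist

# Example 8: X ~ N(10, 5), find P(X < 15).
p8 = NormalDist(10, 5).cdf(15)

# Example 9: X ~ N(5, 2), find X* with P(X > X*) = 0.05, i.e. P(X <= X*) = 0.95.
x9 = NormalDist(5, 2).inv_cdf(0.95)

# The same answer through the z-value, mirroring the hand calculation.
x9_via_z = 5 + 2 * NormalDist().inv_cdf(0.95)

print(round(p8, 4), round(x9, 2))  # 0.8413 8.29
```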
Continuous Probability Distributions In Short
We have discussed two continuous distributions: the uniform and the normal. When
every value in an interval is equally likely to occur, then we have a uniform
distribution. The normal distribution is the most commonly used continuous
distribution. Probabilities associated with a normal random variable must be found by
using tables, calculators, or computers. When using tables, it is possible to use only
one table. By tradition, the probabilities of a standard normal distribution are
tabulated. Probabilities for other normal random variables are found by transforming
the problem to one on the standard normal.
Notes:

Continuous Probability Distributions

A continuous random variable can assume any value in an interval (e.g., city
temperature).

Probability Density Function: a function whose area over an interval gives the
probability that the random variable falls in that interval.

Height formula (uniform distribution): height = 1/(b − a), where b is the upper end
of the interval and a is the lower end.

The Normal Distribution

used to find probabilities for many continuous random variables

symmetric about the mean
total area under the curve is 1
standard deviation is the distance from the mean to the point of
inflection
Any normal distribution can be described by its mean and standard
deviation, so we write N(mean, standard deviation) to describe a distribution
The standard normal table shows the area under the curve to the left of
a given z value

Z-scores are calculated by standardizing a normal random variable:

Z = (x − mean) / standard deviation

Subtracting the mean centers the distribution at 0, and dividing by the
standard deviation rescales it to a standard deviation of 1.
