
Introduction to Statistics

Mathematics

Chapter
Continuous Distributions

Chapter Topics

Gamma Distribution

Weibull Distribution

Exponential Distribution

The normal distribution

The standardized normal distribution

Evaluating the normality assumption

Continuous Probability Distributions

Continuous random variable
  Values from interval of numbers
  Absence of gaps

Continuous probability distribution
  Distribution of continuous random variable

Most important continuous probability distribution
  The normal distribution

Example

Let X be a random variable with range [0, 2] and pdf defined by f(x) = 1/2 for all x between 0 and 2 and f(x) = 0 for all other values of x. Note that since the integral of zero is zero we get

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{2} \tfrac{1}{2}\,dx = \tfrac{1}{2}x\,\Big|_{0}^{2} = 1 - 0 = 1$$

That is, as with all continuous pdfs, the total area under the curve is 1. We might use this random variable to model the position at which a two-meter length of rope breaks when put under tension, assuming every point is equally likely. Then the probability the break occurs in the last half-meter of the rope is

$$P(3/2 \le X \le 2) = \int_{3/2}^{2} f(x)\,dx = \int_{3/2}^{2} \tfrac{1}{2}\,dx = \tfrac{1}{2}x\,\Big|_{3/2}^{2} = \tfrac{1}{4}$$
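The two integrals in the rope example can be checked numerically. The sketch below is only an illustration: the helper name `integrate` and the choice of a midpoint rule are assumptions made here, not part of the slides.

```python
def pdf(x: float) -> float:
    """Density of the break position X: 1/2 on [0, 2], 0 elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def integrate(f, a: float, b: float, n: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Total area under the pdf, and the chance of a break in the last half-meter.
total_area = integrate(pdf, 0.0, 2.0)
p_last_half_meter = integrate(pdf, 1.5, 2.0)
print(total_area, p_last_half_meter)
```

Up to floating-point rounding, the two printed values should agree with the hand calculation: 1 for the total area and 1/4 for the last half-meter.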

Example

Let Y be a random variable whose range is the nonnegative reals and whose pdf is defined by

$$f(x) = \frac{1}{750}\,e^{-x/750}$$

for nonnegative values of x (and 0 for negative values of x). Then

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{t\to\infty}\int_{0}^{t} \frac{1}{750}\,e^{-x/750}\,dx = \lim_{t\to\infty}\left[-e^{-x/750}\right]_{0}^{t} = \lim_{t\to\infty}\left(e^{0} - e^{-t/750}\right) = 1 - 0 = 1$$

Cumulative Distribution Functions

In the second example above, F(x) = 0 if x is negative and for nonnegative x we have

$$F(x) = \int_{0}^{x} \frac{1}{750}\,e^{-t/750}\,dt = -e^{-t/750}\,\Big|_{0}^{x} = -e^{-x/750} + 1 = 1 - e^{-x/750}$$

Thus the probability of a light bulb lasting between 500 and 1000 hours is

$$F(1000) - F(500) = (1 - e^{-1000/750}) - (1 - e^{-500/750}) = e^{-2/3} - e^{-4/3} \approx 0.250$$

The random variable Y might be a reasonable choice to model the lifetime in hours of a standard light bulb with average life 750 hours. To find the probability a bulb lasts under 500 hours, you calculate

$$P(0 \le Y \le 500) = \int_{0}^{500} \frac{1}{750}\,e^{-x/750}\,dx = -e^{-x/750}\,\Big|_{0}^{500} = -e^{-2/3} + 1 \approx 0.487$$
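The light-bulb probabilities follow directly from the cdf F(x) = 1 - e^{-x/750}. A minimal sketch using only Python's `math` module (the names `cdf` and `MEAN_LIFE_HOURS` are chosen here for illustration):

```python
import math

MEAN_LIFE_HOURS = 750.0  # average bulb life used in the example


def cdf(x: float) -> float:
    """F(x) = 1 - exp(-x/750) for x >= 0, and 0 for negative x."""
    return 1.0 - math.exp(-x / MEAN_LIFE_HOURS) if x >= 0 else 0.0


p_under_500 = cdf(500.0)                   # P(0 <= Y <= 500)
p_500_to_1000 = cdf(1000.0) - cdf(500.0)   # P(500 <= Y <= 1000)
print(round(p_under_500, 3), round(p_500_to_1000, 3))
```

Both results should match the values 0.487 and 0.250 quoted on the slides to three decimal places.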

Gamma Random Variables

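A continuous r.v. whose density is given by

$$f(x \mid k, \theta) = \begin{cases} \dfrac{x^{k-1}\,e^{-x/\theta}}{\Gamma(k)\,\theta^{k}}, & x \ge 0 \\[1ex] 0, & x < 0 \end{cases} \qquad k, \theta > 0$$

is a gamma random variable with shape parameter k and scale parameter θ.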

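Gamma function

$$\Gamma(k) = \int_{0}^{\infty} e^{-x}\,x^{k-1}\,dx$$

It is easy to show by induction that for any positive integer n

$$\Gamma(n) = (n-1)!$$

And

$$\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$$

Integrating by parts,

$$\Gamma(k) = \int_{0}^{\infty} t^{k-1}e^{-t}\,dt = -\int_{0}^{\infty} t^{k-1}\,d\,e^{-t} = -t^{k-1}e^{-t}\,\Big|_{0}^{\infty} + (k-1)\int_{0}^{\infty} t^{(k-1)-1}e^{-t}\,dt$$

And, since the boundary term vanishes for k > 1,

$$\Gamma(k) = (k-1)\int_{0}^{\infty} t^{(k-1)-1}e^{-t}\,dt = (k-1)\,\Gamma(k-1)$$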

Mean and Variance

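Mean

$$E(X) = \frac{1}{\Gamma(k)\,\theta^{k}}\int_{0}^{\infty} x^{k+1-1}e^{-x/\theta}\,dx = \frac{\Gamma(k+1)\,\theta^{k+1}}{\Gamma(k)\,\theta^{k}}\int_{0}^{\infty} \frac{x^{k+1-1}e^{-x/\theta}}{\Gamma(k+1)\,\theta^{k+1}}\,dx = k\theta$$

since the last integrand is a gamma density with shape k + 1 and therefore integrates to 1.

Similarly we have that

$$E(X^{2}) = \frac{1}{\Gamma(k)\,\theta^{k}}\int_{0}^{\infty} x^{k+2-1}e^{-x/\theta}\,dx = \frac{\Gamma(k+2)\,\theta^{k+2}}{\Gamma(k)\,\theta^{k}}\int_{0}^{\infty} \frac{x^{k+2-1}e^{-x/\theta}}{\Gamma(k+2)\,\theta^{k+2}}\,dx = (k+1)\,k\,\theta^{2}$$

Variance

$$\mathrm{Var}(X) = E(X^{2}) - [E(X)]^{2} = (k+1)k\theta^{2} - k^{2}\theta^{2} = k\theta^{2}$$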
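The results E(X) = kθ and E(X²) = (k+1)kθ² (hence Var(X) = kθ²) can be checked by simulation. The sketch below assumes illustrative values k = 3 and θ = 2, which do not come from the slides, and relies on the standard-library `random.gammavariate(alpha, beta)`, whose arguments are the shape and the scale.

```python
import random

k, theta = 3.0, 2.0      # illustrative shape and scale (assumed values)
rng = random.Random(0)   # fixed seed so the run is repeatable
n = 100_000

# random.gammavariate(alpha, beta): alpha is the shape, beta is the scale.
sample = [rng.gammavariate(k, theta) for _ in range(n)]

sample_mean = sum(sample) / n
sample_var = sum((x - sample_mean) ** 2 for x in sample) / n

print(sample_mean, k * theta)        # simulated mean vs. k*theta
print(sample_var, k * theta ** 2)    # simulated variance vs. k*theta^2
```

With a sample this large, the simulated mean and variance should land close to 6 and 12 respectively, though the exact digits depend on the seed.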

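Moment Generating Function

Mgf

$$M_X(t) = E(e^{tX}) = \frac{1}{\Gamma(k)\,\theta^{k}}\int_{0}^{\infty} x^{k-1}e^{-(1-\theta t)x/\theta}\,dx$$

$$= \frac{1}{\Gamma(k)\,(1-\theta t)^{k}}\int_{0}^{\infty} \left(\frac{(1-\theta t)\,x}{\theta}\right)^{k-1} e^{-(1-\theta t)x/\theta}\,d\!\left(\frac{(1-\theta t)\,x}{\theta}\right) = (1-\theta t)^{-k}, \qquad t < 1/\theta$$

The rth derivative

$$M_X^{(r)}(t) = k(k+1)\cdots(k+r-1)\,\theta^{r}\,(1-\theta t)^{-k-r} = \frac{\Gamma(k+r)}{\Gamma(k)}\,\theta^{r}\,(1-\theta t)^{-(k+r)}$$

$$E(X^{r}) = M_X^{(r)}(0) = \frac{\Gamma(k+r)}{\Gamma(k)}\,\theta^{r}$$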

Exponential Distributions
$$P(\text{arrival time} \le X) = 1 - e^{-\lambda X}$$

X: any value of continuous random variable
λ: the population average number of arrivals per unit of time
1/λ: average time between arrivals
e ≈ 2.71828
e.g.: Drivers Arriving at a Toll Bridge; Customers Arriving at an

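Exponential Distributions (continued)

Describes time or distance between events
  Used for queues

Density function

$$f(x) = \frac{1}{\theta}\,e^{-x/\theta}, \qquad x \ge 0$$

where θ = 1/λ is the average time between events.

Parameters
  Mean θ, standard deviation θ

[Figure: exponential density f(X) plotted for parameter values 0.5 and 2.0]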

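The CDF of X is

$$F(x) = \int_{0}^{x} \frac{1}{\theta}\,e^{-t/\theta}\,dt = \int_{0}^{x} e^{-t/\theta}\,d(t/\theta) = -e^{-t/\theta}\,\Big|_{0}^{x} = 1 - e^{-x/\theta}$$

where θ = 1/λ is the average time between arrivals.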

No Memory Property

$$P(X > a + t \mid X > a) = \frac{P(X > a + t \text{ and } X > a)}{P(X > a)} = \frac{P(X > a + t)}{P(X > a)} = \frac{e^{-\lambda(a+t)}}{e^{-\lambda a}} = e^{-\lambda t} = P(X > t)$$
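The no-memory property can be seen numerically by comparing the conditional survival probability with the unconditional one. This is a small sketch with assumed values for the rate λ and the times a and t; the function name `survival` is introduced here for illustration.

```python
import math


def survival(t: float, lam: float) -> float:
    """P(X > t) = exp(-lam * t) for an exponential variable with rate lam."""
    return math.exp(-lam * t)


lam, a, t = 0.5, 3.0, 2.0  # illustrative values (assumed)

# P(X > a + t | X > a) = P(X > a + t) / P(X > a)
conditional = survival(a + t, lam) / survival(a, lam)
unconditional = survival(t, lam)
print(conditional, unconditional)
```

The two printed numbers should agree up to rounding, whatever positive values are chosen for `lam`, `a` and `t`.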

Example
e.g.: Customers arrive at the checkout line of a supermarket at the rate of 30 per hour. What is the probability that the time between consecutive customer arrivals is greater than five minutes?

λ = 30, X = 5/60 hours

$$P(\text{arrival time} > X) = 1 - P(\text{arrival time} \le X) = 1 - \left(1 - e^{-30(5/60)}\right) = e^{-2.5} = .0821$$
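The supermarket calculation is a one-line evaluation of e^{-λX}. A minimal sketch (variable names are chosen here for readability):

```python
import math

rate_per_hour = 30.0   # lambda: average arrivals per hour
gap_hours = 5 / 60     # X: five minutes expressed in hours

# P(arrival time > X) = 1 - (1 - exp(-lambda * X)) = exp(-lambda * X)
p_gap_exceeds_5_min = math.exp(-rate_per_hour * gap_hours)
print(round(p_gap_exceeds_5_min, 4))
```

Keeping the rate and the time in the same unit (hours here) is the step that is easiest to get wrong; converting λ to 0.5 per minute and using X = 5 gives the same exponent.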

Exponential Distribution in PHStat

PHStat | Probability & Prob. Distributions | Exponential
Example in Excel spreadsheet

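Exponential Random Variables

X: exponential RV with parameter λ
Y: exponential RV with parameter μ
X, Y: independent

Then:
1. min{X, Y}: exponential RV with parameter λ + μ
2. P{X < Y} = λ/(λ + μ)

Proof:

$$P\{\min\{X, Y\} > t\} = P\{X > t,\, Y > t\} = P\{X > t\}\,P\{Y > t\} = e^{-\lambda t}e^{-\mu t} = e^{-(\lambda+\mu)t}$$

$$\Rightarrow\; P\{\min\{X, Y\} \le t\} = 1 - e^{-(\lambda+\mu)t}$$

$$P\{X < Y\} = \int_{0}^{\infty}\!\!\int_{0}^{y} f_{XY}(x, y)\,dx\,dy = \int_{0}^{\infty}\!\!\int_{0}^{y} \lambda e^{-\lambda x}\,\mu e^{-\mu y}\,dx\,dy = \int_{0}^{\infty} \mu e^{-\mu y}\int_{0}^{y} \lambda e^{-\lambda x}\,dx\,dy$$

$$= \int_{0}^{\infty} \mu e^{-\mu y}\left(1 - e^{-\lambda y}\right)dy = \int_{0}^{\infty} \mu e^{-\mu y}\,dy - \frac{\mu}{\lambda+\mu}\int_{0}^{\infty} (\lambda+\mu)\,e^{-(\lambda+\mu)y}\,dy = 1 - \frac{\mu}{\lambda+\mu} = \frac{\lambda}{\lambda+\mu}$$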
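Both claims about two independent exponential variables can be checked by simulation. The sketch below assumes illustrative rates 2 and 3 (called `lam` and `mu` here) and uses the standard-library `random.expovariate`, which takes the rate as its argument.

```python
import random

lam, mu = 2.0, 3.0      # illustrative rates for X and Y (assumed values)
rng = random.Random(1)  # fixed seed so the run is repeatable
n = 200_000

xs = [rng.expovariate(lam) for _ in range(n)]
ys = [rng.expovariate(mu) for _ in range(n)]

# Claim 1: min{X, Y} is exponential with rate lam + mu, so its mean is 1/(lam + mu).
mean_of_min = sum(min(x, y) for x, y in zip(xs, ys)) / n

# Claim 2: P{X < Y} = lam / (lam + mu).
frac_x_first = sum(x < y for x, y in zip(xs, ys)) / n

print(mean_of_min, 1 / (lam + mu))
print(frac_x_first, lam / (lam + mu))
```

Comparing only the mean of the minimum is a partial check of claim 1; a fuller check would compare the whole empirical distribution with 1 - e^{-(λ+μ)t}.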

Weibull Distribution

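A continuous r.v. X is said to have the Weibull distribution with parameters α, β > 0 if it has a pdf of the form

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}}\,x^{\alpha-1}e^{-(x/\beta)^{\alpha}}, & x \ge 0 \\[1ex] 0, & x < 0 \end{cases}$$

with cdf

$$F(x; \alpha, \beta) = \begin{cases} 1 - e^{-(x/\beta)^{\alpha}}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Here α is the shape parameter and β is the scale parameter.

It follows that the 100p-th percentile has the form

$$F(x_p) = p \;\Rightarrow\; x_p = \beta\left[-\ln(1 - p)\right]^{1/\alpha}$$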
The Mean and Variance

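The Mean

With α the shape parameter and β the scale parameter,

$$E(X) = \int_{0}^{\infty} x\,\frac{\alpha}{\beta^{\alpha}}\,x^{\alpha-1}e^{-(x/\beta)^{\alpha}}\,dx = \int_{0}^{\infty} \frac{\alpha}{\beta^{\alpha}}\,x^{(\alpha+1)-1}e^{-(x/\beta)^{\alpha}}\,dx$$

Substituting t = (x/β)^α, so that x = β t^{1/α} and dt = (α/β^α) x^{α-1} dx,

$$E(X) = \beta\int_{0}^{\infty} t^{(1+1/\alpha)-1}e^{-t}\,dt = \beta\,\Gamma(1 + 1/\alpha)$$

The Variance

$$E(X^{2}) = \int_{0}^{\infty} x^{2}\,\frac{\alpha}{\beta^{\alpha}}\,x^{\alpha-1}e^{-(x/\beta)^{\alpha}}\,dx = \int_{0}^{\infty} \frac{\alpha}{\beta^{\alpha}}\,x^{(\alpha+2)-1}e^{-(x/\beta)^{\alpha}}\,dx$$

With the same substitution t = (x/β)^α, x = β t^{1/α},

$$E(X^{2}) = \beta^{2}\int_{0}^{\infty} t^{(1+2/\alpha)-1}e^{-t}\,dt = \beta^{2}\,\Gamma(1 + 2/\alpha)$$

$$\mathrm{Var}(X) = E(X^{2}) - [E(X)]^{2} = \beta^{2}\left[\Gamma(1 + 2/\alpha) - \Gamma(1 + 1/\alpha)^{2}\right]$$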
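The gamma-function expressions for the Weibull mean and variance can be compared with a simulated sample. The sketch assumes the shape-scale form with illustrative α = 2 and β = 10, and draws the sample by inverse-transform sampling, x = β(-ln(1 - u))^{1/α}, so that the parameter roles are explicit in the code.

```python
import math
import random

alpha, beta = 2.0, 10.0  # illustrative shape and scale (assumed values)

# Closed forms: E(X) = beta*Gamma(1 + 1/alpha), E(X^2) = beta^2*Gamma(1 + 2/alpha)
mean_formula = beta * math.gamma(1 + 1 / alpha)
var_formula = beta ** 2 * (math.gamma(1 + 2 / alpha) - math.gamma(1 + 1 / alpha) ** 2)

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 200_000

# Inverse-transform sampling: if U is uniform on [0, 1), then
# beta * (-ln(1 - U))**(1/alpha) has the Weibull cdf 1 - exp(-(x/beta)**alpha).
sample = [beta * (-math.log(1.0 - rng.random())) ** (1 / alpha) for _ in range(n)]

sample_mean = sum(sample) / n
sample_var = sum((x - sample_mean) ** 2 for x in sample) / n

print(sample_mean, mean_formula)
print(sample_var, var_formula)
```

Sampling through the inverse cdf, rather than a library Weibull generator, avoids any ambiguity about which argument is the shape and which is the scale.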

The Normal Distribution

Bell shaped
Symmetrical
Mean, median and mode are equal
Interquartile range equals 1.33 σ
Random variable has infinite range

[Figure: bell-shaped density f(X), with the mean, median and mode at the center]

The Mathematical Model


$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(X-\mu)^{2}/2\sigma^{2}}$$

f(X): density of random variable X
π ≈ 3.14159; e ≈ 2.71828
μ: population mean
σ: population standard deviation
X: value of random variable (−∞ < X < ∞)

Expectation
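$$E(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} x\,e^{-(x-\mu)^{2}/2\sigma^{2}}\,dx$$

$$= \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} (x-\mu)\,e^{-(x-\mu)^{2}/2\sigma^{2}}\,d(x-\mu) + \mu\,\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-(x-\mu)^{2}/2\sigma^{2}}\,dx = 0 + \mu = \mu$$

The first integral is zero because its integrand is odd about μ, and the second is μ times the total area under the density.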

Variance

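$$E\left[(X-\mu)^{2}\right] = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} (x-\mu)^{2}\,e^{-(x-\mu)^{2}/2\sigma^{2}}\,dx = \frac{\sigma^{2}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} y^{2}e^{-y^{2}/2}\,dy = \sigma^{2}$$

where y = (x − μ)/σ and the last step uses

$$\int_{-\infty}^{\infty} y^{2}e^{-y^{2}/2}\,dy = \sqrt{2\pi}$$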
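The expectation and variance integrals for the normal density can be approximated numerically. The sketch below takes μ = 5 and σ = 10 (the values used in the later standardizing examples) and a midpoint rule over μ ± 10σ; the helper names and the truncation range are assumptions made here.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def integrate(f, a: float, b: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


mu, sigma = 5.0, 10.0
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # the tails beyond 10 sigma are negligible

mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
variance = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), lo, hi)
print(mean, variance)
```

The two results should be very close to μ = 5 and σ² = 100.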

Many Normal Distributions


There are an infinite number of normal distributions

By varying the parameters μ and σ, we obtain different normal distributions

Finding Probabilities
Probability is the area under the curve!

[Figure: normal density f(X) with the area between c and d shaded, P(c ≤ X ≤ d) = ?]

Which Table to Use?

An infinite number of normal distributions means an infinite number of tables to look up!

Standardized Normal Distribution

Only one table is needed

Cumulative Standardized Normal Distribution Table (Portion), μ_Z = 0, σ_Z = 1

 Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

Probabilities: the entry for Z = 0.12 is .5478

[Figure: standardized normal curve with the area to the left of Z = 0.12 shaded; shaded area exaggerated]

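Standardizing Example

$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

[Figure: normal distribution (μ = 5, σ = 10) with X = 6.2 marked, beside the standardized normal distribution (μ_Z = 0, σ_Z = 1) with Z = 0.12 marked; shaded area exaggerated]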

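Example: P(2.9 ≤ X ≤ 7.1) = .1664

$$Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$$

[Figure: normal distribution (σ = 10) with the area between 2.9 and 7.1 shaded, beside the standardized normal distribution (μ_Z = 0, σ_Z = 1) with the area between −0.21 and 0.21 shaded; each half of the shaded area is .0832; shaded area exaggerated]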

Example: P(2.9 ≤ X ≤ 7.1) = .1664 (continued)

Cumulative Standardized Normal Distribution Table (Portion), μ_Z = 0, σ_Z = 1

 Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

The entry for Z = 0.21 is .5832

[Figure: standardized normal curve with the area to the left of Z = 0.21 shaded; shaded area exaggerated]

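Example: P(2.9 ≤ X ≤ 7.1) = .1664 (continued)

Cumulative Standardized Normal Distribution Table (Portion), μ_Z = 0, σ_Z = 1

  Z     .00     .01     .02
-0.3   .3821   .3783   .3745
-0.2   .4207   .4168   .4129
-0.1   .4602   .4562   .4522
 0.0   .5000   .4960   .4920

The entry for Z = -0.21 is .4168, so P(2.9 ≤ X ≤ 7.1) = .5832 − .4168 = .1664

[Figure: standardized normal curve with the area to the left of Z = -0.21 shaded; shaded area exaggerated]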
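The table lookup for P(2.9 ≤ X ≤ 7.1) can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch rather than a replacement for the table method; the variable names are chosen here.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=10)  # the normal distribution in the example
Z = NormalDist()                # standardized normal: mean 0, standard deviation 1

# Standardize both endpoints, then difference the cumulative areas.
z_low = (2.9 - X.mean) / X.stdev    # about -0.21
z_high = (7.1 - X.mean) / X.stdev   # about  0.21
p_from_z = Z.cdf(z_high) - Z.cdf(z_low)

# The same probability computed directly on the original scale.
p_direct = X.cdf(7.1) - X.cdf(2.9)
print(p_from_z, p_direct)
```

The computed probability is about 0.1663, marginally below the slide's .1664 because the table entries are rounded to four decimal places before subtracting.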

Normal Distribution in PHStat

PHStat | Probability & Prob. Distributions | Normal
Example in Excel spreadsheet

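Example: P(X ≥ 8) = .3821

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$$

[Figure: normal distribution (σ = 10) with the area to the right of X = 8 shaded, beside the standardized normal distribution (μ_Z = 0, σ_Z = 1) with the area .3821 to the right of Z = 0.30 shaded]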

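Example: P(X ≥ 8) = .3821 (continued)

Cumulative Standardized Normal Distribution Table (Portion), μ_Z = 0, σ_Z = 1

 Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

The entry for Z = 0.30 is .6179, so P(X ≥ 8) = 1 − .6179 = .3821

[Figure: standardized normal curve with the area to the left of Z = 0.30 shaded; shaded area exaggerated]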
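An upper-tail probability such as P(X ≥ 8) is one minus the cumulative area. A minimal sketch with the standard-library `statistics.NormalDist`, using μ = 5 and σ = 10 from the example:

```python
from statistics import NormalDist

mu, sigma = 5, 10
z = (8 - mu) / sigma                 # standardized value, 0.30

# The cumulative table gives the area to the left of z; the upper tail is the complement.
p_upper = 1 - NormalDist().cdf(z)
print(round(p_upper, 4))
```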

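Finding Z Values for Known Probabilities

What is Z given that the area between 0 and Z is 0.1217? The cumulative area is .5000 + .1217 = .6217.

Cumulative Standardized Normal Distribution Table (Portion), μ_Z = 0, σ_Z = 1

 Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

The entry .6217 is in row 0.3, column .01, so Z = .31

[Figure: standardized normal curve with the area .1217 between 0 and Z shaded; shaded area exaggerated]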
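Reading the table backwards corresponds to evaluating the inverse of the cumulative distribution. A sketch using `statistics.NormalDist.inv_cdf` from the standard library, assuming the .1217 is the area between 0 and Z so that the cumulative area is .6217:

```python
from statistics import NormalDist

area_from_zero_to_z = 0.1217
cumulative_area = 0.5 + area_from_zero_to_z   # area to the left of Z

# inv_cdf returns the z value whose cumulative area equals the argument.
z = NormalDist().inv_cdf(cumulative_area)
print(round(z, 2))
```

The table can only resolve Z to two decimals, whereas `inv_cdf` returns the value to full floating-point precision.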

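Recovering X Values for Known Probabilities

[Figure: normal distribution (σ = 10) beside the standardized normal distribution (μ_Z = 0, σ_Z = 1) with Z = 0.30 marked; the area between 0 and Z is .1179 and the area to the right of Z is .3821]

$$X = \mu + Z\sigma = 5 + (.30)(10) = 8$$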

Assessing Normality

Not all continuous random variables are normally distributed
It is important to evaluate how well the data set is approximated by a normal distribution

Assessing Normality (continued)

Construct charts
  For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
  For large data sets, does the histogram or polygon appear bell-shaped?

Compute descriptive summary measures
  Do the mean, median and mode have similar values?
  Is the interquartile range approximately 1.33 σ?
  Is the range approximately 6 σ?

Assessing Normality (continued)

Observe the distribution of the data set
  Do approximately 2/3 of the observations lie between mean ± 1 standard deviation?
  Do approximately 4/5 of the observations lie between mean ± 1.28 standard deviations?
  Do approximately 19/20 of the observations lie between mean ± 2 standard deviations?

Evaluate normal probability plot
  Do the points lie on or close to a straight line with positive slope?
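The descriptive checks (interquartile range and range relative to the standard deviation, and the shares of observations within 1, 1.28 and 2 standard deviations of the mean) are easy to compute for a data set. The function below is a hypothetical helper written for illustration, applied to simulated normal data rather than to any data from the slides.

```python
import random
import statistics


def normality_checks(data):
    """Return the summary ratios used in the rule-of-thumb normality checks."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points

    def share_within(k):
        # Fraction of observations within k standard deviations of the mean.
        return sum(abs(x - mean) <= k * sd for x in data) / len(data)

    return {
        "iqr_over_sd": (q3 - q1) / sd,                   # compare with 1.33
        "range_over_sd": (max(data) - min(data)) / sd,   # compare with 6
        "within_1_sd": share_within(1.0),                # compare with 2/3
        "within_1.28_sd": share_within(1.28),            # compare with 4/5
        "within_2_sd": share_within(2.0),                # compare with 19/20
    }


rng = random.Random(3)  # fixed seed so the run is repeatable
sample = [rng.gauss(60, 10) for _ in range(2_000)]  # simulated normal data
checks = normality_checks(sample)
for name, value in checks.items():
    print(name, round(value, 3))
```

These are rough guides rather than formal tests: a data set can pass all of them and still depart from normality in ways a normal probability plot would reveal.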

Assessing Normality (continued)

Normal probability plot
  Arrange data into ordered array
  Find corresponding standardized normal quantile values
  Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
  Evaluate the plot for evidence of linearity
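The steps for a normal probability plot can be sketched without a plotting library by computing the plotted pairs and summarizing their linearity with a correlation coefficient. The plotting position (i - 0.5)/n is one common convention and is an assumption here, as are the function names and the simulated data.

```python
import random
from statistics import NormalDist, fmean


def normal_plot_points(data):
    """Pair each ordered observation with its standardized normal quantile."""
    ordered = sorted(data)                     # step 1: ordered array
    n = len(ordered)
    # Step 2: quantiles at plotting positions (i - 0.5)/n (assumed convention).
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Step 3: (horizontal, vertical) = (normal quantile, observed value).
    return list(zip(quantiles, ordered))


def correlation(pairs):
    """Pearson correlation of the plotted pairs, as a rough linearity measure."""
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(4)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(60, 10) for _ in range(200)]
skewed_sample = [rng.expovariate(1 / 10) for _ in range(200)]  # right-skewed data

r_normal = correlation(normal_plot_points(normal_sample))
r_skewed = correlation(normal_plot_points(skewed_sample))
print(r_normal, r_skewed)
```

For the normal sample the points fall close to a straight line, so the correlation is near 1; the right-skewed sample bends away from the line and gives a visibly lower value. The correlation is only a numeric stand-in for step 4; inspecting the plot itself remains the method the slides describe.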

Assessing Normality (continued)

[Figure: normal probability plot for a normal distribution, with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis. Look for a straight line.]

Normal Probability Plot

[Figure: four normal probability plots, for left-skewed, right-skewed, rectangular and U-shaped distributions, each with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis]
