
PROBABILITY DISTRIBUTION:

DEFINITION:
A table or an equation that relates each outcome of a statistical experiment with its probability of
occurrence is called probability distribution.

The probability that the value of a random variable falls within a specified range is called a cumulative probability.

PROBABILITY FUNCTION:

A function that assigns probabilities to the values of a random variable is called a probability function. It must satisfy two conditions:

• All the probabilities must be between 0 and 1 inclusive.
• The sum of the probabilities of the outcomes must be 1.

If these two conditions aren't met, then the function isn't a probability function. There is no requirement that the values of the random variable be between 0 and 1; only the probabilities must be between 0 and 1.

EXAMPLE:

Here's an example probability distribution that results from the rolling of a single fair die.

X      1    2    3    4    5    6    SUM
P(X)   1/6  1/6  1/6  1/6  1/6  1/6  6/6 = 1

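The two conditions for a probability function can be checked mechanically. The sketch below is only an illustration: the function name `is_probability_function` and the dictionary layout are assumptions, not part of the original notes. Exact fractions are used so that the sum is not affected by rounding.

```python
from fractions import Fraction


def is_probability_function(probs):
    """Return True if every probability lies in [0, 1] and they sum to 1.

    `probs` maps each value of the random variable to its probability.
    """
    in_range = all(0 <= p <= 1 for p in probs.values())
    sums_to_one = sum(probs.values()) == 1
    return in_range and sums_to_one


# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# A table whose probabilities do not sum to 1 fails the check.
not_a_distribution = {1: Fraction(1, 2), 2: Fraction(1, 3)}

print(is_probability_function(die))
print(is_probability_function(not_a_distribution))
```

Note that the values of the random variable (here 1 to 6) are not restricted; only the probabilities are checked.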
All probability distributions can be classified as discrete probability distributions or as continuous
probability distributions, depending on whether they define probabilities associated with discrete
variables or continuous variables.

Discrete random variable:

A discrete random variable has only a finite or countable number of outcomes. Ex: marital status (single, married, divorced, widowed), the number of infections an infant develops during his or her first year of life.

The Discrete Uniform Distribution:


The simplest probability distribution occurs when all of the values of a random variable occur with equal
probability. This probability distribution is called the uniform distribution.

Uniform Distribution. Suppose the random variable X can assume k different values x1, x2, ..., xk. Suppose also that P(X = xi) is the same for every value xi. Then,

P(X = xi) = 1/k

Example:
Suppose a die is tossed. What is the probability that the die will land on 5 ?

Solution: When a die is tossed, there are 6 possible outcomes, represented by S = { 1, 2, 3, 4, 5, 6 }. Each possible outcome is a value of the random variable X, and each outcome is equally likely to occur. Thus, we have a uniform distribution. Therefore, P(X = 5) = 1/6.

The term "uniform distribution" is also used to describe the shape of a graph that plots observed values
in a set of data. Graphically, when the observed values in a set of data are equally spread across the
range of the data set, the distribution is also called a uniform distribution. Graphically, a uniform
distribution has no clear peaks.

Another example:

If we roll a fair die, we assume that the probability of obtaining each number is 1/6. If X is the random variable (r.v.) "the number showing", the probability distribution table is the same as the one for the single fair die above. The probability distribution function (p.d.f.) is

P(X = x) = 1/6, x = 1, 2, 3, 4, 5, 6

The distribution with equal probabilities is called "uniform". It is also possible to have a continuous uniform distribution, where the r.v. can be any number in a given interval. The continuous uniform distribution is also called the rectangular distribution.

HOW TO GRAPH THE UNIFORM DISTRIBUTION:

A continuous distribution can't be illustrated with a histogram, because this would require
an infinite number of bars.

With the uniform distribution, all values over an interval (a, b) are equally likely to occur. As a result, the
graph that illustrates this distribution is a rectangle. The figure shows the uniform distribution defined
over the interval (0, 10).
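[Figure: The uniform distribution defined over the interval (0, 10)]

The area of a rectangle equals the base times the height. For the uniform distribution over (a, b), the base is b − a and the height is 1/(b − a), so the total area under the graph is (b − a) × 1/(b − a) = 1.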
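As a small illustration of the rectangle picture, the sketch below computes the height and area for the interval (0, 10). The function name `uniform_density` is an assumption introduced here, and the height 1/(b − a) is the standard density of the continuous uniform distribution.

```python
def uniform_density(x, a, b):
    """Density of the continuous uniform distribution on (a, b)."""
    return 1.0 / (b - a) if a < x < b else 0.0


a, b = 0.0, 10.0
height = uniform_density(5.0, a, b)  # height of the rectangle
area = (b - a) * height              # base times height

print(height, area)
```

The area comes out as 1 for any interval, which is what makes the rectangle a valid probability distribution.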

The Binomial Distribution:

A binomial experiment is one that possesses the following properties:

1. The experiment consists of n repeated trials;


2. Each trial results in an outcome that may be classified as a success or a failure (hence the name,
binomial);
3. The probability of a success , denoted by p , remains constant from trial to trial and repeated
trials are independent;

The number of successes X in n trials of a binomial experiment is called a binomial random variable.

The probability distribution of the random variable X is called a binomial distribution, and is given by the formula:

$P(X = x) = \binom{n}{x} p^x q^{n-x}$

Where

$\binom{n}{x}$ = the number of ways of choosing x trials out of n (also written nCx)

n = the number of trials

x = 0, 1, 2, …, n (the number of successes)

p = the probability of success in a single trial

q = 1 − p = the probability of failure in a single trial

P(X = x) = the probability of exactly x successes in n binomial trials.

Examples of binomial experiments

• Tossing a coin 20 times to see how many tails occur.
• Asking 200 people if they watch ABC news.
• Rolling a die to see if a 5 appears.

Example:

A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?

1. Success = "A head is flipped on a single coin"
2. p = 0.5
3. q = 0.5
4. n = 10
5. x = 6

P(x=6) = 10C6 * 0.5^6 * 0.5^4 = 210 * 0.015625 * 0.0625 = 0.205078125
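The same calculation can be reproduced in a few lines of Python. This is a minimal sketch; the function name `binomial_pmf` is chosen here for illustration, and `math.comb` supplies the nCx term.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure in a single trial
    return comb(n, x) * p**x * q ** (n - x)


# Coin example: n = 10 tosses, p = 0.5, exactly x = 6 heads.
prob_six_heads = binomial_pmf(6, 10, 0.5)
print(prob_six_heads)

# Sanity check: the probabilities over x = 0..n should sum to 1.
total = sum(binomial_pmf(x, 10, 0.5) for x in range(11))
print(total)
```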

MEAN AND VARIANCE OF BINOMIAL DISTRIBUTION:


If p is the probability of success and q is the probability of failure in a binomial trial, then the expected number of successes in n trials (i.e. the mean value of the binomial distribution) is

$E(X) = \mu = np$

The variance of the binomial distribution is

$V(X) = \sigma^2 = npq$

STANDARD DEVIATION OF BINOMIAL DISTRIBUTION:

The standard deviation of the binomial distribution is

$\sigma = \sqrt{npq}$

EXAMPLE:
Find the mean, variance and standard deviation for the number of sixes that appear when rolling 30 dice.

Success = A six is rolled on a single die

$n = 30; \quad p = \frac{1}{6}; \quad q = \frac{5}{6}$

MEAN:
$E(X) = np = 30 \cdot \frac{1}{6} = 5$

VARIANCE:
$V(X) = npq = 30 \cdot \frac{1}{6} \cdot \frac{5}{6} = \frac{25}{6}$

STANDARD DEVIATION:
The standard deviation is the square root of the variance: $\sigma = \sqrt{25/6} \approx 2.04124$

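The three quantities in this example follow directly from n and p, as the short sketch below shows. The helper name `binomial_summary` is an assumption made for illustration.

```python
from math import sqrt


def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial distribution."""
    q = 1 - p
    mean = n * p
    variance = n * p * q
    return mean, variance, sqrt(variance)


# Number of sixes when rolling 30 dice: n = 30, p = 1/6.
mean, variance, sd = binomial_summary(30, 1 / 6)
print(mean, variance, sd)
```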
POISSON DISTRIBUTION:
The Poisson distribution was developed by the French mathematician Siméon Denis Poisson in 1837.

The Poisson distribution is a discrete probability distribution for the counts of events that occur
randomly in a given interval of time (or space).

The Poisson random variable satisfies the following conditions:

1. The number of successes in two disjoint time intervals is independent.

2. The probability of a success during a small time interval is proportional to the entire length
of the time interval.

Apart from disjoint time intervals, the Poisson random variable also applies to disjoint regions of space. The probability distribution of a Poisson random variable X representing the number of successes occurring in a given time interval or a specified region of space is given by the formula:

$P(X = x) = \frac{e^{-\mu} \mu^x}{x!}$

Where
x = 0, 1, 2, 3, …
e ≈ 2.71828
μ = mean number of successes in the given time interval or region of space

Mean and Variance of Poisson distribution:

If μ is the average number of successes occurring in a given time interval or region in the Poisson
distribution, then the mean and the variance of the Poisson distribution are both equal to μ.

E(X) = μ

and

V(X) = σ² = μ

Note: In a Poisson distribution, only one parameter, μ, is needed to determine the probability of an event.

Applications:
• the number of deaths by horse kicking in the Prussian army (first application)
• birth defects and genetic mutations
• rare diseases (like leukemia, but not AIDS because it is infectious and so not independent), especially in legal cases
• car accidents
• traffic flow and ideal gap distance
• number of typing errors on a page
• hairs found in McDonald's hamburgers
• spread of an endangered animal in Africa
• failure of a machine in one month
• the number of meteors greater than 1 meter in diameter that strike Earth in a year
• the number of occurrences of the DNA sequence "ACGT" in a gene
• the number of patients arriving in an emergency room between 11 and 12 pm

Examples:

1. The average number of homes sold by the Acme Realty Company is 2 homes per
day. What is the probability that exactly 3 homes will be sold tomorrow?

Solution:

This is a Poisson experiment in which we know the following:

• μ = 2; since 2 homes are sold per day, on average.
• x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.
• e = 2.71828; since e is a constant equal to approximately 2.71828.

We plug these values into the Poisson formula as follows:

$P(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}$
$P(3; 2) = \frac{(2.71828^{-2})(2^3)}{3!}$
$P(3; 2) = \frac{(0.13534)(8)}{6}$
$P(3; 2) = 0.180$

Thus, the probability of selling exactly 3 homes tomorrow is 0.180.
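The Poisson formula used in this solution translates directly into code. The following is a minimal sketch; the function name `poisson_pmf` is an illustrative choice, not something defined in the original notes.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean number of events is mu."""
    return exp(-mu) * mu**x / factorial(x)


# Acme Realty example: mu = 2 homes per day, exactly x = 3 homes tomorrow.
p_three_homes = poisson_pmf(3, 2)
print(round(p_three_homes, 3))
```

The same function can be reused for any other mean and count, for instance a mean of 8 events and a count of 11.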

2. Suppose you knew that the mean number of calls to a fire station on a weekday is 8. What is the probability that on a given weekday there would be 11 calls? This problem can be solved using the Poisson formula with μ = 8 and x = 11.

Solution:

$P(11; 8) = \frac{e^{-8} \cdot 8^{11}}{11!} \approx 0.072$

Thus, the probability of 11 calls on a given weekday is 0.072.

Practice Example 1:

If electricity power failures occur according to a Poisson distribution with an average of 3 failures every twenty weeks, calculate the probability that there will not be more than one failure during a particular week.

Practice Example 2:

A company makes electric motors. The probability an electric motor is defective is 0.01. What is
the probability that a sample of 300 electric motors will contain exactly 5 defective motors?

Cumulative Poisson Probability:

A cumulative Poisson probability refers to the probability that the Poisson random variable is
greater than some specified lower limit and less than some specified upper limit.

CUMULATIVE DISTRIBUTION FUNCTION

3. Suppose the average number of lions seen on a 1-day safari is 5. What is the
probability that tourists will see fewer than four lions on the next 1-day safari?

Solution: 

This is a Poisson experiment in which we know the following:

• μ = 5; since 5 lions are seen per safari, on average.
• x = 0, 1, 2, or 3; since we want to find the likelihood that tourists will see fewer than 4 lions; that is, we want the probability that they will see 0, 1, 2, or 3 lions.
• e = 2.71828; since e is a constant equal to approximately 2.71828.

To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3 lions. Thus, we need to calculate the sum of four probabilities:

P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5).

To compute this sum, we use the Poisson formula:

$P(x \le 3; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)$
$P(x \le 3; 5) = \frac{e^{-5} \cdot 5^0}{0!} + \frac{e^{-5} \cdot 5^1}{1!} + \frac{e^{-5} \cdot 5^2}{2!} + \frac{e^{-5} \cdot 5^3}{3!}$
$P(x \le 3; 5) = \frac{(0.006738)(1)}{1} + \frac{(0.006738)(5)}{1} + \frac{(0.006738)(25)}{2} + \frac{(0.006738)(125)}{6}$
$P(x \le 3; 5) = 0.006738 + 0.03369 + 0.084224 + 0.140375$
$P(x \le 3; 5) = 0.2650$

Thus, the probability of seeing no more than 3 lions is 0.2650.
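A cumulative Poisson probability is just a sum of individual Poisson probabilities, which the sketch below makes explicit. The function name `poisson_cdf` is an assumption introduced for illustration.

```python
from math import exp, factorial


def poisson_cdf(k, mu):
    """P(X <= k): sum of the Poisson probabilities for x = 0, 1, ..., k."""
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(k + 1))


# Safari example: mu = 5 lions per day, fewer than four lions means x <= 3.
p_fewer_than_four = poisson_cdf(3, 5)
print(round(p_fewer_than_four, 4))
```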

CONTINUOUS PROBABILITY DISTRIBUTION:


DEFINITION:

If a variable can take on any value between two specified values, it is called a continuous variable.

OR

If a random variable is a continuous variable, its probability distribution is called a continuous probability distribution.

DESCRIPTION:

A variable can take two types of values: fixed, separate numbers, known as discrete values, or any value within a specified range, known as continuous values. Based on these types of values, a data set is described as discrete or continuous. For continuous data, the values can lie anywhere within the specified range.

For Example:
• The time required to drive back from the office to home is continuous.
So the probability distribution of a random variable "Y", where "Y" takes continuous values, is termed a continuous probability distribution.

FORMULA:

Continuous variables are not restricted to discrete values; they can take values such as 1.1, 2.33, and so on. For a continuous uniform distribution on an interval [a, b], the probability density is constant:

$f(x) = \frac{1}{b - a}$, for a ≤ x ≤ b

If a probability density function is given as f(x), then the probability of finding the continuous variable in the range [a, b] can be obtained by the formula:

$P(a \le x \le b) = \int_a^b f(x)\,dx$

Example:

• Find the probability density of a random number drawn uniformly from the interval [2, 7].

Solution: The density of a uniform random number on [2, 7] is $\frac{1}{7 - 2} = \frac{1}{5}$.

Word Problems:
Problem 1: 

Find the probability density of a random variable uniformly distributed on [34, 88].

Solution:

The density is given by the formula $\frac{1}{b - a}$.

b = 88, a = 34
Density = $\frac{1}{88 - 34} = \frac{1}{54}$

Problem 2: 

For a given probability density function $f(x) = x^{-2}$ for $x \ge 1$, find $P(X \le 2)$.

Solution:

The interval given is [1, 2].

$P(X \le 2) = \int_1^2 x^{-2}\,dx$
$= \left[ \frac{x^{-1}}{-1} \right]_1^2$
$= \left[ -\frac{1}{x} \right]_1^2 = -\frac{1}{2} - \left( -\frac{1}{1} \right)$
$= \frac{1}{2} = 0.5$

Hence, the probability is 0.5.

Problem 3: 

Find $P(2 \le X \le 4)$ for the probability density function $f(x) = 2x^{-3}$ for $x \ge 1$.

Solution:

The given interval is [2, 4].

$P(2 \le X \le 4) = \int_2^4 2x^{-3}\,dx$
$= \left[ \frac{2x^{-3+1}}{-3+1} \right]_2^4 = \left[ -x^{-2} \right]_2^4$
$= -\left[ \frac{1}{16} - \frac{1}{4} \right] = \frac{3}{16}$

The probability is $\frac{3}{16}$.
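The integrals in Problems 2 and 3 can be cross-checked numerically. The sketch below uses a simple midpoint rule, which is a swapped-in numerical technique rather than the analytic method used in the solutions; the helper name `integrate` and the step count are assumptions.

```python
def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


# Problem 2: f(x) = x^(-2) on [1, 2], analytic answer 1/2.
p2 = integrate(lambda x: x**-2, 1, 2)

# Problem 3: f(x) = 2x^(-3) on [2, 4], analytic answer 3/16.
p3 = integrate(lambda x: 2 * x**-3, 2, 4)

print(p2, p3)
```

Both approximations agree with the analytic answers to many decimal places, which is a useful check when the antiderivative is easy to get wrong.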

Problem 4: 

Find the expected value over the interval [5, 11] for a uniform distribution.

Solution:

The expected value of a uniform distribution on an interval [a, b] is the midpoint of the interval, $\frac{a + b}{2}$.

$E(X) = \frac{5 + 11}{2} = \frac{16}{2} = 8$.

APPLICATIONS:

• Random angle distribution
• Cauchy distribution
• Exponential distribution
• Beta distribution
• Gamma distribution

Continuous probability distribution can be further classified as follows:

1. Normal distribution (including the standard normal distribution)
2. Student's t distribution
3. Chi-square distribution

NORMAL DISTRIBUTION:
The normal (or Gaussian) distribution is a very common distribution. A normal distribution is completely defined by specifying its mean (the value of µ) and its variance (the value of σ²). The normal distribution with mean µ and variance σ² is written N(µ, σ²). Hence the distribution N(20, 25) has a mean of 20 and a standard deviation of 5; remember that the second "coordinate" is the variance, and the variance is the square of the standard deviation. In other words, the normal distributions form a special class of distributions that are symmetric and can be described by two parameters:

(i) µ = the mean of the distribution
(ii) σ = the standard deviation of the distribution

Changing the values of µ and σ alters the positions and shapes of the distributions.

Formula:

The probability density of the Normal distribution is given by

$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$

PROPERTIES OF NORMAL DISTRIBUTION:

• Bell-shaped

• The mean, median, and mode are equal and located at the center of the distribution.

• Uni-modal (only 1 mode).

• Symmetrical about the mean.

• The curve is continuous, i.e., there are no gaps or holes. For each value of X, there is a corresponding value of Y.

• The curve never touches the x-axis. Theoretically, no matter how far in either direction the curve
extends, it never meets the x-axis, but it gets increasingly closer to the x-axis.

• The total area under the normal distribution curve is equal to 1.00 or 100%.

• .50 or 50% of the area lies to the left of the mean.

• .50 or 50% of the area lies to the right of the mean

• P(a < z < b) is the area below the normal curve, above the z-axis between z = a and z = b

APPLICATIONS:

• Many classical statistical tests are based on the assumption that the data follow a normal distribution. This assumption should be tested before applying these tests.
• In modeling applications, such as linear and non-linear regression, the error term is often assumed to follow a normal distribution with fixed location and scale.
• The normal distribution is used to find significance levels in many hypothesis tests and confidence intervals.

EXAMPLES:

The distribution of IQ scores is normal with a mean of 100 and standard deviation of 15.

SOLUTION:

In other words, μ = 100 and σ = 15. The probability density function is shown below.

Notice that the horizontal axis shows IQ score and the bell is centered at the mean of 100.
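To make the shape of the IQ density concrete, the sketch below evaluates the normal density at a few points. The function name `normal_pdf` is an illustrative assumption; the formula is the standard normal density with mean μ and standard deviation σ.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at the point x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)


# IQ example: mu = 100, sigma = 15.
peak = normal_pdf(100, 100, 15)    # the bell is highest at the mean
left = normal_pdf(85, 100, 15)     # one standard deviation below the mean
right = normal_pdf(115, 100, 15)   # one standard deviation above the mean

print(peak, left, right)
```

The values at 85 and 115 are equal, which reflects the symmetry of the curve about the mean, and both are smaller than the value at 100.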

STANDARD NORMAL DISTRIBUTION:


A standard normal distribution is a normal distribution with zero mean (μ = 0) and unit variance (σ² = 1). Any normally distributed value x can be converted to a standard normal value z (a z-score).

FORMULA:

$z = \frac{x - \mu}{\sigma}$

Standard Normal Distribution Table


A standard normal distribution table shows the cumulative probability associated with a particular z-score. The distribution table provides the probability that a normally distributed random variable Z, with mean equal to 0 and variance equal to 1, is less than or equal to z.

Examples:

1. If the random variable X is described by the distribution N(45, 0.000625), then what is the transformation to obtain the standardized normal variable?

Solution

Here, µ = 45 and σ² = 0.000625, so that σ = 0.025.

Hence,

Z = (X − 45)/0.025 is the required transformation.

2. Let the random variable x be the time required to transport the product from
the manufacturing facility in China to the United States. Suppose that x is
normally distributed with a mean of 24 days and a standard deviation of 3
days. What is the probability that a shipment will arrive within 30 days—
P(x less than or equal to 30)?

Solution:

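To determine the probability, you must first calculate the z-score. With the information from the example, you know that

x = 30
μ = 24
σ = 3

Substitute this information into the z-score formula to standardize the value of 30:

$z = \frac{x - \mu}{\sigma} = \frac{30 - 24}{3} = 2$

From the standard normal table, P(Z ≤ 2) ≈ 0.9772, so the probability that a shipment arrives within 30 days is about 0.977.
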
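The standardization step for the shipping example can be sketched in Python as follows. The function names are assumptions made for illustration, and the cumulative probability is computed from the error function `math.erf` instead of being read from a printed table.

```python
from math import erf, sqrt


def z_score(x, mu, sigma):
    """Standardize x for a normal distribution with mean mu and s.d. sigma."""
    return (x - mu) / sigma


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal random variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Shipping example: x = 30 days, mean 24 days, standard deviation 3 days.
z = z_score(30, 24, 3)
p_within_30_days = standard_normal_cdf(z)

print(z, round(p_within_30_days, 4))
```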
STUDENT’S T DISTRIBUTION:
DEFINITION:

The t distribution (aka, Student's t-distribution) is a probability distribution that is used to estimate population parameters when the sample size is small and/or when the population variance is unknown.
DESCRIPTION:

According to the central limit theorem, the sampling distribution of a statistic (like a sample mean)
will follow a normal distribution, as long as the sample size is sufficiently large. Therefore, when we
know the standard deviation of the population, we can compute a z-score, and use the normal
distribution to evaluate probabilities with the sample mean.

But sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these problems occurs, statisticians rely on the distribution of the t statistic (also known as the t score), whose values are given by:

t = [ x̅ - μ ] / [ s / sqrt( n ) ]

Where,

x̅ = sample mean
µ = population mean
s = standard deviation of the sample
n = sample size
t = t statistic (t score)
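As a rough illustration of the t statistic formula, the sketch below computes t for a small sample. The sample values and the hypothesized population mean are made up for this example, and the function name `t_statistic` is an assumption; `statistics.stdev` returns the sample standard deviation (dividing by n − 1).

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)) for a list of observations."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))


# Hypothetical data: 8 observations, tested against a population mean of 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t = t_statistic(sample, 4)
print(t)
```

With n = 8 observations, this t value would be compared against a t distribution with n − 1 = 7 degrees of freedom.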

PROPERTIES OF T DISTRIBUTION:

The t distribution has the following properties:

• The mean of the distribution is equal to 0.
• The variance is equal to v / ( v - 2 ), where v is the degrees of freedom (see last section) and v > 2.
• The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.

GRAPHS:

FORMULA:

FORMULA FOR VARIANCE:

FORMULA FOR MEAN:

$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i$

EXAMPLE:

Let T follow a t-distribution with r = 8 df.  What is the probability that the absolute value
of T is less than 2.306?
Solution. 
The probability calculation is quite similar to a calculation we'd have to make for a normal random
variable. First, rewriting the probability in terms of T instead of the absolute value of T, we get:
P(|T | < 2.306) = P(−2.306 < T < 2.306)
Then, we have to rewrite the probability in terms of cumulative probabilities that we can actually
find, that is:

P(|T | < 2.306) = P(T  < 2.306) − P(T  < −2.306)


Pictorially, the probability we are looking for looks something like this:

But the t-table doesn't contain negative t-values, so we'll have to take advantage of the symmetry of
the T distribution. That is:
P(|T | < 2.306) = P(T < 2.306) − P(T  > 2.306)
Can you find the necessary t-values on the t-table?

The t-table tells us that P(T < 2.306) = 0.975 and P(T  > 2.306) = 0.025. Therefore:


P(|T | < 2.306) = 0.975 − 0.025 = 0.95
CHI-SQUARE DISTRIBUTION:

DEFINITION:
In probability theory and statistics, the chi-squared distribution (also chi-square or χ²-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables.

DESCRIPTION:

Suppose we conduct the following statistical experiment. We select a random sample of size n from a normal population, having a standard deviation equal to σ. We find that the standard deviation in our sample is equal to s. Given these data, we can define a statistic, called chi-square, using the following

EQUATION:

$\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$

The distribution of the chi-square statistic is called the chi-square distribution. The chi-square distribution is defined by the following probability density function:

$Y = Y_0 \, (\chi^2)^{(v/2 - 1)} \, e^{-\chi^2 / 2}$

where $Y_0$ is a constant that depends on the number of degrees of freedom, χ² is the chi-square statistic, v = n - 1 is the number of degrees of freedom, and e is a constant equal to the base of the natural logarithm system (approximately 2.71828). $Y_0$ is defined so that the area under the chi-square curve is equal to one.

In the figure below, the red curve shows the distribution of chi-square values computed from all
possible samples of size 3, where degrees of freedom is n - 1 = 3 - 1 = 2. Similarly, the green curve
shows the distribution for samples of size 5 (degrees of freedom equal to 4); and the blue curve, for
samples of size 11 (degrees of freedom equal to 10).

[Figure: chi-square distributions for 2, 4, and 10 degrees of freedom]

PROPERTIES:

The chi-square distribution has the following properties:

• The mean of the distribution is equal to the number of degrees of freedom: μ = v.
• The variance is equal to two times the number of degrees of freedom: σ² = 2 * v.
• When the degrees of freedom are greater than or equal to 2, the maximum value for Y occurs when χ² = v - 2.
• As the degrees of freedom increase, the chi-square curve approaches a normal distribution.

EXAMPLE:

1. The Acme Battery Company has developed a new cell phone battery. On average, the
battery lasts 60 minutes on a single charge. The standard deviation is 4 minutes.
Suppose the manufacturing department runs a quality control test. They randomly
select 7 batteries. The standard deviation of the selected batteries is 6 minutes. What
would be the chi-square statistic represented by this test?

Solution

We know the following:

• The standard deviation of the population is 4 minutes.
• The standard deviation of the sample is 6 minutes.
• The number of sample observations is 7.

To compute the chi-square statistic, we plug these data in the chi-square equation, as shown below.

$\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$
$\chi^2 = \frac{(7 - 1) \cdot 6^2}{4^2} = 13.5$

where χ² is the chi-square statistic, n is the sample size, s is the standard deviation of the sample, and σ is the standard deviation of the population.
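The chi-square statistic for the battery example can be reproduced with a one-line function. This is a minimal sketch; the function and parameter names are chosen here for illustration.

```python
def chi_square_statistic(n, sample_sd, population_sd):
    """Chi-square statistic (n - 1) * s^2 / sigma^2 for a sample of size n."""
    return (n - 1) * sample_sd**2 / population_sd**2


# Battery example: n = 7 batteries, sample s.d. 6 minutes, population s.d. 4 minutes.
chi2 = chi_square_statistic(7, 6, 4)
print(chi2)
```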

Problem 2

Let's revisit the problem presented above. The manufacturing department ran a quality control test, using 7 randomly selected batteries. In their test, the standard deviation was 6 minutes, which equated to a chi-square statistic of 13.5.

Suppose they repeated the test with a new random sample of 7 batteries. What is the probability that the
standard deviation in the new test would be greater than 6 minutes?

Solution

We know the following:

• The sample size n is equal to 7.
• The degrees of freedom are equal to n - 1 = 7 - 1 = 6.
• The chi-square statistic is equal to 13.5 (see Example 1 above).

Given the degrees of freedom, we can determine the cumulative probability that the chi-square statistic will fall between 0 and any positive value. To find the cumulative probability that a chi-square statistic falls between 0 and 13.5, we enter the degrees of freedom (6) and the chi-square statistic (13.5) into the Chi-Square Distribution Calculator. The calculator displays the cumulative probability: 0.96.

This tells us that the probability that a standard deviation would be less than or equal to 6 minutes is 0.96. This means (by the subtraction rule) that the probability that the standard deviation would be greater than 6 minutes is 1 - 0.96 or 0.04.
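The cumulative probability reported by the calculator can be cross-checked without special software when the degrees of freedom are even. The sketch below relies on a known closed form for that case, P(χ² ≤ x) = 1 − e^(−x/2) Σ (x/2)^j / j! for j = 0 to v/2 − 1; it is not the method used in the original solution, and the function name is an assumption.

```python
from math import exp, factorial


def chi_square_cdf_even_df(x, df):
    """P(chi-square <= x) for an even, positive number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    # Upper-tail probability as a finite sum of Poisson-like terms.
    tail = exp(-half) * sum(half**j / factorial(j) for j in range(df // 2))
    return 1 - tail


# Battery example: chi-square statistic 13.5 with 6 degrees of freedom.
cdf = chi_square_cdf_even_df(13.5, 6)
print(round(cdf, 2), round(1 - cdf, 2))
```

For odd degrees of freedom a statistics library or a table would be needed instead.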
