Statistics Notes

Statistics
Statistics is a mathematical science pertaining to the collection, analysis, interpretation or
explanation, and presentation of data. It also deals with prediction and forecasting based on data.
It is applicable to a wide variety of academic disciplines, from the natural and social
sciences to the humanities, government and business.

The word statistics has been derived from the Latin word status. In the plural sense it means
a set of numerical figures called data obtained by counting or measurement. In the singular
sense it means collection, classification, presentation, analysis, comparison and meaningful
interpretation of raw data.
It has been defined in different ways by different authors. Croxton and Cowden defined it as
the science which deals with the collection, analysis and interpretation of numerical data.
Statistical data help us to understand economic problems, e.g., balance of trade, disparities
of income and wealth, national income accounts, supply and demand curves, cost-of-living and
wholesale price index numbers, production, consumption, etc., and to formulate economic theories
and test old hypotheses. They also help in planning and forecasting.
The success of modern business firms depends on the proper analysis of statistical data. Before
expanding or diversifying an existing business or setting up a new venture, the top
executives must statistically analyse all facts such as raw material prices, consumer preferences,
sales records, demand for products, labour conditions, taxes, etc. Such analysis helps to determine
the location and size of the business, to introduce new products or drop an existing product, and
to fix product prices and administration. It also has wide application in Operations Research.
LIMITATIONS OF STATISTICS AND ITS CHARACTERISTICS
(i) Statistics studies a group but not an individual.
(ii) Statistics cannot be applied to study qualitative phenomena.
(iii) Statistical decisions are true on an average only. For better results a large number of
observations are required.
(iv) Statistical data are not mathematically accurate.
(v) Statistical data must be analysed by statistical experts; otherwise the results may be
misleading.
(vi) The laws of statistics are not exact like the laws of sciences. The first law states that a
moderately large number of items chosen at random from a large group are almost sure, on the
average, to possess the characteristics of the large group. The second law states that, other
things being equal, as the sample size increases, the result tends to be more reliable and
accurate.
(vii) Statistics collected for a given purpose must be used for that purpose only.
(viii) Statistical relations do not always establish the cause and effect relationship between
phenomena.
(ix) A statistical enquiry has four phases, viz.,
(a) Collection of data;
(b) Classification and tabulation of data;
(c) Analysis of data;
(d) Interpretation of data.
FUNCTIONS OF STATISTICS
(i) It simplifies complex data and helps us to study the trends and relationships of different
phenomena and compare them.
(ii) It helps us to classify numerical data, measure uncertainty, test hypotheses, formulate
policies and draw valid inferences.

Types of Statistics
Descriptive Statistics are used to describe the data set.
Examples: graphing, calculating averages, looking for extreme scores
Inferential Statistics allow you to infer something about the parameters of the
population based on the statistics of the sample and the various tests we perform on the
sample.
Examples: Chi-Square, T-Tests, Correlations, ANOVA

Measures of Central Tendency
A measure of central tendency is a way of summarising data using the value which is most typical.
The three most commonly used measures of central tendency are the Mean, Median and Mode.
Mean
The sum of the values divided by the number of values--often called the "average."
- Add all of the values together.
- Divide by the number of values to obtain the mean.
Example: The mean of 7, 12, 24, 20, 19 is (7 + 12 + 24 + 20 + 19) / 5 = 16.4.
Median
The value which divides the values into two equal halves, with half of the values being
lower than the median and half higher than the median.
- Sort the values into ascending order.
- If you have an odd number of values, the median is the middle value.
- If you have an even number of values, the median is the arithmetic mean (see above) of
the two middle values.
Example: The median of the same five numbers (7, 12, 24, 20, 19) is 19.
Mode
The most frequently-occurring value (or values).
- Calculate the frequencies for all of the values in the data.
- The mode is the value (or values) with the highest frequency.
Example: For individuals having the following ages -- 18, 18, 19, 20, 20, 20, 21, and 23, the
mode is 20.
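The three worked examples above can be checked with a short Python sketch. It relies only on the standard-library statistics module; the variable names are illustrative and are not part of the notes.

```python
from statistics import mean, median, multimode

# Data taken from the worked examples above.
values = [7, 12, 24, 20, 19]
ages = [18, 18, 19, 20, 20, 20, 21, 23]

mean_value = mean(values)      # sum of the values divided by the number of values
median_value = median(values)  # middle value after sorting into ascending order
modes = multimode(ages)        # list of every value that has the highest frequency

print("mean:", mean_value)
print("median:", median_value)
print("mode(s):", modes)
```

multimode returns a list because a data set can have more than one mode; for the ages above the list holds a single value.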

Dispersion
Measurements of central tendency (mean, mode and median) locate the distribution within the
range of possible values; measurements of dispersion describe the spread of values.
The dispersion of values within variables is especially important in social and political research
because:
- Dispersion or "variation" in observations is what we seek to explain.
- Researchers want to know WHY some cases lie above average and others below average
for a given variable:
o TURNOUT in voting: why do some states show higher rates than others?
o CRIMES in cities: why are there differences in crime rates?
o CIVIL STRIFE among countries: what accounts for differing amounts?
- Much of statistical explanation aims at explaining DIFFERENCES in observations -- also
known as
o VARIATION, or the more technical term, VARIANCE.

Range
The range is the simplest measure of dispersion. The range can be thought of in two ways.
As a quantity: the difference between the highest and lowest scores in a distribution.
Variance and Standard Deviation
By far the most commonly used measures of dispersion in the social sciences are
variance and standard deviation. Variance is the average squared difference of scores
from the mean score of a distribution. Standard deviation is the square root of the
variance.
In calculating the variance of data points, we square the difference between each point
and the mean because if we summed the differences directly, the result would always be
zero. For example, suppose three friends work on campus and earn $5.50, $7.50, and $8
per hour, respectively. The mean of these values is $(5.50 + 7.50 + 8)/3 = $7 per hour. If
we summed the differences of each wage from the mean, we would get (5.50-7) + (7.50-7)
+ (8-7) = -1.50 + .50 + 1 = 0. Instead, we square the terms to obtain a sum of squared
differences equal to 2.25 + .25 + 1 = 3.50. Dividing this sum by the number of scores gives
the variance, 3.50/3 = 1.17 (rounded), which is a measure of dispersion in the set of scores.
The sum of squared differences from the mean is the minimum sum of squared differences of
each score from any number. In other words, if we used any number other than the mean as the
value from which each score is subtracted, the resulting sum of squared differences would be
greater. (You can try it yourself -- see if any number other than 7 can be plugged into the
preceding calculation and yield a sum of squared differences less than 3.50.)

The standard deviation is simply the square root of the variance. In some sense, taking the
square root of the variance "undoes" the squaring of the differences that we did when we
calculated the variance.
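Variance and standard deviation of a population are designated by \(\sigma^2\) and \(\sigma\),
respectively. Variance and standard deviation of a sample are designated by \(s^2\) and \(s\),
respectively.

            Variance                                        Standard Deviation
Population  \(\sigma^2 = \frac{\sum (X - \mu)^2}{N}\)       \(\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}\)
Sample      \(s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}\)    \(s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}\)

In these equations, \(\mu\) is the population mean, \(\bar{X}\) is the sample mean, N is the total
number of scores in the population, and n is the number of scores in the sample.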
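As a rough check of the wage example, the sketch below computes the sum of squared differences, then the variance and standard deviation. It assumes the usual convention that the population versions divide by N and the sample versions divide by n - 1, which is what Python's statistics module does; the helper name sum_of_squares is made up for this illustration.

```python
from statistics import pstdev, pvariance, stdev, variance

wages = [5.50, 7.50, 8.00]            # hourly wages from the example above
mean_wage = sum(wages) / len(wages)   # 7.0

def sum_of_squares(center):
    """Sum of squared differences of each wage from the given number."""
    return sum((w - center) ** 2 for w in wages)

ss_about_mean = sum_of_squares(mean_wage)   # 2.25 + .25 + 1

# Treating the three wages as a whole population (divide by N).
population_variance = pvariance(wages)
population_sd = pstdev(wages)

# Treating the three wages as a sample (divide by n - 1, an assumed convention).
sample_variance = variance(wages)
sample_sd = stdev(wages)

print(ss_about_mean, population_variance, sample_variance)
# Any number other than the mean gives a larger sum of squares.
print(sum_of_squares(6.9), sum_of_squares(7.1))
```

Trying other values in sum_of_squares is a quick way to carry out the "try it yourself" exercise: the result never drops below the value obtained at the mean.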
Coefficient of Variation
This is the ratio of the standard deviation to the mean:

\[ CV = \frac{s}{\bar{X}} \]

The coefficient of variation describes the variation within the sample values relative to
their magnitude.
Co-efficient Of Variation ( C. V. )
To compare the variations ( dispersion ) of two different series, relative measures of standard
deviation must be calculated. This is known as co-efficient of variation or the co-efficient of s. d.
Its formula is

\[ \text{C. V.} = \frac{\text{s. d.}}{\text{mean}} \times 100 \]

Thus it is defined as the ratio of the s. d. to the mean.
Remark: It is given as a percentage and is used to compare the consistency or variability of two
or more series. The higher the C. V., the higher the variability; the lower the C. V., the higher
the consistency of the data.
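A minimal sketch of the comparison described in the remark, assuming the C. V. is the standard deviation divided by the mean, expressed as a percentage. The two series are invented for illustration only, and the population standard deviation (divisor N) is an assumption; a sample standard deviation could be used instead.

```python
from statistics import mean, pstdev

def coefficient_of_variation(series):
    """C. V. as a percentage: standard deviation divided by mean, times 100."""
    return pstdev(series) / mean(series) * 100

# Hypothetical data: two series with the same mean but different spread.
series_a = [50, 52, 48, 51, 49]
series_b = [20, 80, 50, 35, 65]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print(f"C.V. of A = {cv_a:.2f}%, C.V. of B = {cv_b:.2f}%")
more_consistent = "A" if cv_a < cv_b else "B"
print("More consistent series:", more_consistent)
```

Both series have a mean of 50, so the means alone cannot separate them; the lower C. V. identifies the more consistent series.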
Correlations
The correlation is one of the most common and most useful statistics. A correlation is a single
number that describes the degree of relationship between two variables.
Correlation is a measure of the relation between two or more
variables. The measurement scales used should be at least interval scales, but other correlation
coefficients are available to handle other types of data. Correlation coefficients can range from
-1.00 to +1.00. The value of -1.00 represents a perfect negative correlation while a value of +1.00
represents a perfect positive correlation. A value of 0.00 represents a lack of correlation.
How to Interpret the Values of Correlations. As mentioned before, the correlation
coefficient (r) represents the linear relationship between two variables. If the correlation
coefficient is squared, then the resulting value (\(r^2\), the coefficient of determination) will
represent the proportion of common variation in the two variables (i.e., the "strength" or
"magnitude" of the relationship). In order to evaluate the correlation between variables, it is
important to know this "magnitude" or "strength" as well as the significance of the correlation.
In statistics, Spearman's rank correlation coefficient, named after Charles Spearman and often
denoted by the Greek letter \(\rho\) (rho) or as \(r_s\), is a non-parametric measure of
correlation; that is, it assesses how well an arbitrary monotonic function could describe the
relationship between two variables, without making any assumptions about the frequency
distribution of the variables.
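The notes do not give a formula for Spearman's coefficient. The sketch below uses the classical rank-difference formula, 1 - 6 * (sum of d^2) / (n(n^2 - 1)), which is valid when there are no tied values. The function names and the data are illustrative assumptions.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n. Assumes there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rank correlation via the rank-difference formula (no ties)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5]
y_cubed = [v ** 3 for v in x]      # monotonic but not linear in x

rho_increasing = spearman_rho(x, y_cubed)
rho_decreasing = spearman_rho(x, y_cubed[::-1])
rho_mixed = spearman_rho(x, [2, 1, 4, 3, 5])

print(rho_increasing, rho_decreasing, rho_mixed)
```

Because only the ranks are used, any strictly increasing relationship gives the same coefficient as a straight line would, which is the sense in which the measure describes a monotonic function.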
Correlation

Correlation can be easily understood as co-relation. By definition, correlation is the average
relationship between two or more variables. When a change in one variable is accompanied by a
change in the other variable, there is a correlation between these two variables.

These correlated variables can move in the same direction or they can move in opposite
directions. There is not always a cause and effect relationship between the variables when both
change; the association might be due to chance.

Simple Correlation is a correlation between two variables only; that is, the relationship
between two variables. Event correlation and simple event correlation are the types of
correlation mainly used from the industry point of view.
Types of Correlation

In management research methodology, correlation is broadly classified into six types as
follows:

(1) Positive Correlation
(2) Negative Correlation
(3) Perfectly Positive Correlation
(4) Perfectly Negative Correlation
(5) Zero Correlation
(6) Linear Correlation

Positive Correlation

When two variables move in the same direction then the correlation between these two variables
is said to be Positive Correlation.
When the value of one variable increases, the value of the other variable also increases.

An example is the training and performance of employees in a company.
Negative Correlation

In this type of correlation, the two variables move in the opposite direction.
When the value of a variable increases, the value of the other variable decreases.

For example, the relationship between price and demand.
Perfectly Positive Correlation

When there is a change in one variable, say X, and there is an equal proportion of change in the
other variable, say Y, in the same direction,
then these two variables are said to have a Perfectly Positive Correlation.
Perfectly Negative Correlation

Between two variables X and Y, if a change in X is accompanied by a change in Y in
equal proportion but in the opposite direction,
then this correlation is called Perfectly Negative Correlation.
Zero Correlation

When the two variables are independent and a change in one variable has no effect on the other
variable,
then the correlation between these two variables is known as Zero Correlation.
Linear Correlation

If the quantum of change in one variable bears a constant ratio to the quantum of change in the
other variable then it is known as Linear Correlation.

Degrees of Correlation
Through the coefficient of correlation, we can measure the degree or extent of the correlation
between two variables. On the basis of the coefficient of correlation we can also determine
whether the correlation is positive or negative and also its degree or extent.
1. Perfect correlation: If two variables change in the same direction and in the same
proportion, the correlation between the two is perfect positive. According to Karl
Pearson the coefficient of correlation in this case is +1. On the other hand, if the variables
change in opposite directions and in the same proportion, the correlation is perfect
negative. Its coefficient of correlation is -1. In practice we rarely come across these types
of correlations.
2. Absence of correlation: If two series of two variables exhibit no relation between them,
or a change in one variable does not lead to a change in the other variable, then we can firmly
say that there is no correlation between the two variables. In such
a case the coefficient of correlation is 0.
3. Limited degrees of correlation: If two variables are not perfectly correlated, nor is there a
perfect absence of correlation, then we term the correlation as Limited correlation. It may
be positive or negative but lies within the limits -1 and +1.
High degree, moderate degree or low degree are the three categories of this kind of correlation.
The following table reveals the effect ( or degree ) of the coefficient of correlation.
Methods Of Determining Correlation
We shall consider the following most commonly used methods: (1) Scatter Plot (2) Karl Pearson's
coefficient of correlation (3) Spearman's Rank-correlation coefficient.
1) Scatter Plot ( Scatter diagram or dot diagram ): In this method the values of the two
variables are plotted on graph paper. One is taken along the horizontal (x-axis) and the other
along the vertical (y-axis). By plotting the data, we get points (dots) on the graph which are
generally scattered and hence the name Scatter Plot.

i) If all points lie on a rising straight line, the correlation is perfectly positive and r = +1
(see fig.1)
ii) If all points lie on a falling straight line, the correlation is perfectly negative and r = -1
(see fig.2)
iii) If the points lie in a narrow strip, rising upwards, the correlation is of a high degree and
positive (see fig.3)
iv) If the points lie in a narrow strip, falling downwards, the correlation is of a high degree
and negative (see fig.4)
v) If the points are spread widely over a broad strip, rising upwards, the correlation is of a low
degree and positive (see fig.5)
vi) If the points are spread widely over a broad strip, falling downwards, the correlation is of a
low degree and negative (see fig.6)
vii) If the points are spread (scattered) without any specific pattern, the correlation is absent,
i.e. r = 0 (see fig.7)

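2) Karl Pearson's coefficient of correlation: It gives the numerical expression for the measure
of correlation. It is denoted by r. The value of r gives the magnitude of correlation and its sign
denotes its direction. It is defined as

\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N \, \sigma_X \, \sigma_Y} \]

where
\(\bar{X}\), \(\bar{Y}\) = Means of the two variables
\(\sigma_X\), \(\sigma_Y\) = Standard deviations of the two variables (computed with divisor N)
N = Number of pairs of observations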
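The calculation of r can be sketched as follows, assuming the usual product-moment definition: the sum of the products of deviations from the two means, divided by N times the two standard deviations (each computed with divisor N). The data and the function name are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations with divisor N (an assumption stated in the lead-in).
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n * sigma_x * sigma_y)

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2    # coefficient of determination

print(f"r = {r:.4f}, r squared = {r_squared:.2f}")
print(pearson_r(x, [2 * v + 1 for v in x]))   # points on a rising straight line
```

The sign of r gives the direction of the relationship, and r squared gives the proportion of common variation in the two variables.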

Linear regression is a form of regression analysis in which the relationship between
one or more independent variables and another variable, called the dependent variable, is modeled
by a least squares function, called the linear regression equation. This function is a linear
combination of one or more model parameters, called regression coefficients. A linear regression
equation with one independent variable represents a straight line. The results are subject to
statistical analysis.
From algebra, any straight line can be described as:
Y = a + bX, where a is the intercept and b is the slope

Linear Regression
Correlation gives us an idea of the magnitude and direction of the relationship between correlated
variables. Now it is natural to think of a method that helps us in estimating the value of one
variable when the other is known. Also, correlation does not imply causation. The fact that the
variables x and y are correlated does not necessarily mean that x causes y or vice versa. For
example, you would find that the number of schools in a town is correlated to the number of
accidents in the town. The reason for these accidents is not the school attendance; rather, both
increase with a third variable, the population. A statistical procedure called regression is
concerned with causation in a relationship among variables. It assesses the contribution of one or
more variables, called causing or independent variables, to the one which is being caused
(the dependent variable). When there is only one independent variable then the relationship is
expressed by a straight line. This procedure is called simple linear regression.
Regression can be defined as a method that estimates the value of one variable when that of
other variable is known, provided the variables are correlated. The dictionary meaning of
regression is "to go backward." It was used for the first time by Sir Francis Galton in his research
paper "Regression towards mediocrity in hereditary stature."
Lines of Regression: In the scatter plot, we have seen that if the variables are highly correlated
then the points (dots) lie in a narrow strip. If the strip is nearly straight, we can draw a
straight line such that all points are close to it on both sides. Such a line can be taken as an
ideal representation of the variation. This line is called the line of best fit if it minimizes the
distances of all data points from it.
This line is called the line of regression. Now prediction is easy because all we need to do
is to extend the line and read the value. Thus to obtain a line of regression, we need to have a line
of best fit. But statisticians don't measure the distances by dropping perpendiculars from points
on to the line. They measure deviations ( or errors or residuals as they are called) (i) vertically
and (ii) horizontally. Thus we get two lines of regression as shown in the figure (1) and (2).
(1) Line of regression of y on x
Its form is y = a + b x
It is used to estimate y when x is given
(2) Line of regression of x on y
Its form is x = a + b y
It is used to estimate x when y is given.
They are obtained (i) graphically - by Scatter plot (ii)
mathematically - by the method of least squares.
Regression can be used for prediction (including forecasting of time-series data), inference,
hypothesis testing, and modeling of causal relationships.
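The two lines of regression can be sketched with the standard least squares formulas: the slope is the sum of the products of deviations divided by the sum of squared deviations of the predictor, and the intercept makes the line pass through the point of means. The data and names below are hypothetical, and the two fitted lines use separate constants even though the notes write both forms with a and b.

```python
def least_squares_line(predictor, response):
    """Return (a, b) so that response = a + b * predictor is the least squares fit."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_r = sum(response) / n
    s_pr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    s_pp = sum((p - mean_p) ** 2 for p in predictor)
    b = s_pr / s_pp             # slope
    a = mean_r - b * mean_p     # intercept: the line passes through the means
    return a, b

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a_yx, b_yx = least_squares_line(x, y)   # line of regression of y on x
a_xy, b_xy = least_squares_line(y, x)   # line of regression of x on y

print(f"y on x: y = {a_yx:.2f} + {b_yx:.2f} x")
print(f"x on y: x = {a_xy:.2f} + {b_xy:.2f} y")

# Prediction by extending the y-on-x line to x = 6.
y_at_6 = a_yx + b_yx * 6
print("estimated y when x = 6:", round(y_at_6, 2))
```

The two fitted lines differ because one minimizes vertical deviations and the other horizontal deviations; they coincide only when the correlation is perfect.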

What is the difference between correlation and linear regression?
Correlation and linear regression are not the same. Consider these differences:
- Correlation quantifies the degree to which two variables are related. Correlation does not
find a best-fit line (that is regression). You simply are computing a correlation coefficient
(r) that tells you how much one variable tends to change when the other one does.
- With correlation you don't have to think about cause and effect. You simply quantify how
well two variables relate to each other. With regression, you do have to think about cause
and effect as the regression line is determined as the best way to predict Y from X.
- With correlation, it doesn't matter which of the two variables you call "X" and which you
call "Y". You'll get the same correlation coefficient if you swap the two. With linear
regression, the decision of which variable you call "X" and which you call "Y" matters a
lot, as you'll get a different best-fit line if you swap the two. The line that best predicts Y
from X is not the same as the line that predicts X from Y.
- Correlation is almost always used when you measure both variables. It rarely is
appropriate when one variable is something you experimentally manipulate. With linear
regression, the X variable is often something you experimentally manipulate (time,
concentration...) and the Y variable is something you measure.
Probable Error
It is used to assess the reliability of Karl Pearson's coefficient of correlation r, since the
value of r depends on random sampling and its conditions. It is given by

\[ \text{P. E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} \]

where N is the number of pairs of observations.
i. If the value of r is less than P. E., then there is no evidence of correlation, i.e. r is not
significant.
ii. If r is more than 6 times the P. E., r is practically certain, i.e. significant.
iii. By adding or subtracting P. E. to r, we get the upper and lower limits within which
r of the population can be expected to lie.
Symbolically \(\rho = r \pm \text{P. E.}\)
\(\rho\) = Correlation ( coefficient ) of the population.
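A small sketch of the three rules, assuming the standard form of the probable error, 0.6745 * (1 - r^2) / sqrt(N). The values of r and N below are invented for illustration.

```python
from math import sqrt

def probable_error(r, n_pairs):
    """Probable error of the coefficient of correlation r for n_pairs observations."""
    return 0.6745 * (1 - r ** 2) / sqrt(n_pairs)

# Hypothetical values, for illustration only.
r = 0.8
n_pairs = 25

pe = probable_error(r, n_pairs)
no_evidence = abs(r) < pe            # rule i
significant = abs(r) > 6 * pe        # rule ii
lower_limit, upper_limit = r - pe, r + pe   # rule iii

print(f"P.E. = {pe:.4f}")
print("significant:", significant)
print(f"population correlation expected between {lower_limit:.3f} and {upper_limit:.3f}")
```

When r lies between P. E. and 6 times P. E., neither rule applies and the sketch reports neither outcome.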

BASIC PROBABILITY

1.0 INTRODUCTION
Probability concepts are familiar to everyone. The weather forecaster states that the
probability of rain tomorrow is twenty percent. At the racetrack, the odds are three to
one that a certain horse will win the fifth race. Relating probability concepts to
manufacturing operations may not be as familiar as the above examples, but they work
the same way. Probability is the key to assessing the risks involved in the decision-
making process. The gambling casinos determine the probabilities for each game of
chance then make the rules so that the odds are always in their favor. The same can be
done for manufactured products. The probability of a certain number of defective parts
in a large lot can be determined. Also, the percentage of parts within a certain
dimension range can be predicted. If the desired results are not obtained, then
adjustments to the process can be made. Adjustments to a process in a manufacturing
operation are analogous to changing the rules in a casino game. The objective is to
obtain the desired results.
Since a major portion of statistical quality control and statistical process control deals
with probability concepts, it is important to have a good knowledge of probability. In a
manufacturing operation, there are very few occasions when complete information is
available. Therefore, information must be generalized from samples and limited known
facts. It is sometimes surprising to discover the vast amount of information and
knowledge about a process that can be obtained from a relatively small amount of data.
Probability is the building block of statistics and statistical quality control.

2.0 EVENTS
An event is defined as any outcome that can occur. There are two main categories of
events: Deterministic and Probabilistic.
A deterministic event always has the same outcome and is predictable 100% of the
time.
- Distance traveled = time x velocity
- The speed of light
- The sun rising in the east
- James Bond winning the fight without a scratch
A probabilistic event is an event for which the exact outcome is not predictable 100% of
the time.
- The number of heads in ten tosses of a coin
- The winner of the World Series
- The number of games played in a World Series
- The number of defects in a batch of product
In a boxing match there may be three possible events. (There could be more depending
on the question asked.)
- Fighter A wins
- Fighter B wins
- Draw
2.1 Four Basic Types of Events
- Mutually Exclusive Events: These are events that cannot occur at the same
time. The cause of mutually exclusive events could be a force of nature or a
man-made law. Being twenty-five years old and also becoming president of the United
States are mutually exclusive events because by law these two events cannot
occur at the same time.
- Complementary Events: These are events that have two possible outcomes.
The probability of event A plus the probability of A' equals one. P(A) + P(A') = 1.
Any event A and its complementary event A' are mutually exclusive. Heads or
tails in one toss of a coin are complementary events.
- Independent Events: These are two or more events for which the outcome of
one does not affect the other. They are events that are not dependent on what
occurred previously. Each toss of a fair coin is an independent event.
- Conditional Events: These are events that are dependent on what occurred
previously. If five cards are drawn from a deck of fifty-two cards, the likelihood of
the fifth card being an ace is dependent on the outcome of the first four cards.

3.0 PROBABILITY
Probability is defined as the chance that an event will happen or the likelihood that an
event will happen.
The definition of probability is

\[ \text{Probability} = \frac{\text{Number of favorable events}}{\text{Number of total events}} \]

The favorable events are the events of interest. They are the events that the question is
addressing. The total events are all possible events that can occur relevant to the
question asked. In this definition, favorable has nothing to do with something being
defective or non-defective.
What is the probability of a head occurring in one toss of a coin?
The number of favorable events is 1 (one head) and the number of total events is 2
(head or tail). In this case, the probability formula verifies what is obvious.

\[ P(\text{head}) = \frac{1}{2} = .5 \text{ or } 50\% \]

Probability numbers always range from 0 to 1 in decimals or from 0 to 100 in
percentages.
3.1 Notation for Probability Questions
Instead of writing out the whole question, the following notation is used.
- What is the probability of event A occurring? = Probability (A) = P(A)
- What is the probability of events A and B occurring? = P(A and B) = P(A) and
P(B)
- What is the probability of events A or B occurring? = P(A or B) = P(A) or P(B)
3.2 Probability in Terms of Areas
Probability may also be defined in terms of areas rather than the number of
events.

\[ \text{Probability} = \frac{\text{Favorable area}}{\text{Total area}} \]

Example 1
A plane drops a parachutist at random on a seven by five mile field. The field
contains a two by one mile target as shown below. What is the probability that
the parachutist will land in the target area? Assume that the parachutist drops
randomly and does not steer the parachute.
P(target) = target area / total area = (2 x 1) / (7 x 5) = 2/35 = .057 or 5.7%
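Example 1 can also be approached by simulation. The sketch below computes the ratio of the two areas and compares it with the fraction of simulated random landings that fall inside the target. The position of the target inside the field is not recoverable from the notes, so the corner coordinates used here are an assumption; for a uniformly random drop the position does not change the probability.

```python
import random

FIELD_WIDTH, FIELD_HEIGHT = 7.0, 5.0      # miles
TARGET_WIDTH, TARGET_HEIGHT = 2.0, 1.0    # miles
TARGET_X, TARGET_Y = 2.5, 2.0             # assumed lower-left corner of the target

# Probability in terms of areas: favorable area divided by total area.
exact = (TARGET_WIDTH * TARGET_HEIGHT) / (FIELD_WIDTH * FIELD_HEIGHT)

def simulate(trials, seed=0):
    """Fraction of uniformly random landings that fall inside the target."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0, FIELD_WIDTH)
        y = rng.uniform(0, FIELD_HEIGHT)
        in_x = TARGET_X <= x <= TARGET_X + TARGET_WIDTH
        in_y = TARGET_Y <= y <= TARGET_Y + TARGET_HEIGHT
        if in_x and in_y:
            hits += 1
    return hits / trials

estimate = simulate(200_000)
print(f"area ratio = {exact:.4f}, simulated estimate = {estimate:.4f}")
```

The simulated fraction is an empirical estimate, so it will sit close to the area ratio without matching it exactly.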

4.0 METHODS TO DETERMINE PROBABILITY VALUES
There are three major methods used to determine probability values.
- Subjective Probability: This is a probability value based on the best available
knowledge or maybe an educated guess. Examples are betting on horse races,
selecting stocks or making product-marketing decisions.
- A Priori Probability: This is a probability value that can be determined prior to any
experimentation or trial. For example, the probability of obtaining a tail in tossing
a coin once is fifty percent. The coin is not actually tossed to determine this
probability. It is simply observed that there are two faces to the coin, one of which
is tails and that heads and tails are equally likely.
- Empirical Probability: This is a probability value that is determined by
experimentation. An example of this is a manufacturing process where after
checking one hundred parts, five are found defective. If the sample of one
hundred parts was representative of the total population, then the probability of
finding a defective part is .05 (5/100). The question may be asked: How is it
known that this sample is representative of the total population? If repeated trials
average .05 defective, with little variation between trials, then it can be said that
the empirical probability of a defective part is .05.

5.0 MULTIPLICATION THEOREM
The multiplication theorem is used to answer the following questions:
- What is the probability of two or more events occurring either simultaneously or
in succession?
- For two events A and B: What is the probability of event A and event B
occurring?
The individual probability values are simply multiplied to arrive at the answer. The word
"and" is the key word that indicates multiplication of the individual probabilities. The
multiplication theorem is applicable only if the events are independent. It is not valid
when dealing with conditional events. The product of two or more probability values
yields the intersection or common area of the probabilities. The intersection is illustrated
by the Venn diagrams in section 11.0 of this chapter. Mutually exclusive events do not
have an intersection or common area. The probability of two or more mutually exclusive
events occurring together is always zero.
For mutually exclusive events:
- P(A) and P(B) = 0
For independent events:
- Probability (A and B) = P(A) and P(B) = P(A) X P(B)
For multiple independent events, the multiplication formula is extended. The probability
that five events A, B, C, D and E occur is
P(A) and P(B) and P(C) and P(D) and P(E) = P(A) x P(B) x P(C) x P(D) x P(E)

Example 2
What is the probability of getting a raise and that the sun will shine tomorrow?
Given: Probability of getting a raise = P(r) = .10
Probability of the sun shining = P(s) = .30
The events are independent.
P(raise) and P(sunshine) = P(r) x P(s) = .10 x .30 = .03 or 3%
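Example 2 can be reproduced in a few lines. The sketch assumes the events are independent, as the theorem requires; the function name prob_all is made up for this illustration.

```python
from math import prod

def prob_all(*probabilities):
    """P(A and B and ...) for independent events: multiply the individual values."""
    return prod(probabilities)

p_raise = 0.10
p_sunshine = 0.30

p_raise_and_sunshine = prob_all(p_raise, p_sunshine)
print(f"P(raise and sunshine) = {p_raise_and_sunshine:.2f}")

# The same function extends to more than two independent events,
# for example three heads in three tosses of a fair coin.
p_three_heads = prob_all(0.5, 0.5, 0.5)
print(f"P(three heads) = {p_three_heads}")
```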

6.0 ADDITION THEOREM
The addition theorem is used to answer the following questions:
- What is the probability of one event or another event or both events occurring?
- What is the probability of event A or event B occurring?
The word "or" indicates addition of the individual probabilities. The answers to the above
questions are different depending on whether the events are mutually exclusive or
independent.
Mutually exclusive events do not have an intersection or common area. The individual
probabilities are simply added to arrive at the answer. For mutually exclusive events:
- P(A or B) = P(A) or P(B) = P(A) + P(B)
- P(A or B or C or D) = P(A) + P(B) + P(C) + P(D)
For two independent events, the intersecting or common area must be subtracted or it
will be included twice. (Refer to the Venn diagram in section 11.0).
Probability (A or B) = P(A) or P(B) = P(A) + P(B) - P(A X B)
For three independent events:
P(A or B or C) = P(A) + P(B) + P(C) - P(A X B) - P(A X C) - P(B X C) + P(A X B X C)

Example 3
What is the probability of getting a raise or that the sun will shine tomorrow?
Given: Probability of getting a raise = P(r) = .10
Probability of the sun shining = P(s) = .30
P(raise) or P(sunshine) = P(r) or P(s) = P(r or s)
P(r or s) = P(r) + P(s) - [P(r) x P(s)] = .10 + .30 - [.10 x .30] = .40 - .03 = .37 or 37%
The word "and" is associated with the multiplication theorem and the word "or" is
associated with the addition theorem.
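The addition theorem for independent events can be sketched as below. The third probability in the three-event case is invented for illustration, and the result is cross-checked against the complement rule (one minus the probability that none of the events occurs), which is a standard identity and not part of the notes.

```python
def prob_either(p_a, p_b):
    """P(A or B) for two independent events: subtract the common area once."""
    return p_a + p_b - p_a * p_b

def prob_any_of_three(p_a, p_b, p_c):
    """P(A or B or C) for three independent events."""
    return (p_a + p_b + p_c
            - p_a * p_b - p_a * p_c - p_b * p_c
            + p_a * p_b * p_c)

p_raise, p_sunshine = 0.10, 0.30
p_raise_or_sunshine = prob_either(p_raise, p_sunshine)
print(f"P(raise or sunshine) = {p_raise_or_sunshine:.2f}")

p_third = 0.50   # hypothetical third independent event
p_any = prob_any_of_three(p_raise, p_sunshine, p_third)
p_any_by_complement = 1 - (1 - p_raise) * (1 - p_sunshine) * (1 - p_third)
print(f"three events: {p_any:.3f} (complement rule gives {p_any_by_complement:.3f})")
```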

7.0 COUNTING TECHNIQUES - PERMUTATIONS AND COMBINATIONS
Permutations and combinations are simply mathematical tools used for counting. In
many cases, it may be cumbersome to count the number of favorable events or the
number of total events when solving probability problems. Permutations and
combinations help simplify the task.
7.1 Permutations
A permutation is an arrangement of things, objects or events where the order is
important. Telephone numbers are special permutations of the numerals 0 to 9
where each numeral may be used more than once. The order defines each
unique telephone number.
In the following example, it is assumed that each object is unique and cannot be
used more than once. The letters A, B, and C may be arranged in the following
ways:
ABC BAC CAB
ACB BCA CBA
This is an ordered arrangement, because ABC is different from BCA. Since the
order of the letters makes a difference, each arrangement is a permutation. From
the above example, it is concluded that there are six permutations that can be
made from three objects. The general formula for permutations is

\[ {}_nP_r = \frac{n!}{(n - r)!} \]

n = The total objects to arrange
r = The number of objects taken from the total to be used in the
arrangements
By definition: 0! = 1 and 1! = 1

Example 4
Using the permutation formula and the three letters A, B and C, how many
permutations can be made using all three letters?

\[ {}_3P_3 = \frac{3!}{(3 - 3)!} = \frac{3!}{0!} = \frac{6}{1} = 6 \]

Example 5
How many permutations can be made by using two out of the three letters?

\[ {}_3P_2 = \frac{3!}{(3 - 2)!} = \frac{3!}{1!} = \frac{6}{1} = 6 \]

The permutations are
AB BA BC
AC CA CB
Example 6
There are three different assembly operations to be performed in making a
certain part. There are nine people working on the floor. How many different
assembly crews can be formed?

This may be stated as the number of permutations that can be made from nine
objects used three at a time.
$_9P_3 = \frac{9!}{(9-3)!} = \frac{9!}{6!} = 9 \times 8 \times 7 = 504$
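The permutation counts in Examples 4 to 6 can be reproduced in Python. The sketch below implements the factorial formula directly (the helper name n_p_r is illustrative) and cross-checks it against the standard library's math.perm and an explicit listing from itertools.permutations.

```python
from itertools import permutations
from math import factorial, perm


def n_p_r(n: int, r: int) -> int:
    """Number of permutations of n distinct objects taken r at a time."""
    return factorial(n) // factorial(n - r)


all_three = n_p_r(3, 3)       # Example 4: A, B, C using all three letters
two_of_three = n_p_r(3, 2)    # Example 5: two out of three letters
crews = n_p_r(9, 3)           # Example 6: nine people, three operations

# Listing the arrangements confirms the count for Example 5.
two_letter_arrangements = ["".join(p) for p in permutations("ABC", 2)]

print(all_three, two_of_three, crews)
print(two_letter_arrangements)
```

Listing arrangements is practical only for small cases; the formula gives the count without enumerating them.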

7.2 Combinations
A combination is a grouping or arrangement of objects where the order does not
make a difference.
The arrangement of the letters ABC is the same as BCA. The number of
combinations that can be made by using three letters, three at a time, is one.
This can be expanded to state that the number of combinations that can be made
by using n letters, n at a time, is one. A hand of five cards consisting of a Jack, a
Queen, a King, and two Aces is the same as a Queen, two Aces, a Jack and a
King. The order in which the cards were received makes no difference. There is
only one combination that can be made by using five cards, five at a time.
The formula for combinations is
$_nC_r = \frac{n!}{r!\,(n-r)!}$

n = Total objects to arrange
r = Number of objects taken from the total to be used in the arrangements
The symbol for number of combinations is often shown as
$\binom{n}{r}$

When the symbol appears in a formula, the number of combinations is to be
computed using the combination formula.

Example 7
From the three letters A, B and C, how many combinations can be made by
using two out of the three letters?
$_3C_2 = \frac{3!}{2!\,(3-2)!} = \frac{6}{2 \times 1} = 3$

The combinations are
AB AC BC
BA is the same as AB
CA is the same as AC
CB is the same as BC
Example 8
Ten parts have been manufactured. Two parts are to be inspected for a critical
dimension. How many different sample arrangements can be made?
If the parts are labeled 1 to 10, then parts 1 and 5 make one arrangement, parts
3 and 7 make another, 6 and 8 another, etc. The listing of the various
arrangements can be completed and total arrangements counted. The
combination formula can perform this task and save a considerable amount of
time.
The total arrangements or combinations that can be made:
$_{10}C_2 = \frac{10!}{2!\,(10-2)!} = \frac{10 \times 9}{2} = 45$

The permutation and combination formulas are very useful tools in evaluating
and solving probability problems. It is often necessary to count the number of
favorable and total events that can occur. Without these counting techniques, this
would be a very cumbersome and sometimes impossible task.
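As a quick check of Examples 7 and 8, the following sketch counts combinations with the standard library's math.comb and also lists the sample pairs for the ten labeled parts. The variable names are illustrative.

```python
from itertools import combinations
from math import comb

letter_pairs = comb(3, 2)     # Example 7: two letters out of A, B, C
sample_pairs = comb(10, 2)    # Example 8: two parts out of ten

# Listing every unordered pair of part labels 1..10 gives the same count.
listed_pairs = list(combinations(range(1, 11), 2))

print(letter_pairs, sample_pairs, len(listed_pairs))
print(listed_pairs[:3])
```

Because order does not matter, the pair (1, 5) appears once in the listing and (5, 1) does not appear at all.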

8.0 PROBABILITY DISTRIBUTIONS
Probability distributions and their associated formulas and tables allow us to solve a
wide variety of problems in a logical manner. Probability distributions are classified as
discrete or continuous. Three discrete distributions will be reviewed in this chapter.
Continuous distributions are covered in the next chapter. Probability distributions are
used to generate sampling plans, predict yields, arrive at process capabilities,
determine the odds in games of chance and many other applications.
The three discrete distributions that will be reviewed:
- The Hypergeometric Probability Distribution
- The Binomial Probability Distribution
- The Poisson Probability Distribution
One of the most difficult tasks for a beginning student in probability is to know which
distribution or formula to use for a specific problem. A roadmap is given in section 10.0
of this chapter to assist in the task.
The quality engineer may be asked to calculate the probability of the number of defects
or the number of defective units in a sample. There is a difference between the two
phrases. A defect is an individual failure to meet a requirement. A defective unit is a unit
of product that contains one or more defects. Many defects can occur on one defective
unit.
8.1 The Hypergeometric Probability Distribution
The hypergeometric distribution is the basic distribution of probability. The
hypergeometric probability formula is simply the number of favorable events
divided by the number of total events. It can be described as the true basic
probability distribution of attributes. To use the hypergeometric formula, the
following values must be known.
N = The total number of items in the population (lot size)
n = The number of items to be selected from the population (sample size)
A = The number in the population having a given characteristic
B = The number in the population having another characteristic
a = The number of A that is desired to occur
b = The number of B that is desired to occur

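The hypergeometric probability formula is
$P(a, b) = \frac{\binom{A}{a}\binom{B}{b}}{\binom{N}{n}}$, where $\binom{A}{a}$ is the number of combinations of A objects taken a at a time.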

Example 9
An urn contains fifteen balls, five red and ten green. What is the probability of
obtaining exactly two red and three green balls in drawing five balls without
replacement?
This question may also be stated as:
- What is the probability of obtaining two red balls?
- What is the probability of obtaining three green balls?
All three questions are the same. When setting up the problem, all events must
be considered regardless of how the question is asked.
In this case, the probability of a single event is not constant from trial to trial. This
is the same as sampling without replacement. The outcome of the second draw
will be affected by what was obtained on the first draw. The number of favorable
events and the number of total events must be computed.
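The number of ways that red balls may be selected:
$\binom{5}{2} = \frac{5!}{2!\,3!} = 10$
The number of ways that green balls may be selected:
$\binom{10}{3} = \frac{10!}{3!\,7!} = 120$
The total number of ways to select a sample of five balls from a population of
fifteen balls:
$\binom{15}{5} = \frac{15!}{5!\,10!} = 3003$
The probability of exactly two red and three green balls:
$P(2, 3) = \frac{10 \times 120}{3003} = \frac{1200}{3003} \approx .3996$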

This is a specific application of the hypergeometric probability formula. Many
similar problems may be solved using this method. To use the hypergeometric
formula, the population must be small enough so that the number of items with
the characteristics in question can be determined.
Example 10
A box contains ten assemblies of which two are defective. A sample of three
assemblies is selected at random. What is the probability that the two defective
parts will be selected? (For this to occur there must be two defective parts and
one good part in the sample.)
$P(2, 1) = \frac{\binom{2}{2}\binom{8}{1}}{\binom{10}{3}} = \frac{1 \times 8}{120} \approx .0667$
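Examples 9 and 10 follow the same pattern: favorable combinations divided by total combinations. The sketch below wraps that pattern in a small function; the name hypergeom_prob and its argument order are illustrative, with A, a, B and b used as defined above.

```python
from math import comb


def hypergeom_prob(A: int, a: int, B: int, b: int) -> float:
    """Probability of drawing exactly a items of type A and b of type B
    when a + b items are sampled without replacement from A + B items."""
    N = A + B          # lot size
    n = a + b          # sample size
    return comb(A, a) * comb(B, b) / comb(N, n)


# Example 9: 5 red and 10 green balls, draw 2 red and 3 green.
p_example_9 = hypergeom_prob(5, 2, 10, 3)

# Example 10: 2 defective and 8 good assemblies, draw 2 defective and 1 good.
p_example_10 = hypergeom_prob(2, 2, 8, 1)

print(f"{p_example_9:.4f} {p_example_10:.4f}")
```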

8.2 The Binomial Probability Distribution
The binomial probability formula is used when events are classified in two ways
such as good/defective, red/green, go/no-go, etc. The prefix Bi means two. The
events or trials must be independent. When the binomial formula is used, it is
assumed that the lot size is infinite and the probability of a single success is
constant from trial to trial.
The binomial probability formula is used to answer the following question:
What is the probability of x successes in n trials where the probability of a single
success is p?
The binomial formula is
$P(x) = \binom{n}{x}\,p^x\,(1-p)^{n-x}$

Example 11
A coin is tossed five times. (This is the same as a sample size of five). What is
the probability of obtaining exactly two heads in the five tosses?
It is known, by prior knowledge, that the probability of a single success
(probability of a head in one toss of a coin) is fifty percent. The question is
looking for two successes or two heads in five tosses of a coin. A success is the
outcome that is desired to occur.

For this example:
- The number of trials = n = 5
- The probability of a single event = p = 1/2
- The number of successes that the question is seeking (x = 2).
To arrive at the answer to the question, the values are entered in the binomial
formula.
$P(2) = \binom{5}{2}(.5)^2(.5)^3 = 10 \times .03125 = .3125$

Example 12
In manufacturing screwdrivers, it was empirically determined that the process
yields, on average, 5% defective product. What is the probability that in a sample
of ten screwdrivers there are exactly three defective units?
n = 10, p = .05, x = 3
$P(3) = \binom{10}{3}(.05)^3(.95)^7 \approx .0105$

Example 13
A company produces electronic chips by a process that normally averages 2%
defective products. A sample of four chips is selected at random and the parts
are tested for certain characteristics.
a. What is the probability that exactly one chip is defective?
$P(1) = \binom{4}{1}(.02)^1(.98)^3 \approx .0753$

b. What is the probability that more than one chip is defective?
More than one defective chip in a sample of four means two, three or four
defective chips. The probability of each may be calculated using the
binomial formula.
P(more than 1 defective chip) = P(2) or P(3) or P(4) = P(2) + P(3) + P(4)
In any trial or sample, the sum of the probabilities of the individual events
always equals one. In this problem: P(0) + P(1) + P(2) + P(3) + P(4) = 1
P(more than 1 defective) = 1 - [P(0) + P(1)] = 1 - [.9224 +.0753] = .0023
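The binomial examples can be verified numerically. The sketch below is one straightforward way to code the binomial formula in Python; the function name binomial_pmf is illustrative, and the inputs are the n, p and x values stated in Examples 11 to 13.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p_two_heads = binomial_pmf(2, 5, 0.5)           # Example 11
p_three_defective = binomial_pmf(3, 10, 0.05)   # Example 12
p_one_chip = binomial_pmf(1, 4, 0.02)           # Example 13a
# Example 13b: complement of "zero or one defective chip".
p_more_than_one = 1 - binomial_pmf(0, 4, 0.02) - p_one_chip

for label, value in [
    ("Example 11", p_two_heads),
    ("Example 12", p_three_defective),
    ("Example 13a", p_one_chip),
    ("Example 13b", p_more_than_one),
]:
    print(f"{label}: {value:.4f}")
```

Using the complement in Example 13b avoids computing P(2), P(3) and P(4) separately.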

8.3 The Poisson Probability Distribution
The Poisson distribution is the mathematical limit to the binomial distribution and
may be used to approximate binomial probabilities. The Poisson is also a
distribution in its own right when solving problems involving defects per unit
rather than fraction defectives. Tables showing subsets of Poisson probabilities
appear in many textbooks. The tables greatly simplify the solution of many
problems. The most extensive Poisson table is Poisson's Exponential Binomial
Limit by E. C. Molina. The tables were developed in the 1920s and published in
1949.
If n is large and p is small so that n times p (np) is a positive number less than
five, then the Poisson is a good approximation to the binomial. The value p and
the ratio n/N should be less than 0.10.
When solving binomial problems with the Poisson formula, the terms n, x and p
are the same as in the binomial formula. The task is to calculate the probability of
x successes in n trials, where the probability of a single success is p. Remember
that p is a fraction defective when used to approximate the binomial, and p is
defects per unit when counting the number of defects instead of the number of
defective units.
In some cases neither n nor p is given, but the product np may be given. If p is a
fraction defective then np is the average number of defective units in the sample.
If p is in terms of defects per unit then np is the average number of defects in the
sample.

The Poisson formula is
$P(x) = \frac{e^{-np}\,(np)^x}{x!}$

Example 14
In making switches, it has been determined by empirical studies that there is, on
average, one defect per switch. What is the probability of selecting a sample of
five switches that contains zero defects? There are two methods to solve this
problem. The first method is to use the above formula where x = 0, n = 5, and p =
1, therefore
np = 5 x 1 = 5.
$P(0) = \frac{e^{-5}\,5^0}{0!} = e^{-5} \approx .006738$

The second and most widely used method is to use the Poisson tables that are
published in most statistics books. To use the tables, find the value of x in the
leftmost column, then find the value of np on the top row and read P(x) at the
intersection of the two values.
The Poisson table value for P(0) = .006738 or .674%
Example 15
In a paper making operation it was found that each 1000 foot roll contained, on
average, one defect. One roll is selected at random from the process.
a. What is the probability that this roll contains zero defects?
Use the Poisson table where x = 0 and np = 1. The Poisson table value for P(0) = .368.
b. What is the probability that the roll contains exactly three defects?
The Poisson table value for P(3) = .061
c. What is the probability that this roll contains more than one defect?
P(more than one defect) = P(2) + P(3) + P(4) + ... + P(∞)
= 1 - [P(0) + P(1)]
= 1 - [.368 + .368] = .264
Example 16
In manufacturing the Que model car, a study determined that on average there
are three defects per car. What is the probability of buying a Que with less than
three defects?
P(less than 3 defects) = P(0) + P(1) + P(2)
Use the Poisson tables and find P(0), P(1) and P(2) where np = 3
P(less than 3 defects) = .050 + .149 + .224 = .423
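Where a Poisson table is not at hand, the same probabilities can be computed directly. The following sketch codes the Poisson formula with np passed in as a single value; the function name poisson_pmf and the variable names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x: int, mean: float) -> float:
    """Probability of exactly x occurrences when the average count is `mean` (np)."""
    return exp(-mean) * mean**x / factorial(x)


p_ex14 = poisson_pmf(0, 5)                               # Example 14: np = 5
p_ex15a = poisson_pmf(0, 1)                              # Example 15a: np = 1
p_ex15b = poisson_pmf(3, 1)                              # Example 15b
p_ex15c = 1 - poisson_pmf(0, 1) - poisson_pmf(1, 1)      # Example 15c
p_ex16 = sum(poisson_pmf(x, 3) for x in range(3))        # Example 16: np = 3

for label, value in [
    ("14", p_ex14), ("15a", p_ex15a), ("15b", p_ex15b),
    ("15c", p_ex15c), ("16", p_ex16),
]:
    print(f"Example {label}: {value:.4f}")
```

Small differences from table values are rounding effects, since tables are usually printed to three decimal places.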

9.0 CONDITIONAL PROBABILITY
Conditional probability is defined as the probability of an event occurring if another has
occurred or has been specified to occur simultaneously, and the outcome of the first
event affects the probability of the second event. Conditional events are not
independent.
The probability of B occurring given that A has already occurred is stated as P(B/A),
where the symbol / means "given that."
The formulas for conditional probability are shown below. These are known as Bayes'
Formulas.
P(B/A) = P(A & B) / P(A)
P(A/B) = P(A & B) / P(B)

Since the two formulas have a common term P(A & B), they may be used together to
solve many problems involving conditional probability.
Conditional events are not independent so P(A & B) is not equal to P(A) X P(B). From
Bayes formulas:

P(A & B) = P(B/A) P(A)
P(A & B) = P(A/B) P(B)

Example 17
A lot of fifteen items contains five defective items. Two items are drawn at
random. What is the probability that the second item drawn will be
defective?

Let A = event that first item is defective
Let A' = event that first item is good
Let B = event that second item is defective
The question stated in probability terms: what is P(B) = ?
P(A) = 5/15, P(A') = 10/15
P(B) = P(A & B) or P(A' & B) = P(first item defective & second item defective) or
P(first item good & second item defective)
P(B) = P(B/A) P(A) or P(B/A') P(A')
P(B) = P(B/A) P(A) + P(B/A') P(A')
P(B) = (4/14)(5/15) + (5/14)(10/15)
P(B) = (20/210) + (50/210) = 70/210 = .333
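The result of Example 17 can be confirmed two ways in Python: by repeating the conditional-probability arithmetic with exact fractions, and by brute-force listing of every ordered pair of draws from the lot. This is a sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Conditional-probability route, using the values from the example.
p_a = Fraction(5, 15)              # first item defective
p_not_a = Fraction(10, 15)         # first item good
p_b_given_a = Fraction(4, 14)      # second defective, given first defective
p_b_given_not_a = Fraction(5, 14)  # second defective, given first good
p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a

# Enumeration route: 5 defective ("D") and 10 good ("G") items.
lot = ["D"] * 5 + ["G"] * 10
ordered_draws = list(permutations(range(len(lot)), 2))
second_defective = sum(1 for _, second in ordered_draws if lot[second] == "D")
p_b_enumerated = Fraction(second_defective, len(ordered_draws))

print(p_b, p_b_enumerated)
```

Both routes give the same value as P(A), which reflects that, before anything is observed, each draw position is equally likely to hold a defective item.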

Example 18
It has been found that 10% of certain relays have bent covers and will not work. If
40% have bent covers, what is the probability that a relay with a bent cover will
not work?
[Venn diagram: P(A and B) = 0.1 (bent cover and will not work); P(A) = 0.4]

Let A = event that relays have bent covers
Let B = event that relays will not work
Given: P(A & B) = .10, P(A) = .40
The first formula of the conditional probability formulas, Bayes' formulas, gives the
following solution:
P(B/A) = P(A & B) / P(A) = .10 / .40 = .25

11.0 VENN DIAGRAMS
Venn diagrams show the events and corresponding probabilities in graphical form. The
events are shown as circles and the shaded areas within the circles represent the
probabilities.

Bayes' Theorem

Goal: To gain an understanding of Bayes' Theorem and to use that knowledge to investigate
practical problems in various professional fields.

A particular test correctly identifies those with a certain serious disease 94% of the time and
correctly diagnoses those without the disease 98% of the time. A friend has just informed you
that he has received a positive result and asks for your advice about how to interpret these
probabilities. He knows nothing about probability, but he feels that because the test is quite
accurate, the probability that he does have the disease is quite high, likely in the 95% range. You
want to use your knowledge of probability to address your friend's concerns. What is the
probability your friend actually has the disease? We'll tackle this problem a little later using
Bayes' Theorem. Right now, let's focus our attention on ideas that lead us to Bayes' Theorem.
Specifically, we'll look at conditional probability and the multiplication rule for two dependent
events.

The conditional probability of an event B in relationship to an event A is the probability that
event B occurs after event A has already occurred.

We denote the probability of event B given that event A has occurred by $P(B \mid A)$.

Multiplication Rule (two dependent events):
$P(A \text{ and } B) = P(A)\,P(B \mid A) = P(B \text{ and } A) = P(B)\,P(A \mid B)$

The multiplication rule gives us a method for finding the probability that events A and B both
occur, as illustrated by the next two examples.

Example 1:
In a class with 3/5 women and 2/5 men, 25% of the women are business majors. Find the
probability that a student chosen from the class at random is a female business major.

Define the relevant events: W = the student is a woman
B = the student is a business major

Express the given information and question in probability notation:
"class with 3/5 women" means $P(W) = 3/5 = 0.60$

"25% of the women are business majors" is the same as saying "the probability a student is a
business major, given the student is a woman, is 0.25": $P(B \mid W) = 0.25$

"probability that a student chosen from the class at random is a female business major" is the
same as saying "probability student is a woman and a business major": $P(W \text{ and } B)$

Use the multiplication rule to answer the question:
$P(W \text{ and } B) = P(W)\,P(B \mid W) = (0.60)(0.25) = 0.15$
Example 2:
A box contains 5 red balls and 9 green balls. Two balls are drawn in succession without
replacement. That is, the first ball is selected and its color is noted but it is not replaced, then a
second ball is selected. What is the probability that:
a. the first ball is green and the second ball is green?
b. the first ball is green and the second ball is red?
c. the first ball is red and the second ball is green?
d. the first ball is red and the second ball is red?

Solutions:
We will construct a tree diagram to help us answer these questions.

Using the tree diagram, we see that:
a. the probability the first ball is green and the second ball is green =
$P(G_1 \text{ and } G_2) = \frac{9}{14} \cdot \frac{8}{13} = \frac{36}{91}$

b. the probability the first ball is green and the second ball is red =
$P(G_1 \text{ and } R_2) = \frac{9}{14} \cdot \frac{5}{13} = \frac{45}{182}$

c. the probability the first ball is red and the second ball is green =
$P(R_1 \text{ and } G_2) = \frac{5}{14} \cdot \frac{9}{13} = \frac{45}{182}$

d. the probability the first ball is red and the second ball is red =
$P(R_1 \text{ and } R_2) = \frac{5}{14} \cdot \frac{4}{13} = \frac{10}{91}$
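The four branch probabilities in Example 2 can be generated programmatically. The sketch below builds the two-level tree for drawing without replacement using exact fractions; the function name two_draw_tree and the "R"/"G" labels are illustrative.

```python
from fractions import Fraction


def two_draw_tree(red: int, green: int) -> dict:
    """Joint probability of each (first, second) color when two balls
    are drawn without replacement."""
    counts = {"R": red, "G": green}
    total = red + green
    tree = {}
    for first in counts:
        p_first = Fraction(counts[first], total)
        for second in counts:
            # One ball of the first color is gone before the second draw.
            remaining = counts[second] - (1 if second == first else 0)
            p_second_given_first = Fraction(remaining, total - 1)
            tree[first + second] = p_first * p_second_given_first
    return tree


tree = two_draw_tree(red=5, green=9)
for path, probability in tree.items():
    print(path, probability)
```

The four joint probabilities sum to one, which is a useful check on any tree diagram.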

Formula for Conditional Probability:
The probability that the second event B occurs given that the first event A has occurred can be
found by:
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$, where $P(A) \neq 0$

Note: This formula is obtained from the Multiplication Rule for two dependent events.
(Using algebra, we solve for $P(B \mid A)$ by dividing both sides of the equation by $P(A)$.)

The key to solving conditional probability problems is to:
1. Define the events.
2. Express the given information and question in probability notation.
3. Apply the formula.

Example 3:
The probability that Sam parks in a no-parking zone and gets a parking ticket is 0.06. The
probability that Sam has to park in a no-parking zone (he cannot find a legal parking space) is
0.20. Today, Sam arrives at school and has to park in a no-parking zone. What is the probability
that he will get a parking ticket?

Solution:
Define the events: N = Sam parks in a no-parking zone, T = Sam gets a parking ticket

"probability that Sam parks in a no-parking zone and gets a parking ticket is 0.06"
tells us that P(N and T) = 0.06.

"probability Sam has to park in the no-parking zone is 0.20" tells us that P(N) = 0.20.

"Today, Sam arrives at school and has to park in a no-parking zone. What is the probability that
he will get a parking ticket?" is the same as "What is the probability he will get a parking ticket,
given that he has to park in a no-parking zone?" That is, we want to find $P(T \mid N)$.

Apply the formula:
$P(T \mid N) = \frac{P(N \text{ and } T)}{P(N)} = \frac{0.06}{0.20} = 0.30$

Note: Students seem to have difficulties understanding that the question asks us to find $P(T \mid N)$,
not $P(N \text{ and } T)$. They think the answer is 0.06. They fail to consider that Sam could park in a
no-parking zone but not receive a ticket. It might be useful to construct a Venn diagram.

The Law of Total Probability:
If $A_1$ and $A_2$ are mutually exclusive events with $P(A_1) + P(A_2) = 1$, then for any event $B$,
$P(B) = P(A_1 \text{ and } B) + P(A_2 \text{ and } B)$
$P(B) = P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)$

More generally, if $A_1, A_2, \ldots, A_k$ are mutually exclusive events with $P(A_1) + P(A_2) + \cdots + P(A_k) = 1$,
then for any event $B$,
$P(B) = P(A_1 \text{ and } B) + P(A_2 \text{ and } B) + \cdots + P(A_k \text{ and } B)$
$P(B) = P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2) + \cdots + P(A_k)\,P(B \mid A_k)$

Example 4: An automobile dealer has kept records on the customers who visited his showroom.
Forty percent of the people who visited his dealership were women. Furthermore, his records
show that 37% of the women who visited his dealership purchased an automobile, while 21% of
the men who visited his dealership purchased an automobile.
a. What is the probability that a customer entering the showroom will buy an automobile?
b. Suppose a customer visited the showroom and purchased a car. What is the probability that
the customer was a woman?
c. Suppose a customer visited the showroom but did not purchase a car. What is the probability
that the customer was a man?

Define the events:
$A_1$ = customer is a woman
$A_2$ = customer is a man
$B$ = customer purchases an automobile
$B^C$ = customer does not purchase an automobile

"Forty percent of the people who visited his dealership were women": $P(A_1) = 0.40$
This statement also tells us that 60% of the customers must be men: $P(A_2) = 0.60$

"37% of the women who visited his dealership purchased an automobile": $P(B \mid A_1) = 0.37$
"21% of the men who visited his dealership purchased an automobile": $P(B \mid A_2) = 0.21$

"What is the probability that a customer entering the showroom will buy an automobile?": $P(B) = ?$

Create a tree diagram:

Use your tree diagram and the Law of Total Probability to answer the question:
$P(B) = P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2) = (0.40)(0.37) + (0.60)(0.21) = 0.148 + 0.126 = 0.274$

Solution to part b:
Suppose a customer visited the showroom and purchased a car. What is the probability that the
customer was a woman?
Express the question in probability notation:
We can rewrite the question as, "What is the probability that the customer was a woman, given
that the customer purchased an automobile?" That is, we want to find $P(A_1 \mid B)$.

We can use Bayes' Theorem to help us compute this conditional probability.
Bayes' Theorem (Two-Event Case):
$P(A_1 \mid B) = \frac{P(A_1 \text{ and } B)}{P(A_1 \text{ and } B) + P(A_2 \text{ and } B)} = \frac{P(A_1 \text{ and } B)}{P(B)} = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}$
where $A_1$ and $A_2$ are mutually exclusive events with $P(A_1) + P(A_2) = 1$
and $B$ is any event with $P(B) \neq 0$

Note: the denominator is determined by the Law of Total Probability
Solution to part b (continued):
Use Bayes' Theorem and your tree diagram to answer the question:
$P(A_1 \mid B) = \frac{P(A_1 \text{ and } B)}{P(A_1 \text{ and } B) + P(A_2 \text{ and } B)} = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)} = \frac{0.148}{0.148 + 0.126} = \frac{0.148}{0.274} \approx 0.540$

The probabilities needed for the computation are easily obtained from our tree diagram.
We already found $P(A_1 \text{ and } B) + P(A_2 \text{ and } B)$, which is $P(B)$, for part a of this example, and
$P(A_1 \text{ and } B)$ is obtained by following the tree diagram path $A_1 \to B$; the product of the
corresponding probabilities is 0.148.

Solution to part c:
Suppose a customer visited the showroom but did not purchase a car. What is the probability
that the customer was a man?
Express the question in probability notation:
We can rewrite the question as, "What is the probability that the customer was a man, given that
the customer did not purchase an automobile?" That is, we want to find $P(A_2 \mid B^C)$.

$P(A_2 \mid B^C) = \frac{P(A_2)\,P(B^C \mid A_2)}{P(A_2)\,P(B^C \mid A_2) + P(A_1)\,P(B^C \mid A_1)} = \frac{0.474}{0.474 + 0.252} = \frac{0.474}{0.726} \approx 0.653$

Again, the probabilities needed for the computation are easily obtained from our tree diagram.
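Parts a, b and c of the dealership example can be computed with one small helper. The sketch below applies the Law of Total Probability and Bayes' Theorem to a dictionary of priors and likelihoods; the function name bayes_posterior and the dictionary keys are illustrative, while the numbers come from the example.

```python
def bayes_posterior(priors: dict, likelihoods: dict):
    """Return (posterior for each hypothesis, total probability of the evidence)."""
    # Joint probability of each hypothesis together with the evidence.
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    # Law of Total Probability: sum the joint probabilities.
    evidence = sum(joint.values())
    posterior = {h: joint[h] / evidence for h in joint}
    return posterior, evidence


priors = {"woman": 0.40, "man": 0.60}
p_buy = {"woman": 0.37, "man": 0.21}
p_no_buy = {who: 1 - p for who, p in p_buy.items()}

post_given_buy, p_b = bayes_posterior(priors, p_buy)            # parts a and b
post_given_no_buy, p_not_b = bayes_posterior(priors, p_no_buy)  # part c

print(f"P(B) = {p_b:.3f}")
print(f"P(woman | bought) = {post_given_buy['woman']:.3f}")
print(f"P(man | did not buy) = {post_given_no_buy['man']:.3f}")
```

The same helper works for any number of mutually exclusive hypotheses, matching the general form of the Law of Total Probability.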

Additional Notes:
The probabilities $P(A_1)$ and $P(A_2)$ are called prior probabilities because they are initial or prior
probability estimates for specific events of interest. When we obtain new information about the
events we can update the prior probability values by calculating revised probabilities, referred to
as posterior probabilities. The conditional probabilities $P(A_1 \mid B)$, $P(A_2 \mid B)$, $P(A_1 \mid B^C)$, and
$P(A_2 \mid B^C)$ are posterior probabilities. Bayes' Theorem enables us to compute these posterior
probabilities.

Example 5:
Let's return to the scenario that began our discussion: A particular test correctly identifies those
with a certain serious disease 94% of the time and correctly diagnoses those without the disease
98% of the time. A friend has just informed you that he has received a positive result and asks
for your advice about how to interpret these probabilities. He knows nothing about probability,
but he feels that because the test is quite accurate, the probability that he does have the disease is
quite high, likely in the 95% range. Before attempting to address your friend's concern, you
research the illness and discover that 4% of men have this disease. What is the probability your
friend actually has the disease?
Define the events:
$A_1$ = a man has this disease
$A_2$ = a man does not have this disease
$B$ = positive test result
$B^C$ = negative test result

"test correctly identifies those with a certain serious disease 94% of the time": $P(B \mid A_1) = 0.94$
"test correctly diagnoses those without the disease 98% of the time": $P(B^C \mid A_2) = 0.98$

"you discover that 4% of men have this disease": $P(A_1) = 0.04$
This statement also tells us that 96% of men do not have the disease: $P(A_2) = 0.96$

"What is the probability your friend actually has the disease (given a positive result)?": $P(A_1 \mid B) = ?$

Construct a tree diagram:


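Since $P(B^C \mid A_2) = 0.98$, the probability of a positive result for a man without the disease is
$P(B \mid A_2) = 1 - 0.98 = 0.02$. Then:
$P(A_1 \mid B) = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)} = \frac{(0.04)(0.94)}{(0.04)(0.94) + (0.96)(0.02)} = \frac{0.0376}{0.0376 + 0.0192} \approx 0.662$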

There is a 66.2% probability that he actually has the disease. The probability is high, but
considerably lower than your friend feared.
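One way to make this result intuitive is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical group of 10,000 men (the group size is an assumption chosen only to give whole numbers) and applies the 4%, 94% and 98% figures from the example.

```python
population = 10_000                               # hypothetical group size

with_disease = population * 4 // 100              # 4% have the disease
without_disease = population - with_disease

true_positives = with_disease * 94 // 100         # correctly identified
false_positives = without_disease * 2 // 100      # 100% - 98% test positive anyway

all_positives = true_positives + false_positives
p_disease_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)
print(round(p_disease_given_positive, 3))
```

Because healthy men far outnumber sick men, even a 2% false-positive rate produces a sizeable share of all positive results, which is why the posterior probability is well below the test's accuracy figures.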

A probability distribution for a discrete random variable is a listing of all possible
distinct outcomes and their probabilities of occurring. Since all possible outcomes are
listed, the sum of the probabilities must add to 1.0.

Example Coin flips.
Suppose we let the random variable be X = the number of heads in three flips of a fair coin.
Then:

P(HHH) = 1/8, P(HHT) = 1/8, P(HTH) = 1/8, P(THH) = 1/8,
P(TTH) = 1/8, P(THT) = 1/8, P(HTT) = 1/8, P(TTT) = 1/8.

x      0     1     2     3
p(x)   1/8   3/8   3/8   1/8

Suppose the coin is weighted with:
P(HHH) = 1/27, P(HHT) = 2/27, P(HTH) = 2/27,
P(THH) = 2/27, P(TTH) = 4/27, P(THT) = 4/27,
P(HTT) = 4/27, P(TTT) = 8/27.

x      0      1       2      3
p(x)   8/27   12/27   6/27   1/27

Both satisfy the definition of a probability distribution because all outcomes (0, 1, 2, and 3) are
listed and the sum of the probabilities equals 1.0.
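Both tables can be built by listing all eight outcomes and grouping them by the number of heads. The sketch below does this with exact fractions. It assumes the weighted coin has P(H) = 1/3, which is inferred from P(HHH) = 1/27 and is not stated explicitly in the text; the function name heads_distribution is illustrative.

```python
from fractions import Fraction
from itertools import product


def heads_distribution(p_heads: Fraction, flips: int = 3) -> dict:
    """Probability distribution of X = number of heads in `flips` independent flips."""
    dist = {k: Fraction(0) for k in range(flips + 1)}
    for outcome in product("HT", repeat=flips):
        # Multiply the probability of each face in the sequence.
        prob = Fraction(1)
        for face in outcome:
            prob *= p_heads if face == "H" else 1 - p_heads
        dist[outcome.count("H")] += prob
    return dist


fair = heads_distribution(Fraction(1, 2))
weighted = heads_distribution(Fraction(1, 3))   # assumed P(H) = 1/3

print(fair)
print(weighted)
```

Note that Fraction reduces automatically, so 12/27 and 6/27 are displayed as 4/9 and 2/9.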

The Expected Value or Average of a Random Variable

The mean ($\mu_x$) of a probability distribution is called the expected value of the random variable.

The expected value of a random variable is defined as its weighted average over all possible
outcomes with the weights being the relative frequency or probability associated with each of the
outcomes.

$\mu_x = E(X) = \sum_{i=1}^{N} X_i\,P(X_i)$
where
X = random variable of interest
$X_i$ = i-th outcome of X
$P(X_i)$ = probability of occurrence of the i-th outcome of X
i = 1, 2, ..., N
N = the number of outcomes for X

Example Coin Flips

x      0     1     2     3
p(x)   1/8   3/8   3/8   1/8

$\mu_x$ = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 3/2 = 1.5

Variance and Standard Deviation of a Random Variable

The variance ($\sigma_x^2$) of a random variable is defined as the weighted average of the
squared differences between each possible outcome and the average value of the
outcomes, with the weights being the probability associated with each of the outcomes.

$\sigma_x^2 = \sum_{i=1}^{N} (X_i - \mu_x)^2\,P(X_i)$
where
X = random variable of interest
$X_i$ = i-th outcome of X
$P(X_i)$ = probability of occurrence of the i-th outcome of X
i = 1, 2, ..., N
N = the number of outcomes for X

In addition, the standard deviation, $\sigma_x$, of the probability distribution of a random variable is the
square root of the variance and is given by:

$\sigma_x = \sqrt{\sum_{i=1}^{N} (X_i - \mu_x)^2\,P(X_i)}$

Example Coin Flips

x      0     1     2     3
p(x)   1/8   3/8   3/8   1/8

$\sigma_x^2 = (0 - 3/2)^2(1/8) + (1 - 3/2)^2(3/8) + (2 - 3/2)^2(3/8) + (3 - 3/2)^2(1/8) = 24/32 = 3/4 = .75$

$\sigma_x = \sqrt{.75} = .866$
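The mean, variance and standard deviation for the fair-coin example can be computed directly from the table. The sketch below keeps the probabilities as exact fractions and converts to a decimal only for the square root; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Distribution of X = number of heads in three flips of a fair coin.
dist = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Expected value: probability-weighted average of the outcomes.
mean = sum(x * p for x, p in dist.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 3))
```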

FORECASTING
Business forecasting has always been one component of running an enterprise. However,
forecasting traditionally was based less on concrete and comprehensive data than on face-to-face
meetings and common sense. In recent years, business forecasting has developed into a much
more scientific endeavor, with a host of theories, methods, and techniques designed for
forecasting certain types of data. The development of information technologies and the Internet
propelled this development into overdrive, as companies not only adopted such technologies into
their business practices, but into forecasting schemes as well.
Business forecasting involves a wide range of tools, including simple electronic spreadsheets,
enterprise resource planning (ERP) and electronic data interchange (EDI) networks, advanced
supply chain management systems, and other Web-enabled technologies. The practice attempts
to pinpoint key factors in business production and extrapolate from given data sets to produce
accurate projections for future costs, revenues, and opportunities. This normally is done with an
eye toward adjusting current and near-future business practices to take maximum advantage of
expectations.
There are three models of business forecasting systems.
In the time-series model, data simply is projected forward based on an established method, of
which there are several, including the moving average, the simple average, exponential
smoothing, decomposition, and Box-Jenkins. Each of these methods applies various formulas to
the same basic premise: data patterns from the recent past will continue more or less unabated
into the future. To conduct a forecast using the time-series model, one need only plug available
historical data into the formulas established by one or more of the above methods. Obviously, the
time-series model is the most useful means for forecasting when the relevant historical data
reveals smooth and stable patterns. Where jumps and anomalies do occur, the time-series model
may still be useful, providing those jumps can be accounted for.
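To illustrate how a time-series forecast "plugs historical data into a formula", here is a minimal sketch of two of the methods named above. The text gives no formulas, so the code uses common textbook forms: the average of the last few observations for the moving average, and the recursive update forecast = alpha x actual + (1 - alpha) x previous forecast for simple exponential smoothing. The sales figures, the window of 3 and alpha = 0.5 are all hypothetical.

```python
def moving_average_forecast(history: list, window: int) -> float:
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exponential_smoothing_forecast(history: list, alpha: float) -> float:
    """Forecast the next value with simple exponential smoothing.

    The first observation seeds the forecast (one common convention)."""
    forecast = history[0]
    for actual in history[1:]:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


sales = [100, 110, 105, 115, 120]   # hypothetical historical data

ma_forecast = moving_average_forecast(sales, window=3)
es_forecast = exponential_smoothing_forecast(sales, alpha=0.5)

print(round(ma_forecast, 2), round(es_forecast, 2))
```

A larger alpha makes the smoothed forecast react faster to recent changes; a smaller alpha gives a steadier forecast that is slower to follow jumps in the data.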
The second forecasting model is cause-and-effect. In this model, one assumes a cause, or driver
of activity, that determines an outcome. For instance, a company may assume that, for a
particular data set, the cause is an investment in information technology, and the effect is sales.
This model requires the historical data not only of the factor with which one is concerned (in this
case, sales), but also of that factor's determined cause (here, information technology
expenditures). It is assumed, of course, that the cause-and-effect relationship is relatively stable
and easily quantifiable.
The third primary forecasting model is known as the judgmental model. In this case, one
attempts to produce a forecast where there is no useful historical data. A company might choose
to use the judgmental model when it attempts to project sales for a brand new product, or when
market conditions have qualitatively changed, rendering previous data obsolete.

FORECASTING METHODS
Multiple Regression Analysis: Used when two or more independent factors are involved;
widely used for intermediate-term forecasting. Used to assess which factors to include and which
to exclude. Can be used to develop alternate models with different factors.
Nonlinear Regression: Does not assume a linear relationship between variables; frequently used
when time is the independent variable.
Trend Analysis: Uses linear and nonlinear regression with time as the explanatory variable; used
where there is a pattern over time.
Decomposition Analysis: Used to identify several patterns that appear simultaneously in a time
series; time-consuming each time it is used; also used to deseasonalize a series.
Moving Average Analysis: Simple moving averages forecast future values based on an
average of past values; easy to update.
Weighted Moving Averages: Very powerful and economical. They are widely used where
repeated forecasts are required; use methods like sum-of-the-digits and trend adjustment methods.
Adaptive Filtering: A type of moving average which includes a method of learning from past
errors; can respond to changes in the relative importance of trend, seasonal, and random factors.
Exponential Smoothing: A moving average form of time series forecasting; efficient to use with
seasonal patterns; easy to adjust for past errors; easy to prepare follow-on forecasts; ideal for
situations where many forecasts must be prepared; several different forms are used depending on
the presence of trend or cyclical variations.
Decision trees - Decision trees originally evolved as graphical devices to help illustrate the
structural relationships between alternative choices. These trees were originally presented as a
series of yes/no (dichotomous) choices. As our understanding of feedback loops improved,
decision trees became more complex. Their structure became the foundation of computer flow
charts.
Computer technology has made it possible create very complex decision trees consisting of many
subsystems and feedback loops. Decisions are no longer limited to dichotomies; they now
involve assigning probabilities to the likelihood of any particular path.
Decision theory is based on the concept that the expected value of a discrete variable can be
calculated as the probability-weighted average of its possible values. The expected value is
especially useful for decision makers because it represents the long-run average outcome implied
by the probabilities of the distribution function; it need not be the single most likely value.
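The expected-value calculation behind a simple decision can be sketched as below. The two choices, the demand levels, their probabilities and the payoffs are all hypothetical. The point is only to show each payoff being weighted by its probability and the choice with the highest expected value being selected.

```python
# Hypothetical probabilities of three demand levels
probabilities = {"low": 0.3, "medium": 0.5, "high": 0.2}

# Hypothetical payoffs (in thousands) for each choice under each demand level
payoffs = {
    "launch product": {"low": -40, "medium": 30, "high": 90},
    "do not launch": {"low": 0, "medium": 0, "high": 0},
}


def expected_value(outcomes, probs):
    """Probability-weighted average of the outcomes."""
    return sum(probs[state] * outcomes[state] for state in probs)


expected = {choice: expected_value(out, probabilities) for choice, out in payoffs.items()}
best_choice = max(expected, key=expected.get)

print(expected, best_choice)
```

Note that the expected value of launching is not itself one of the three possible payoffs; it is an average over many hypothetical repetitions of the decision.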
Modeling and Simulation: A model describes a situation through a series of equations and allows
testing of the impact of changes in various factors. It is substantially more time-consuming to
construct and generally requires user programming or purchase of packages such as SIMSCRIPT.
Can be very powerful in developing and testing strategies that are otherwise non-evident.
Certainty Models: Give only the most likely outcome; "what if" analysis is often done
with computer-based spreadsheets.
Probabilistic Models: Use Monte Carlo simulation techniques to deal with uncertainty; give a
range of possible outcomes for each set of events.
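A probabilistic model of this kind might look like the sketch below. It draws uncertain inputs at random many times and reports a range of outcomes rather than a single figure. The profit equation, the input distributions and every number in it are assumptions invented for the example.

```python
import random


def simulate_profit(rng):
    """One random scenario of a hypothetical profit model."""
    units = rng.gauss(1000, 150)        # uncertain demand
    price = rng.uniform(9.0, 11.0)      # uncertain selling price
    unit_cost = rng.uniform(6.0, 8.0)   # uncertain variable cost
    fixed_cost = 2000
    return units * (price - unit_cost) - fixed_cost


rng = random.Random(42)                 # fixed seed so the run is repeatable
profits = sorted(simulate_profit(rng) for _ in range(10_000))

mean_profit = sum(profits) / len(profits)
low = profits[int(0.05 * len(profits))]     # approximate 5th percentile
high = profits[int(0.95 * len(profits))]    # approximate 95th percentile

print(f"mean {mean_profit:.0f}, 90% of runs between {low:.0f} and {high:.0f}")
```

A certainty model would report only the single figure obtained by plugging in the most likely inputs; the simulation also shows how far the result may plausibly fall on either side.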
Forecasting in Business
Business leaders and economists are continually involved in the process of trying to forecast, or
predict, the future of business in the economy. Business leaders engage in this process because
much of what happens in businesses today depends on what is going to happen in the future.
For example, if a business is trying to make a decision
about developing a revolutionary new automobile, it would be nice to know whether the
economy is going to be in a recession or whether it will be booming when the automobile is
released to the general public. If there is a recession, consumers will not buy the automobile
unless it can save them money, and the manufacturer will have spent millions or billions of
dollars on the development of a product that might not sell.
The process of attempting to forecast the future is not new. Most ancient civilizations used some
method for predicting the future. Today, computers with elaborate programs are often used to
develop models to forecast future economic and business activity. Contemporary models of
economic and business forecasting have been developed in the last century. Today's forecasting
models are considerably more statistical than they were hundreds of years ago when the stars,
and other mystical methods, were used to predict the future. Almost every large business or
government agency performs some type of formalized forecasting.
Forecasting in business is closely related to understanding the business cycle. The foundations of
modern forecasting were laid in 1865 by William Stanley Jevons, who argued that
manufacturing had replaced agriculture as the dominant sector in English society. He studied
economic fluctuations and the way limits on coal production constrained economic
development.
Forecasting has become big business around the world. Forecasters try to predict what the stock
markets will do, what the economy will do, what numbers to pick in the lottery, who will win
sporting events, and almost anything one might name. Regardless of who does it, forecasting is
done to identify what is likely to happen in the future so as to be able to benefit most from the
events.
QUALITATIVE FORECASTING MODELS
Qualitative forecasting models have often proven to be most effective for short-term projections.
In this method of forecasting, which works best when the scope is limited, experts in the
appropriate fields are asked to agree on a common forecast. Two methods are used frequently.
Delphi Method. This method involves asking various experts what they anticipate will happen
in the future relative to the subject under consideration. Experts in the automotive industry, for
example, might be asked to forecast likely innovative enhancements for cars five years from
now. They are not expected to be precise, but rather to provide general opinions.
Market Research Method. This method involves surveys and questionnaires about people's
subjective reactions to changes. For example, a company might develop a new way to launder
clothes; after people have had an opportunity to try the new method, they would be asked for
feedback about how to improve the process or how it might be made more appealing for the
general public. This method is difficult because it is hard to identify an appropriate sample that is
representative of the larger audience for whom the product is intended.
QUANTITATIVE FORECASTING MODELS
Three quantitative methods are in common use.
Time-Series Methods. This forecasting model uses historical data to try to predict future events.
For example, assume that you are interested in knowing how long a recession will last. You
might look at all past recessions and the events leading up to and surrounding them and then,
from that data, try to predict how long the current recession will last.
A specific variable in the time series is identified by the series name and date. If gross domestic
product (GDP) is the variable, it might be identified as GDP2000.1 for the first-quarter statistics
for the year 2000. This is just one example, and different groups might use different methods to
identify variables in a time period.
Many government agencies prepare and release time-series data. The Federal Reserve, for
example, collects data on monetary policy and financial institutions and publishes that data in the
Federal Reserve Bulletin. These data become the foundation for making decisions about
regulating the growth of the economy.
Time-series models provide accurate forecasts when the changes that occur in the variable's
environment are slow and consistent. When large-degree changes occur, the forecasts are not
reliable for the long term. Since time-series forecasts are relatively easy and inexpensive to
construct, they are used quite extensively.
The Indicator Approach. The U.S. government is a primary user of the indicator approach of
forecasting. The government uses such indicators as the Composite Index of Leading, Lagging,
and Coincident Indicators, often referred to as Composite Indexes. The indexes predict by
assuming that past trends and relationships will continue into the future. The government indexes
are made by averaging the behavior of the different indicator series that make up each composite
series.
The timing and strength of each indicator series' relationship with general business activity,
reflected in the business cycle, change over time. This variability makes forecasting changes in
the business cycle difficult.
Econometric Models. Econometric models are causal models that statistically identify the
relationships between variables and how changes in one or more variables cause changes in
another variable. Econometric models then use the identified relationship to predict the future.
Econometric models are also called regression models.
There are two types of data used in regression analysis. Economic forecasting models
predominantly use time-series data, where the values of the variables change over time.
Additionally, cross-section data, which capture the relationship between variables at a single
point in time, are used. A lending institution, for example, might want to determine what
influences the sale of homes. It might gather data on home prices, interest rates, and statistics on
the homes being sold, such as size and location. This is the cross-section data that might be used
with time-series data to try to determine such things as what size home will sell best in which
location.
An econometric model is a way of determining the strength and statistical significance of a
hypothesized relationship. These models are used extensively in economics to prove, disprove,
or validate the existence of a causal relationship between two or more variables. It is obvious that
this model is highly mathematical, using different statistical equations.
For the sake of simplicity, mathematical analysis is not addressed here. Besides these
qualitative and quantitative forecasting models, there are others that are equally sophisticated;
however, the discussion here should provide a general sense of the nature of forecasting models.
THE FORECASTING PROCESS
When beginning the forecasting process, there are typical steps that must be followed. These
steps follow an acceptable decision-making process that includes the following elements:
1. Identification of the problem. Forecasters must identify what is going to be forecasted, or
what is of primary concern. There must be a timeline attached to the forecasting period.
This will help the forecasters to determine the methods to be used later.
2. Theoretical considerations. It is necessary to determine what forecasting has been done
in the past using the same variables and how relevant these data are to the problem that is
currently under consideration. It must also be determined what economic theory has to
say about the variables that might influence the forecast.
3. Data concerns. How easily the data needed to make the forecasts can be collected
is a significant issue.
4. Determination of the assumption set. The forecaster must identify the assumptions that
will be made about the data and the process.
5. Modeling methodology. After careful examination of the problem, the types of models
most appropriate for the problem must be determined.
6. Preparation of the forecast. This is the analysis part of the process. After the model to be
used is determined, the analysis can begin and the forecast can be prepared.
7. Forecast verification. Once the forecasts have been made, the analyst must determine
whether they are reasonable and how they can be compared against the actual behavior of
the data.
Each of the seven steps has substages; however, the steps that have been presented are the major
concerns to the forecaster. Those with a deep interest in forecasting might pursue more in-depth
treatments.
FORECASTING CONCERNS
Forecasting does present some problems. Even though very detailed and sophisticated
mathematical models might be used, they do not always predict correctly. There are some who
would argue that the future cannot be predicted at all, period!
Some of the concerns about forecasting the future are that (1) predictions are made using
historical data, (2) they fail to account for unique events, and (3) they ignore coevolution
(developments created by our own actions). Additionally, there are psychological challenges
implicit in forecasting. An example of a psychological challenge is when plans based on
forecasts that use historical data become so confining as to prohibit management freedom. It is
also a concern that many decision makers feel that because they have the forecasting data in
hand they have control over the future.

Statistical inference
Population- the collection of objects having some common characteristic of interest
that is under consideration for a statistical investigation.
Sample- a finite subset of the population.
Sampling error- the inherent and unavoidable error caused by approximating a
characteristic of the population from a sample.
Random sample- a sample of n objects selected from a population such that each object
is equally likely to be selected.
Standard error- the standard deviation of the sampling distribution.
Confidence interval and confidence limits- In order to estimate the population mean we
cannot draw every possible sample from the population. So we
set up limits on both sides of the sample mean, on the basis that the means
of samples are normally distributed around the population mean. These limits are
called confidence limits and the range between the two is called the confidence interval.
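The standard error and confidence limits defined above can be sketched as follows. This minimal example assumes the population standard deviation is known (2.5 here), that sample means are normally distributed, and that a 95% confidence level is wanted. The ten observations are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

sample = [49.2, 51.8, 50.5, 48.9, 52.3, 50.1, 49.7, 51.2, 50.8, 49.5]  # hypothetical data
population_sd = 2.5        # assumed known for this illustration
confidence = 0.95

n = len(sample)
sample_mean = mean(sample)
standard_error = population_sd / sqrt(n)      # standard deviation of the sample mean

# z value leaving (1 - confidence) / 2 in each tail of the standard normal
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

lower = sample_mean - z * standard_error      # lower confidence limit
upper = sample_mean + z * standard_error      # upper confidence limit

print(f"mean {sample_mean:.2f}, 95% confidence interval ({lower:.2f}, {upper:.2f})")
```

When the population standard deviation is not known and the sample is small, the z value would normally be replaced by a value from the t distribution.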
The field of statistical inference consists of those methods used to make decisions or
draw conclusions about a population. These methods utilize the information
contained in a sample from the population in drawing conclusions.
Point Estimation

Hypothesis Testing

For example, suppose that we are interested in the burning rate of a solid propellant used
to power aircrew escape systems.
Now burning rate is a random variable that can be described by a probability
distribution.
Suppose that our interest focuses on the mean burning rate (a parameter of this
distribution).
Specifically, we are interested in deciding whether or not the mean burning rate is
50 centimeters per second.
Null hypothesis- the hypothesis which is being tested for possible rejection.
Alternative hypothesis- the hypothesis which is accepted when the null hypothesis
is rejected.
Critical region- the set of all those samples which lead to the rejection of the null
hypothesis.
Level of significance- the probability of rejecting the null hypothesis when it is
actually true.
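Two-sided Alternative Hypothesis
H0: μ = 50 centimeters per second
H1: μ ≠ 50 centimeters per second
One-sided Alternative Hypotheses
H0: μ = 50 centimeters per second
H1: μ < 50 centimeters per second, or H1: μ > 50 centimeters per second
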
Test of a Hypothesis
A procedure leading to a decision about a particular hypothesis.
Hypothesis-testing procedures rely on using the information in a random sample
from the population of interest.
If this information is consistent with the hypothesis, then we will not reject the
hypothesis; if this information is inconsistent with the hypothesis, we will
conclude that the hypothesis is false.

The power is computed as 1 - β, where β is the probability of a type II error (failing to
reject a false null hypothesis). Power can be interpreted as the probability of
correctly rejecting a false null hypothesis. We often compare statistical tests by
comparing their power properties.
For example, consider the propellant burning rate problem when we are testing
H0: μ = 50 centimeters per second against H1: μ ≠ 50 centimeters per second.
Suppose that the true value of the mean is μ = 52. When n = 10, we found that β =
0.2643, so the power of this test is 1 - β = 1 - 0.2643 = 0.7357 when μ = 52.
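The type II error probability and the power in the propellant example can be sketched as below. The notes do not state the population standard deviation or the acceptance region that produced 0.2643, so the values used here (standard deviation 2.5, and not rejecting when the sample mean lies between 48.5 and 51.5) are assumptions. Under them the result comes out at roughly 0.26, with small differences from the quoted figure attributable to rounding.

```python
from math import sqrt
from statistics import NormalDist

sigma = 2.5                              # assumed population standard deviation
n = 10                                   # sample size given in the example
lower_limit, upper_limit = 48.5, 51.5    # assumed acceptance region for the sample mean
true_mean = 52.0                         # true mean given in the example

# Sampling distribution of the sample mean when the true mean is 52
sampling_dist = NormalDist(mu=true_mean, sigma=sigma / sqrt(n))

# beta = probability the sample mean falls in the acceptance region
# even though the null hypothesis (mean = 50) is false
beta = sampling_dist.cdf(upper_limit) - sampling_dist.cdf(lower_limit)
power = 1 - beta

print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Increasing n narrows the sampling distribution, which lowers the type II error probability and raises the power for the same acceptance region.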
General Procedure for Hypothesis Testing

T test
In many real-life problems the population mean is known but the exact population standard
deviation cannot be calculated. In such cases the t test is used.
Sample size of 30-40
Types
One sample t test- used to compare the mean of a single sample with the
population mean.
Example: An economist wants to know if the per capita income of a particular region is the same
as the national average.
Independent sample t test- used to detect differences between the means of two
independent groups.
Example: An economist wants to compare the per capita income of two different regions.
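The two t statistics can be sketched as below, using the per capita income setting from the examples. The income figures and the national average are hypothetical, and the independent-sample version assumes the two regions have equal population variances (the pooled-variance form).

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, population_mean):
    """t = (sample mean - population mean) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    return (mean(sample) - population_mean) / (stdev(sample) / sqrt(n))


def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic, with n_a + n_b - 2 degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    pooled_var = ((na - 1) * stdev(sample_a) ** 2
                  + (nb - 1) * stdev(sample_b) ** 2) / (na + nb - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / na + 1 / nb))


region_a = [52, 48, 55, 50, 45]     # hypothetical per capita incomes (thousands)
region_b = [40, 44, 38, 42, 41]
national_average = 47               # hypothetical

t_one = one_sample_t(region_a, national_average)
t_two = independent_t(region_a, region_b)

print(f"one-sample t = {t_one:.3f}, independent-sample t = {t_two:.3f}")
```

Each statistic is then compared with the critical value of the t distribution for the stated degrees of freedom and the chosen level of significance.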
Z test
For the z test the population mean and population standard deviation should be known.
Requires a large sample size.
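A z test on a sample mean can be sketched as follows, assuming a two-sided alternative. The sample mean, population mean, population standard deviation and sample size are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, population_mean, population_sd, n):
    """Return the z statistic and its two-sided p-value."""
    z = (sample_mean - population_mean) / (population_sd / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = z_test(sample_mean=51.3, population_mean=50, population_sd=2, n=25)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The null hypothesis is rejected when the p-value is smaller than the chosen level of significance.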
Analysis of variance
ANOVA is used to compare the means of more than two populations.
It has extensive application in consumer behavior and marketing management related
problems.
Example: A marketing manager wants to investigate the impact of different discount schemes
on the sales of three major brands of edible oil.
F statistic
ANOVA uses the F statistic, which tests if the means of the groups formed by one
independent variable or a combination of independent variables are significantly
different. It is based on the comparison of variances.
Conditions- the dependent variable should be interval or ratio, and the populations should be
normally distributed.
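The comparison of variances behind the F statistic can be sketched for a one-way ANOVA as below. The three groups stand for sales under three hypothetical discount schemes; the figures are invented for the example.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)      # k - 1 degrees of freedom
    ms_within = ss_within / (n - k)        # n - k degrees of freedom
    return ms_between / ms_within


sales_by_scheme = [[20, 22, 24], [28, 30, 32], [24, 26, 28]]   # hypothetical sales
f_stat = one_way_anova_f(sales_by_scheme)

print(f"F = {f_stat:.2f}")
```

A large F value means the group means differ by more than the variation within groups would explain; it is compared with the critical F value for (k - 1, n - k) degrees of freedom.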
Chi square test
One of the popular methods for testing hypotheses on discrete data.
It is used to test the hypothesis that two categorical variables are independent of each
other.
Example: An organization's researcher wants to determine if the satisfaction level of the firm is
dependent on their placements.
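The chi-square statistic for a test of independence can be sketched as below. The 2 x 2 table of counts (satisfied or not satisfied, against two placement categories) is hypothetical. Expected counts are computed on the assumption that the two variables are independent.

```python
def chi_square_statistic(observed):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


# Rows: satisfied / not satisfied; columns: placement A / placement B (hypothetical counts)
observed = [[30, 20],
            [20, 30]]
chi2, dof = chi_square_statistic(observed)

print(f"chi-square = {chi2:.2f} with {dof} degree(s) of freedom")
```

With 1 degree of freedom the 5% critical value is 3.841, so a statistic above that would lead to rejecting independence at that level.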
