P(A_1 \mid B) = \frac{P(A_1 \text{ and } B)}{P(B)} = \frac{P(A_1 \text{ and } B)}{P(A_1 \text{ and } B) + P(A_2 \text{ and } B)} = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}

Note: the denominator is determined by the Law of Total Probability.
Solution to part b (continued):
Use Bayes Theorem and your tree diagram to answer the question:
P(A_1 \mid B) = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)} = \frac{P(A_1 \text{ and } B)}{P(A_1 \text{ and } B) + P(A_2 \text{ and } B)} = \frac{0.148}{0.148 + 0.126} = \frac{0.148}{0.274} \approx 0.540
The probabilities needed for the computation are easily obtained from our tree diagram.
We already found P(A_1 \text{ and } B) + P(A_2 \text{ and } B), which is P(B), for part a) of this example, and P(A_1 \text{ and } B) is obtained by following the tree-diagram path A_1 \to B; the product of the corresponding probabilities is 0.148.
Solution to part c:
Suppose a customer visited the showroom but did not purchase a car. What is the probability
that the customer was a man?
Express the question in probability notation:
We can rewrite the question as, "What is the probability that the customer was a man, given that the customer did not purchase an automobile?" That is, we want to find

P(A_2 \mid B^C)
Use Bayes Theorem and your tree diagram to answer the question:
P(A_2 \mid B^C) = \frac{P(A_2)\,P(B^C \mid A_2)}{P(A_2)\,P(B^C \mid A_2) + P(A_1)\,P(B^C \mid A_1)} = \frac{0.474}{0.474 + 0.252} = \frac{0.474}{0.726} \approx 0.653
Again, the probabilities needed for the computation are easily obtained from our tree diagram.
Additional Notes:
The probabilities P(A_1) and P(A_2) are called prior probabilities because they are initial or prior probability estimates for specific events of interest. When we obtain new information about the events, we can update the prior probability values by calculating revised probabilities, referred to as posterior probabilities. The conditional probabilities P(A_1 \mid B), P(A_2 \mid B), P(A_1 \mid B^C), and P(A_2 \mid B^C) are posterior probabilities. Bayes' Theorem enables us to compute these posterior probabilities.
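For readers who want to check these posterior probabilities numerically, here is a minimal Python sketch that recomputes them from the four joint ("and") probabilities read off the tree diagram of this example; the variable names are illustrative only.

    # Joint probabilities taken from the tree-diagram branches of this example
    p_A1_and_B  = 0.148   # P(A1 and B)
    p_A2_and_B  = 0.126   # P(A2 and B)
    p_A1_and_Bc = 0.252   # P(A1 and B^C)
    p_A2_and_Bc = 0.474   # P(A2 and B^C)

    # Law of Total Probability gives the denominators
    p_B  = p_A1_and_B + p_A2_and_B      # 0.274
    p_Bc = p_A1_and_Bc + p_A2_and_Bc    # 0.726

    # Posterior probabilities via Bayes' Theorem
    print(f"{p_A1_and_B / p_B:.3f}")    # P(A1 | B)   ~ 0.540  (part b)
    print(f"{p_A2_and_Bc / p_Bc:.3f}")  # P(A2 | B^C) ~ 0.653  (part c)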
Example 5:
Let's return to the scenario that began our discussion: A particular test correctly identifies those with a certain serious disease 94% of the time and correctly diagnoses those without the disease 98% of the time. A friend has just informed you that he has received a positive result and asks for your advice about how to interpret these probabilities. He knows nothing about probability, but he feels that because the test is quite accurate, the probability that he does have the disease is quite high, likely in the 95% range. Before attempting to address your friend's concern, you research the illness and discover that 4% of men have this disease. What is the probability your friend actually has the disease?
Define the events:
A_1 = a man has this disease
A_2 = a man does not have this disease
B = positive test result
B^C = negative test result
Express the given information and question in probability notation:
The test correctly identifies those with a certain serious disease 94% of the time: P(B \mid A_1) = 0.94
The test correctly diagnoses those without the disease 98% of the time: P(B^C \mid A_2) = 0.98
You discover that 4% of men have this disease: P(A_1) = 0.04
This statement also tells us that 96% of men do not have the disease: P(A_2) = 0.96
What is the probability your friend actually has the disease (given a positive result)? P(A_1 \mid B) = ?
Construct a tree diagram:
Use Bayes Theorem and your tree diagram to answer the question:
P(A_1 \mid B) = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)} = \frac{0.0376}{0.0376 + 0.0192} \approx 0.662
There is a 66.2% probability that he actually has the disease. The probability is high, but
considerably lower than your friend feared.
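As a quick numeric check, the sketch below generalizes this calculation into a small Python function; the parameter names (prior, sensitivity, specificity) are my own labels for the three given probabilities and do not appear in the text.

    def posterior_given_positive(prior, sensitivity, specificity):
        """P(disease | positive test) for a two-outcome Bayes problem."""
        p_pos_given_disease = sensitivity           # P(B | A1) = 0.94
        p_pos_given_healthy = 1.0 - specificity     # P(B | A2) = 0.02
        numerator = prior * p_pos_given_disease
        denominator = numerator + (1.0 - prior) * p_pos_given_healthy
        return numerator / denominator

    print(f"{posterior_given_positive(0.04, 0.94, 0.98):.3f}")  # ~ 0.662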
A probability distribution for a discrete random variable is a listing of all possible distinct outcomes and their probabilities of occurring. Since all possible outcomes are listed, the probabilities must sum to 1.0.
Example: Coin Flips
Suppose we let the random variable be X = the number of heads in three flips of a fair coin.
Then:
P(HHH) = 1/8, P(HHT) = 1/8, P(HTH) = 1/8, P(THH) = 1/8,
P(TTH) = 1/8, P(THT) = 1/8, P(HTT) = 1/8, P(TTT) = 1/8.
x       0      1      2      3
p(x)   1/8    3/8    3/8    1/8
Suppose instead the coin is weighted so that:
P(HHH) = 1/27, P(HHT) = 2/27, P(HTH) = 2/27,
P(THH) = 2/27, P(TTH) = 4/27, P(THT) = 4/27,
P(HTT) = 4/27, P(TTT) = 8/27.
x       0       1       2       3
p(x)   8/27   12/27    6/27    1/27
Both satisfy the definition of a probability distribution because all outcomes (0, 1, 2, and 3) are
listed and the sum of the probabilities equals 1.0.
The Expected Value or Average of a Random Variable
The mean (\mu_x) of a probability distribution is called the expected value of the random variable.
The expected value of a random variable is defined as its weighted average over all possible
outcomes with the weights being the relative frequency or probability associated with each of the
outcomes.
E(X) = \mu_x = \sum_{i=1}^{N} X_i \, P(X_i)

where
X = random variable of interest
X_i = the i-th outcome of X
P(X_i) = probability of occurrence of the i-th outcome of X
i = 1, 2, ..., N
N = the number of outcomes for X
Example: Coin Flips

x       0      1      2      3
p(x)   1/8    3/8    3/8    1/8

\mu_x = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 3/2 = 1.5
Variance and Standard Deviation of a Random Variable
The variance (\sigma_x^2) of a random variable is defined as the weighted average of the squared differences between each possible outcome and the average value of the outcomes, with the weights being the probability associated with each of the outcomes.
\sigma_x^2 = \sum_{i=1}^{N} (X_i - \mu_x)^2 \, P(X_i)

where
X = random variable of interest
X_i = the i-th outcome of X
P(X_i) = probability of occurrence of the i-th outcome of X
i = 1, 2, ..., N
N = the number of outcomes for X
In addition, the standard deviation, \sigma_x, of the probability distribution of a random variable is the square root of the variance and is given by:

\sigma_x = \sqrt{\sum_{i=1}^{N} (X_i - \mu_x)^2 \, P(X_i)}
Example: Coin Flips

x       0      1      2      3
p(x)   1/8    3/8    3/8    1/8

\sigma_x^2 = (0 - 3/2)^2 (1/8) + (1 - 3/2)^2 (3/8) + (2 - 3/2)^2 (3/8) + (3 - 3/2)^2 (1/8) = 24/32 = 3/4 = 0.75

\sigma_x = \sqrt{0.75} \approx 0.866
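The same calculations can be reproduced with a few lines of Python; the sketch below handles the fair-coin distribution above, and the function name is chosen here purely for illustration.

    from math import sqrt

    def summarize(outcomes, probs):
        """Mean, variance, and standard deviation of a discrete distribution."""
        assert abs(sum(probs) - 1.0) < 1e-9        # probabilities must sum to 1
        mean = sum(x * p for x, p in zip(outcomes, probs))
        var = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))
        return mean, var, sqrt(var)

    # Fair-coin distribution: X = number of heads in three flips
    print(summarize([0, 1, 2, 3], [1/8, 3/8, 3/8, 1/8]))  # (1.5, 0.75, ~0.866)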
FORECASTING
Business forecasting has always been one component of running an enterprise. However,
forecasting traditionally was based less on concrete and comprehensive data than on face-to-face
meetings and common sense. In recent years, business forecasting has developed into a much
more scientific endeavor, with a host of theories, methods, and techniques designed for
forecasting certain types of data. The development of information technologies and the Internet
propelled this development into overdrive, as companies not only adopted such technologies into
their business practices, but into forecasting schemes as well.
Business forecasting involves a wide range of tools, including simple electronic spreadsheets,
enterprise resource planning (ERP) and electronic data interchange (EDI) networks, advanced
supply chain management systems, and other Web-enabled technologies. The practice attempts
to pinpoint key factors in business production and extrapolate from given data sets to produce
accurate projections for future costs, revenues, and opportunities. This normally is done with an
eye toward adjusting current and near-future business practices to take maximum advantage of
expectations.
There are three models of business forecasting systems.
In the time-series model, data is simply projected forward based on an established method, of which there are several, including the moving average, the simple average, exponential smoothing, decomposition, and Box-Jenkins. Each of these methods applies various formulas to
the same basic premise: data patterns from the recent past will continue more or less unabated
into the future. To conduct a forecast using the time-series model, one need only plug available
historical data into the formulas established by one or more of the above methods. Obviously, the
time-series model is the most useful means for forecasting when the relevant historical data
reveals smooth and stable patterns. Where jumps and anomalies do occur, the time-series model may still be useful, provided those jumps can be accounted for.
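As a concrete illustration of the time-series idea, the sketch below produces a naive moving-average forecast from a short, made-up sales history; the data and the window size are purely illustrative.

    def moving_average_forecast(history, window=3):
        """Forecast the next value as the average of the last `window` observations."""
        recent = history[-window:]
        return sum(recent) / len(recent)

    sales = [120, 132, 128, 141, 150, 147]    # hypothetical monthly sales
    print(moving_average_forecast(sales))     # (141 + 150 + 147) / 3 = 146.0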
The second forecasting model is cause-and-effect. In this model, one assumes a cause, or driver
of activity, that determines an outcome. For instance, a company may assume that, for a
particular data set, the cause is an investment in information technology, and the effect is sales.
This model requires the historical data not only of the factor with which one is concerned (in this
case, sales), but also of that factor's determined cause (here, information technology
expenditures). It is assumed, of course, that the cause-and-effect relationship is relatively stable
and easily quantifiable.
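A cause-and-effect forecast of this kind is often fitted as a simple regression; the sketch below uses NumPy's least-squares polynomial fit on invented IT-spend and sales figures, so all numbers are assumptions for illustration only.

    import numpy as np

    it_spend = np.array([1.0, 1.5, 2.0, 2.5, 3.0])        # hypothetical IT investment ($M)
    sales    = np.array([10.2, 11.1, 12.3, 12.9, 14.0])   # hypothetical sales ($M)

    slope, intercept = np.polyfit(it_spend, sales, 1)     # fit sales = slope * spend + intercept
    print(round(slope, 2), round(intercept, 2))
    print(round(slope * 3.5 + intercept, 2))              # forecast sales at $3.5M IT spend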
The third primary forecasting model is known as the judgmental model. In this case, one
attempts to produce a forecast where there is no useful historical data. A company might choose
to use the judgmental model when it attempts to project sales for a brand new product, or when
market conditions have qualitatively changed, rendering previous data obsolete.
FORECASTING METHODS
Multiple Regression Analysis: Used when two or more independent factors are involved; widely used for intermediate-term forecasting. Used to assess which factors to include and which to exclude. Can be used to develop alternative models with different factors.
Nonlinear Regression: Does not assume a linear relationship between variables; frequently used when time is the independent variable.
Trend Analysis: Uses linear and nonlinear regression with time as the explanatory variable; used where there is a pattern over time.
Decomposition Analysis: Used to identify several patterns that appear simultaneously in a time series; time-consuming each time it is used; also used to deseasonalize a series.
Moving Average Analysis: Simple moving averages forecast future values based on an average of recent past values; easy to update.
Weighted Moving Averages: Very powerful and economical. They are widely used where repeated forecasts are required; they use methods such as sum-of-the-digits and trend adjustment.
Adaptive Filtering: A type of moving average that includes a method of learning from past errors; can respond to changes in the relative importance of trend, seasonal, and random factors.
Exponential Smoothing: A moving-average form of time-series forecasting; efficient to use with seasonal patterns; easy to adjust for past errors; easy to prepare follow-on forecasts; ideal for situations where many forecasts must be prepared; several different forms are used, depending on the presence of trend or cyclical variations.
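To make the exponential smoothing idea concrete, here is a minimal single-parameter (simple) exponential smoothing sketch; the demand series and the smoothing constant alpha are invented for illustration.

    def exponential_smoothing(series, alpha=0.3):
        """Simple exponential smoothing; returns the smoothed series.
        The last smoothed value serves as the one-step-ahead forecast."""
        smoothed = [series[0]]                 # initialize with the first observation
        for value in series[1:]:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
        return smoothed

    demand = [120, 132, 128, 141, 150, 147]   # hypothetical demand history
    print(round(exponential_smoothing(demand)[-1], 1))   # one-step-ahead forecast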
Decision trees - Decision trees originally evolved as graphical devices to help illustrate the
structural relationships between alternative choices. These trees were originally presented as a
series of yes/no (dichotomous) choices. As our understanding of feedback loops improved,
decision trees became more complex. Their structure became the foundation of computer flow
charts.
Computer technology has made it possible to create very complex decision trees consisting of many subsystems and feedback loops. Decisions are no longer limited to dichotomies; they now involve assigning probabilities to the likelihood of any particular path.
Decision theory is based on the concept that the expected value of a discrete variable can be calculated as its probability-weighted average. The expected value is especially useful for decision makers because it represents the long-run average outcome implied by the probabilities of the distribution function.
Modeling and Simulation: The model describes the situation through a series of equations and allows testing of the impact of changes in various factors. Substantially more time-consuming to construct; generally requires user programming or the purchase of packages such as SIMSCRIPT. Can be very powerful in developing and testing strategies that are otherwise not evident.
Certainty models give only the most likely outcome; "what if" analysis can be carried out with advanced, computer-based spreadsheets.
Probabilistic models use Monte Carlo simulation techniques to deal with uncertainty and give a range of possible outcomes for each set of events.
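For the probabilistic case, a Monte Carlo simulation can be as simple as the following sketch, which draws demand from an assumed normal distribution and reports a range of simulated profit outcomes; all parameters here are invented for illustration.

    import random

    def simulate_profit(n_trials=10_000, price=25.0, unit_cost=17.0, fixed_cost=3_000.0):
        """Monte Carlo profit simulation with demand ~ Normal(mean=500, sd=80)."""
        profits = []
        for _ in range(n_trials):
            demand = max(0.0, random.gauss(500, 80))    # assumed demand distribution
            profits.append(demand * (price - unit_cost) - fixed_cost)
        profits.sort()
        return profits[int(0.05 * n_trials)], profits[int(0.95 * n_trials)]  # 5th/95th percentiles

    low, high = simulate_profit()
    print(round(low), round(high))   # approximate range of likely profit outcomes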
Forecasting in Business
Business leaders and economists are continually involved in the process of trying to forecast, or
predict, the future of business in the economy. Business leaders engage in this process because
much of what happens in businesses today depends
on what is going to happen in the future. For example, if a business is trying to make a decision
about developing a revolutionary new automobile, it would be nice to know whether the
economy is going to be in a recession or whether it will be booming when the automobile is
released to the general public. If there is a recession, consumers will not buy the automobile
unless it can save them money, and the manufacturer will have spent millions or billions of
dollars on the development of a product that might not sell.
The process of attempting to forecast the future is not new. Most ancient civilizations used some
method for predicting the future. Today, computers with elaborate programs are often used to
develop models to forecast future economic and business activity. Contemporary models of
economic and business forecasting have been developed in the last century. Today's forecasting
models are considerably more statistical than they were hundreds of years ago when the stars,
and other mystical methods, were used to predict the future. Almost every large business or
government agency performs some type of formalized forecasting.
Forecasting in business is closely related to understanding the business cycle. The foundations of
modern forecasting were laid in 1865 by William Stanley Jevons, who argued that
manufacturing had replaced agriculture as the dominant sector in English society. He studied the
effects of economic fluctuations, and of coal production as a limiting factor, on economic development.
Forecasting has become big business around the world. Forecasters try to predict what the stock
markets will do, what the economy will do, what numbers to pick in the lottery, who will win
sporting events, and almost anything one might name. Regardless of who does it, forecasting is
done to identify what is likely to happen in the future so as to be able to benefit most from the
events.
QUALITATIVE FORECASTING MODELS
Qualitative forecasting models have often proven to be most effective for short-term projections.
In this method of forecasting, which works best when the scope is limited, experts in the
appropriate fields are asked to agree on a common forecast. Two methods are used frequently.
Delphi Method. This method involves asking various experts what they anticipate will happen
in the future relative to the subject under consideration. Experts in the automotive industry, for
example, might be asked to forecast likely innovative enhancements for cars five years from
now. They are not expected to be precise, but rather to provide general opinions.
Market Research Method. This method involves surveys and questionnaires about people's
subjective reactions to changes. For example, a company might develop a new way to launder
clothes; after people have had an opportunity to try the new method, they would be asked for
feedback about how to improve the processes or how it might be made more appealing for the
general public. This method is difficult because it is hard to identify an appropriate sample that is
representative of the larger audience for whom the product is intended.
QUANTITATIVE FORECASTING MODELS
Three quantitative methods are in common use.
Time-Series Methods. This forecasting model uses historical data to try to predict future events.
For example, assume that you are interested in knowing how long a recession will last. You
might look at all past recessions and the events leading up to and surrounding them and then,
from that data, try to predict how long the current recession will last.
A specific variable in the time series is identified by the series name and date. If gross domestic
product (GDP) is the variable, it might be identified as GDP2000.1 for the first-quarter statistics
for the year 2000. This is just one example, and different groups might use different methods to
identify variables in a time period.
Many government agencies prepare and release time-series data. The Federal Reserve, for
example, collects data on monetary policy and financial institutions and publishes that data in the
Federal Reserve Bulletin. These data become the foundation for making decisions about
regulating the growth of the economy.
Time-series models provide accurate forecasts when the changes that occur in the variable's
environment are slow and consistent. When large-degree changes occur, the forecasts are not
reliable for the long term. Since time-series forecasts are relatively easy and inexpensive to
construct, they are used quite extensively.
The Indicator Approach. The U.S. government is a primary user of the indicator approach of
forecasting. The government uses such indicators as the Composite Index of Leading, Lagging,
and Coincident Indicators, often referred to as Composite Indexes. The indexes predict by
assuming that past trends and relationships will continue into the future. The government indexes
are made by averaging the behavior of the different indicator series that make up each composite
series.
The timing and strength of each indicator series' relationship with general business activity, as reflected in the business cycle, change over time. This shifting relationship makes forecasting changes in the business cycle difficult.
Econometric Models. Econometric models are causal models that statistically identify the
relationships between variables and how changes in one or more variables cause changes in
another variable. Econometric models then use the identified relationship to predict the future.
Econometric models are also called regression models.
There are two types of data used in regression analysis. Economic forecasting models
predominantly use time-series data, where the values of the variables change over time.
Additionally, cross-section data, which capture the relationship between variables at a single
point in time, are used. A lending institution, for example, might want to determine what
influences the sale of homes. It might gather data on home prices, interest rates, and statistics on
the homes being sold, such as size and location. This is the cross-section data that might be used
with time-series data to try to determine such things as what size home will sell best in which
location.
An econometric model is a way of determining the strength and statistical significance of a hypothesized relationship. These models are used extensively in economics to prove, disprove, or validate the existence of a causal relationship between two or more variables. Such models are, by nature, highly mathematical, relying on systems of statistical equations.
For the sake of simplicity, mathematical analysis is not addressed here. Just as there are these
qualitative and quantitative forecasting models, there are others equally as sophisticated;
however, the discussion here should provide a general sense of the nature of forecasting models.
THE FORECASTING PROCESS
When beginning the forecasting process, there are typical steps that must be followed. These
steps follow an acceptable decision-making process that includes the following elements:
1. Identification of the problem. Forecasters must identify what is going to be forecasted, or
what is of primary concern. There must be a timeline attached to the forecasting period.
This will help the forecasters to determine the methods to be used later.
2. Theoretical considerations. It is necessary to determine what forecasting has been done
in the past using the same variables and how relevant these data are to the problem that is
currently under consideration. It must also be determined what economic theory has to
say about the variables that might influence the forecast.
3. Data concerns. How easy it will be to collect the data needed to make the forecasts is a significant issue.
4. Determination of the assumption set. The forecaster must identify the assumptions that
will be made about the data and the process.
5. Modeling methodology. After careful examination of the problem, the types of models
most appropriate for the problem must be determined.
6. Preparation of the forecast. This is the analysis part of the process. After the model to be
used is determined, the analysis can begin and the forecast can be prepared.
7. Forecast verification. Once the forecasts have been made, the analyst must determine
whether they are reasonable and how they can be compared against the actual behavior of
the data.
Each of the seven steps has substages; however, the steps that have been presented are the major
concerns to the forecaster. Those with a deep interest in forecasting might pursue more in-depth
treatments.
FORECASTING CONCERNS
Forecasting does present some problems. Even though very detailed and sophisticated
mathematical models might be used, they do not always predict correctly. There are some who
would argue that the future cannot be predicted at all, period.
Some of the concerns about forecasting the future are that (1) predictions are made using
historical data, (2) they fail to account for unique events, and (3) they ignore coevolution
(developments created by our own actions). Additionally, there are psychological challenges
implicit in forecasting. An example of a psychological challenge is when plans based on
forecasts that use historical data become so confining as to prohibit management freedom. It is
also a concern that many decision makers feel that because they have the forecasting data in
hand they have control over the future.
Statistical inference
Population: a collection of objects having some common characteristic of interest that is under consideration for a statistical investigation.
Sample: a finite subset of the population.
Sampling error: the inherent and unavoidable error that arises when a sample is used to approximate a characteristic of the population.
Random sample: a sample of n objects selected from a population in such a way that every object is equally likely to be selected.
Standard error: the standard deviation of a sampling distribution.
Confidence interval and confidence limits: In estimating the population mean we cannot draw every possible sample from the population. Instead, we set limits on both sides of the sample estimate, relying on the fact that sample means are approximately normally distributed around the population mean. These limits are called confidence limits, and the range between them is called the confidence interval.
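As an illustration, a 95% confidence interval for a population mean can be computed from a sample as in the sketch below; the sample data are invented, and the t-based interval is one common choice under the normality assumption.

    from statistics import mean, stdev
    from math import sqrt

    sample = [49.2, 50.8, 50.1, 49.5, 51.0, 50.3, 49.9, 50.6]   # hypothetical measurements
    n = len(sample)
    x_bar, s = mean(sample), stdev(sample)
    t_crit = 2.365                   # t value for 95% confidence, n - 1 = 7 degrees of freedom
    margin = t_crit * s / sqrt(n)
    print(round(x_bar - margin, 2), round(x_bar + margin, 2))   # confidence limits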
The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population. These methods utilize the information contained in a sample from the population in drawing conclusions.
Point Estimation
Hypothesis Testing
For example, suppose that we are interested in the burning rate of a solid propellant used
to power aircrew escape systems.
Now burning rate is a random variable that can be described by a probability
distribution.
Suppose that our interest focuses on the mean burning rate (a parameter of this
distribution).
Specifically, we are interested in deciding whether or not the mean burning rate is
50 centimeters per second.
Null hypothesis: the hypothesis that is being tested for possible rejection.
Alternative hypothesis: the hypothesis that is accepted when the null hypothesis is rejected.
Critical region: the set of all those samples that lead to the rejection of the null hypothesis.
Level of significance: the probability of rejecting the null hypothesis when it is actually true.
Two-sided Alternative Hypothesis
One-sided Alternative Hypothesis
Test of a Hypothesis
A procedure leading to a decision about a particular hypothesis
Hypothesis-testing procedures rely on using the information in a random sample
from the population of interest.
If this information is consistent with the hypothesis, then we will conclude that the
hypothesis is true; if this information is inconsistent with the hypothesis, we will
conclude that the hypothesis is false.
The power is computed as 1 - β, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties.
For example, consider the propellant burning rate problem when we are testing H_0: μ = 50 centimeters per second against H_1: μ ≠ 50 centimeters per second. Suppose that the true value of the mean is μ = 52. When n = 10, we found that β = 0.2643, so the power of this test is 1 - β = 1 - 0.2643 = 0.7357 when μ = 52.
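The text does not state the values behind β = 0.2643, but the figure is consistent with a known standard deviation of σ = 2.5 cm/s and an acceptance region of 48.5 ≤ x̄ ≤ 51.5 with n = 10; under those assumptions, the sketch below reproduces β and the power approximately.

    from math import sqrt
    from statistics import NormalDist

    # Assumed setup (not stated in the text): sigma = 2.5 cm/s, n = 10,
    # and H0 is accepted when the sample mean falls between 48.5 and 51.5.
    sigma, n = 2.5, 10
    lower, upper = 48.5, 51.5
    true_mean = 52.0

    se = sigma / sqrt(n)                      # standard error of the sample mean
    z = NormalDist()                          # standard normal distribution
    beta = z.cdf((upper - true_mean) / se) - z.cdf((lower - true_mean) / se)
    print(round(beta, 4), round(1 - beta, 4)) # beta ~ 0.26, power ~ 0.74 (text quotes 0.2643 and 0.7357)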
General Procedure for Hypothesis Testing
T test
In many real-life problems a hypothesized population mean is available, but the exact population standard deviation cannot be calculated. In such cases the t test is used.
It is typically applied to smaller samples (up to roughly 30-40 observations).
Types
A one-sample t test is used to compare the mean of a single sample with the population mean.
Example: an economist wants to know if the per capita income of a particular region is the same as the national average.
An independent-samples t test is used to detect differences between the means of two independent groups.
Example: an economist wants to compare the per capita income of two different regions.
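Both tests are available in SciPy; the sketch below runs them on invented income samples, so the data and the reference value of 50 are assumptions for illustration only.

    from scipy import stats

    region_a = [48.2, 51.5, 49.8, 52.3, 50.1, 47.9, 51.0, 49.4]   # hypothetical incomes (thousands)
    region_b = [52.8, 54.1, 51.9, 55.0, 53.3, 52.5, 54.6, 53.0]

    # One-sample t test: is region A's mean income equal to a national average of 50?
    t1, p1 = stats.ttest_1samp(region_a, 50.0)

    # Independent-samples t test: do regions A and B have the same mean income?
    t2, p2 = stats.ttest_ind(region_a, region_b)

    print(round(p1, 3), round(p2, 4))   # compare each p-value with a significance level such as 0.05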
Z test
For a z test, the population mean and population standard deviation should be known, and the sample size should be large.
Analysis of variance
ANOVA is used to compare the means of more than two populations.
It has extensive applications in consumer behavior and marketing management problems.
Example: a marketing manager wants to investigate the impact of different discount schemes on the sales of three major brands of edible oil.
F statistic
ANOVA uses the F statistic, which tests whether the means of the groups formed by one independent variable, or by a combination of independent variables, are significantly different. It is based on a comparison of variances.
Conditions: the dependent variable should be measured on an interval or ratio scale, and the populations should be normally distributed.
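A one-way ANOVA of this kind can be run with SciPy's f_oneway; the three discount-scheme samples below are invented for illustration.

    from scipy import stats

    # Hypothetical weekly sales under three different discount schemes
    scheme_1 = [200, 214, 209, 221, 205]
    scheme_2 = [233, 228, 241, 236, 230]
    scheme_3 = [205, 212, 199, 208, 215]

    f_stat, p_value = stats.f_oneway(scheme_1, scheme_2, scheme_3)
    print(round(f_stat, 2), round(p_value, 4))   # a small p-value means at least one mean differs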
Chi square test
One of the popular methods for testing hypotheses on discrete (categorical) data.
It is used to test the hypothesis that two categorical variables are independent of each other.
Example: an organization's researchers want to determine whether satisfaction with the firm depends on placement.
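A chi-square test of independence can be run on a contingency table with SciPy; the 2x2 counts below are invented for illustration.

    from scipy import stats

    # Hypothetical contingency table: rows = placement group, columns = satisfied / not satisfied
    observed = [[45, 15],
                [30, 30]]

    chi2, p_value, dof, expected = stats.chi2_contingency(observed)
    print(round(chi2, 2), round(p_value, 4))   # a small p-value suggests the variables are not independent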