Fall 2015
Program:
MBA
Semester:
1st
Subject Code: MB0040
Subject Name: Statistics for Management
Q1- Statistics plays a vital role in almost every facet of human life. Describe the
functions of Statistics. Explain the applications of statistics.
Ans:
Meaning of Statistics: The practice or science of collecting and analysing numerical data in
large quantities, especially for the purpose of inferring proportions in a whole from those in a
representative sample.
Functions of statistics:
1. Presents facts in simple form: Statistics presents facts and figures in a definite form. That
makes a statement more logical and convincing than mere description. It condenses a whole
mass of figures into a single figure, which makes the problem intelligible.
2. Reduces the Complexity of data: Statistics simplifies the complexity of data. The raw
data are unintelligible. We make them simple and intelligible by using different statistical
measures. Some such commonly used measures are graphs, averages, dispersions, skewness,
kurtosis, correlation and regression. These measures help in interpretation and drawing
inferences. Therefore, statistics enables one to enlarge the horizon of one's knowledge.
3. Facilitates comparison: Comparison between different sets of observations is an important
function of statistics. Comparison is necessary to draw conclusions; as Professor Boddington
rightly points out, the object of statistics is to enable comparison between past and present
results, to ascertain the reasons for changes which have taken place and the effect of such
changes in the future. So, to determine the efficiency of any measure, comparison is
necessary. Statistical devices like averages, ratios and coefficients are used for the purpose of
comparison.
4. Testing hypothesis: Formulating and testing of hypothesis is an important function of
statistics. This helps in developing new theories. So statistics examines the truth and helps in
innovating new ideas.
5. Formulation of Policies: Statistics helps in formulating plans and policies in different
fields. Statistical analysis of data forms the beginning of policy formulations. Hence,
statistics is essential for planners, economists, scientists and administrators to prepare
different plans and programmes.
6. Forecasting: The future is uncertain. Statistics helps in forecasting the trend and
tendencies. Statistical techniques are used for predicting the future values of a variable. For
example, a producer forecasts his future production on the basis of present demand
conditions and his past experience. Similarly, planners can forecast the future population
considering present population trends.
7. Derives valid inferences: Statistical methods mainly aim at deriving inferences from an
enquiry. Statistical techniques are often used by scholars, planners and scientists to evaluate
different projects. These techniques are also used to draw inferences regarding population
parameters on the basis of sample information.
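Point 7, drawing inferences about population parameters from sample information, can be sketched with a simple confidence-interval calculation. This is a minimal illustration using only Python's standard library; the sample values are invented for the example:

```python
import statistics

# Hypothetical sample of 25 measurements drawn from a larger population
sample = [12, 15, 14, 10, 13, 16, 11, 14, 15, 12,
          13, 14, 16, 15, 13, 12, 14, 13, 15, 14,
          11, 13, 14, 15, 12]

n = len(sample)
mean = statistics.mean(sample)     # point estimate of the population mean
sd = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
se = sd / n ** 0.5                 # standard error of the mean

# Approximate 95% confidence interval using the normal critical value 1.96
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(f"Estimated population mean: {mean:.2f} "
      f"(95% CI: {lower:.2f} to {upper:.2f})")
```

The interval quantifies how far the sample mean is likely to lie from the true population mean, which is exactly the kind of inference described above.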
Applications of statistics:
Actuarial science is the discipline that applies mathematical and statistical methods to
assess risk in the insurance and finance industries.
Astrostatistics is the discipline that applies statistical analysis to the understanding of
astronomical data.
Biostatistics is a branch of biology that studies biological phenomena and
observations by means of statistical analysis, and includes medical statistics.
Business analytics is a rapidly developing business process that applies statistical
methods to data sets (often very large) to develop new insights and understanding of
business performance and opportunities.
Chemometrics is the science of relating measurements made on a chemical system or
process to the state of the system via application of mathematical or statistical
methods.
Demography is the statistical study of all populations. It can be a very general science
that can be applied to any kind of dynamic population, that is, one that changes over
time or space.
Econometrics is a branch of economics that applies statistical methods to the
empirical study of economic theories and relationships.
Environmental statistics is the application of statistical methods to environmental
science. Weather, climate, air and water quality are included, as are studies of plant
and animal populations.
Epidemiology is the study of factors affecting the health and illness of populations,
and serves as the foundation and logic of interventions made in the interest of public
health and preventive medicine.
Geostatistics is a branch of geography that deals with the analysis of data from
disciplines such as petroleum geology, hydrogeology, hydrology, meteorology,
oceanography, geochemistry and geography.
Machine learning is the field of study that uses statistical techniques to give computer
systems the ability to learn from data without being explicitly programmed.
Operations research (or Operational Research) is an interdisciplinary branch of
applied mathematics and formal science that uses methods such as mathematical
modelling, statistics, and algorithms to arrive at optimal or near optimal solutions to
complex problems.
Population ecology is a sub-field of ecology that deals with the dynamics of species
populations and how these populations interact with the environment.
Psychometric is the theory and technique of educational and psychological
measurement of knowledge, abilities, attitudes, and personality traits.
Quality control reviews the factors involved in manufacturing and production; it can
make use of statistical sampling of product items to aid decisions in process control or
in accepting deliveries.
Quantitative psychology is the science of statistically explaining and changing mental
processes and behaviours in humans.
Reliability engineering is the study of the ability of a system or component to perform
its required functions under stated conditions for a specified period of time.
Statistical finance, an area of econophysics, is an empirical attempt to shift finance
from its normative roots to a positivist framework using exemplars from statistical
physics with an emphasis on emergent or collective properties of financial markets.
Q2-
Ans:
a):
1. The classical definition: Let the sample space (denoted by S) be the set of all possible
distinct outcomes of an experiment. If an event A consists of m of the N equally likely
outcomes in S, then the probability of the event A is P(A) = m / N.
2. The relative frequency definition: The probability of an event is the proportion (or
fraction) of times the event occurs in a very long (theoretically infinite) series of repetitions
of an experiment or process. For example, this definition could be used to argue that the
probability of getting a 2 from a rolled die is 1/6.
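The relative frequency definition can be illustrated by simulation: as the number of rolls grows, the observed proportion of 2s settles near the classical value of 1/6. This is a minimal sketch; the seed and roll count are arbitrary choices:

```python
import random

random.seed(42)  # fixed seed so the experiment is reproducible

# Simulate a long series of die rolls and count how often a 2 appears
rolls = 100_000
count_twos = sum(1 for _ in range(rolls) if random.randint(1, 6) == 2)

relative_frequency = count_twos / rolls
print(f"Relative frequency of rolling a 2: {relative_frequency:.4f}")
print(f"Classical probability:             {1 / 6:.4f}")
```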
Ans:
a):
Step 1: State the Null Hypothesis.
The null hypothesis can be thought of as the opposite of the "guess" the
research made (in this example the biologist thinks the plant height will be
different for the fertilizers). So the null would be that there will be no
difference among the groups of plants. Specifically in more statistical
language the null for an ANOVA is that the means are the same. We state
the Null hypothesis as:
H0: μ1 = μ2 = ⋯ = μk
Step 2: State the Alternative Hypothesis.
HA: treatment level means not all equal
The reason we state the alternative hypothesis this way is that if the Null is rejected, there are
many possibilities. For example, μ1 ≠ μ2 = ⋯ = μk is one possibility, as
is μ1 = μ2 ≠ μ3 = ⋯ = μk. Many people make the mistake of stating the
Alternative Hypothesis as μ1 ≠ μ2 ≠ ⋯ ≠ μk, which says that every mean differs
from every other mean. This is a possibility, but only one of many possibilities. To cover all
alternative outcomes, we simply state that the treatment level means are not all equal.
Step 3: Set α (the significance level).
If we look at what can happen in a hypothesis test, we can construct the following
contingency table:

Decision     | In Reality: H0 is TRUE                      | In Reality: H0 is FALSE
Accept H0    | OK                                          | Type II Error (β = probability of Type II Error)
Reject H0    | Type I Error (α = probability of Type I Error) | OK

The critical value is the minimum value of the test statistic (in this case the F test) for us to
be able to reject the null. On the F distribution, the critical value F(α; k − 1, N − k) separates
the acceptance region from the rejection region in the upper tail.
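The critical value that separates the acceptance and rejection regions can be looked up programmatically. This is a sketch using SciPy; the choice of 3 groups and 15 total observations is an assumed example:

```python
from scipy.stats import f

alpha = 0.05        # significance level chosen in Step 3
k, N = 3, 15        # assumed example: 3 treatment groups, 15 total observations

df_between = k - 1  # numerator degrees of freedom
df_within = N - k   # denominator degrees of freedom

# Critical value: minimum F statistic needed to reject H0 at level alpha
f_crit = f.ppf(1 - alpha, df_between, df_within)
print(f"Reject H0 if F > {f_crit:.2f}")
```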
customs of people. More woollen clothes are sold in winter than in summer, and regardless
of the trend we can observe that in each year more ice creams are sold in summer and very
few in winter. Sales in departmental stores are higher during festive seasons than on normal
days.
Cyclical variations: Cyclical variations are recurrent upward or downward movements in a
time series, but the period of the cycle is greater than a year. These variations are also not as
regular as seasonal variations. There are different types of cycles, varying in length and size.
The ups and downs in business activities are the effects of cyclical variation. A business cycle
showing these oscillatory movements has to pass through four phases: prosperity, recession,
depression and recovery. In a business, these four phases are completed by passing from one
to another in this order.
Irregular variation: Irregular variations are fluctuations in a time series that are short in
duration, erratic in nature and follow no regularity in their occurrence pattern. These
variations are also referred to as residual variations, since by definition they represent what is
left in a time series after trend, cyclical and seasonal variations have been accounted for.
Irregular fluctuations result from the occurrence of unforeseen events.
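Separating the trend from these other components is often done with a centered moving average. The following is a minimal sketch in plain Python; the quarterly sales figures are invented for illustration:

```python
def centered_moving_average(series, window):
    """Smooth a time series with a simple centered moving average.

    Positions near the ends, where a full window does not fit, are omitted.
    """
    half = window // 2
    return [
        sum(series[i - half:i + half + 1]) / window
        for i in range(half, len(series) - half)
    ]

# Hypothetical quarterly sales with a rising trend plus seasonal swings
sales = [20, 32, 26, 22, 24, 36, 30, 26, 28, 40]
trend = centered_moving_average(sales, 3)
print(trend)
```

The smoothed values dampen the seasonal swings, leaving an estimate of the underlying trend.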
Q4- a) What is a Chi-square test? Point out its applications. Under what conditions is
this test applicable?
Ans:
a): Chi Square: Statistical method assessing the goodness of fit between a set of observed
values and those expected theoretically.
Application: The Chi-square test for categorical variables determines whether there is a difference in the
population proportions between two or more groups. In the medical literature, the Chi-square
test is used most commonly to compare the incidence (or proportion) of a characteristic in one
group to the incidence (or proportion) of the characteristic in other group(s).
For example, you might use the Chi-square test to compare the incidence of postoperative
nausea and vomiting (PONV) between patients that received ondansetron, patients that
received droperidol, and patients that received a placebo.
This approach consists of four steps:
1. State the hypotheses: Every hypothesis test requires the analyst to state a null
hypothesis (H0) and an alternative hypothesis (Ha). The hypotheses are stated in such
a way that they are mutually exclusive. That is, if one is true, the other must be false;
and vice versa.
2. Formulate an analysis plan: The analysis plan describes how to use sample data to
accept or reject the null hypothesis. The plan should specify the following elements:
Significance level > Often, researchers choose significance levels equal to 0.01, 0.05,
or 0.10; but any value between 0 and 1 can be used.
Test method > Use the chi-square goodness of fit test to determine whether observed
sample frequencies differ significantly from expected frequencies specified in the null
hypothesis.
3. Analyze sample data: Using sample data, find the degrees of freedom, expected
frequency counts, test statistic, and the P-value associated with the test statistic.
Degrees of freedom. The degrees of freedom (DF) is equal to the number of
levels (k) of the categorical variable minus 1: DF = k - 1
Expected frequency counts. The expected frequency counts at each level of the
categorical variable are equal to the sample size times the hypothesized
proportion from the null hypothesis Ei = npi
Where Ei is the expected frequency count for the ith level of the categorical
variable, n is the total sample size, and pi is the hypothesized proportion of
observations in level i.
Test statistic. The test statistic is a chi-square random variable (χ2) defined by
the following equation.
χ2 = Σ [ (Oi − Ei)2 / Ei ]
Where Oi is the observed frequency count for the ith level of the categorical
variable, and Ei is the expected frequency count for the ith level of the
categorical variable.
P-value. The P-value is the probability of observing a sample statistic as
extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test
statistic. Use the degrees of freedom computed above.
4. Interpret results: If the sample findings are unlikely, given the null hypothesis, the
researcher rejects the null hypothesis. Typically, this involves comparing the P-value
to the significance level, and rejecting the null hypothesis when the P-value is less
than the significance level.
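The four steps above can be sketched with SciPy's goodness-of-fit routine. This is a minimal illustration; the die-roll counts are invented for the example:

```python
from scipy.stats import chisquare

# Hypothetical counts from 90 rolls of a die we suspect may be loaded
observed = [16, 18, 16, 14, 12, 14]
expected = [15] * 6  # fair die: equal expected frequency Ei = n * pi per face

# chi2 = sum((Oi - Ei)^2 / Ei); degrees of freedom DF = k - 1 = 5
chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.3f}")

alpha = 0.05
if p_value < alpha:
    print("Reject H0: the die does not appear fair")
else:
    print("Fail to reject H0: no evidence the die is unfair")
```

Note how the interpretation step compares the P-value to the chosen significance level, exactly as described in step 4.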
b): Types of measurement scales:
There are four measurement scales (or types of data): nominal, ordinal, interval and ratio.
These are simply ways to categorize different types of variables
Nominal- Nominal scales could simply be called labels. Scales are mutually exclusive (no
overlap) and none of them have any numerical significance. A good way to remember all of
this is that nominal sounds a lot like name and nominal scales are kind of like names or
labels.
Ordinal- With ordinal scales, it is the order of the values that is important and significant,
but the differences between them are not really known. In each case, we know that a #4 is
better than a #3 or #2, but we don't know, and cannot quantify, how much better it is. For
example, is the difference between OK and Unhappy the same as the difference between
Very Happy and Happy? We can't say. Ordinal scales are typically measures of non-numeric
concepts like satisfaction, happiness, discomfort, etc. Ordinal is easy to remember because it
sounds like order, and that's the key with ordinal scales: it is the order that matters, but that's
all you really get from these.
Interval- Interval scales are numeric scales in which we know not only the order, but also the
exact differences between the values. The classic example of an interval scale
is Celsius temperature because the difference between each value is the same. For example,
the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference
between 80 and 70 degrees. Time is another good example of an interval scale in which
the increments are known, consistent, and measurable. Interval scales are nice because the
realm of statistical analysis on these data sets opens up. For example, central tendency can be
measured by mode, median, or mean; standard deviation can also be calculated.
Ratio- Ratio scales are the ultimate nirvana when it comes to measurement scales, because
they tell us about the order, they tell us the exact value between units, and they also have an
absolute zero, which allows for a wide range of both descriptive and inferential statistics to
be applied. At the risk of repeating myself, everything above about interval data applies to
ratio scales, plus ratio scales have a clear definition of zero. Good examples of ratio variables
include height and weight.
Q5- Business forecasting acquires an important place in every field of the economy.
Explain the objectives and theories of Business forecasting.
Ans: Business forecasting refers to the analysis of past and present economic conditions with
the object of drawing inferences about probable future business conditions. The process of
making definite estimates of the future course of events is referred to as forecasting, and the
figure or statement obtained from the process is known as a forecast; the future course of
events is rarely known.
The Objectives of Business Forecasting:
In the narrow sense, the objective of forecasting is to produce better forecasts. But in the
broader sense, the objective is to improve organizational performance: more revenue, more
profit, increased customer satisfaction. Better forecasts, by themselves, are of no inherent
value if those forecasts are ignored by management or otherwise not used to improve
organizational performance.
Theories of Business Forecasting:
1. Sequence or time-lag theory - This is the most important theory of business forecasting.
It is based on the assumption that most business data have a lag relationship: changes
in business are successive and not simultaneous. There is a time lag between different
movements; for example, expenditure on advertisement may not at once lead to an
increase in sales. Similarly, when the government makes use of deficit financing, it
leads to inflationary pressure: the purchasing power of people goes up, wholesale
prices rise, and then retail prices start increasing. With the rise in retail prices the
cost of living goes up, and with it there is a demand for increased wages. Thus one
factor, more money in circulation, has affected different fields of economic activity
not simultaneously but successively. Similarly, when excise duties are increased by
the government, they result in increases in prices, which in turn lead to higher demand
for wages.
2. Action and reaction theory- This theory is based on two assumptions: every action
has a reaction, and the magnitude of the original action influences the reaction. Thus,
if the price of rice has gone up above a certain level in a certain time period, there is
a likelihood that after some time it will go down below the normal level. According
to this theory, a certain level of business activity is normal; sub-normal or abnormal
conditions cannot remain so forever, as there is bound to be a reaction to them. Hence
we find four phases of a business cycle: Prosperity, Decline, Depression and
Improvement.
3. Economic rhythm theory- The basic assumption of this theory is that history repeats
itself, and hence the exponents of this theory believe that economic phenomena
behave in a rhythmic order. Cycles of nearly similar intensity and duration tend to
recur. Thus, the available historical data have to be analyzed into their component
parts, and the various types of fluctuations influencing them have to be segregated. A
trend is then obtained that represents the long-term tendency of growth or decline.
This trend line is projected a number of years into the future, either by the freehand
technique or by a mathematical technique, on the assumption that the trend line
represents the normal growth or decline of the series.
4. Specific historical analogy- This theory is based on a more realistic assumption: that
all business cycles are not uniform in amplitude or duration, and as such, use of
history is made not by projecting any economic rhythm into the future, but by
selecting some specific previous situation which has many of the earmarks of the
present, and concluding that what happened in the earlier situation will happen in the
present one also. What is done is that a time series relating to the data in question is
thoroughly scrutinized, and from it a period is selected in which conditions were
similar to those prevailing at the time of making the forecast, to get an idea of the
likely course the phenomenon in question would follow. For example, after the World
War many people forecast a depression by analogy with earlier post-war conditions.
5. Cross-section analysis- This theory is based on the knowledge and interpretation of
the current forces rather than the projection of past trends. The theory assumes that no
two cycles are similar, but that like causes always give like results. All the factors
bearing upon a given condition are assembled, and, relying upon knowledge of
economic processes, the forecaster concludes whether the condition is favourable or
not. Immediate recognition is given to the fact that business conditions are shaped by
simultaneous inflationary and deflationary forces. The predominance of inflationary
forces results in booms, whereas the predominance of deflationary forces leads to
depression. The forecaster who utilizes this technique enumerates the inflationary
forces, the stable forces and the deflationary forces, and arrives at a forecast on the
basis of judgment.
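The trend projection described under the economic rhythm theory can be sketched with a least-squares trend line fitted by a mathematical technique. This is a minimal illustration in plain Python; the yearly sales figures are invented:

```python
def fit_trend(y):
    """Fit a least-squares straight line y = a + b*t, with t = 0, 1, 2, ..."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    b = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y)) \
        / sum((t - t_mean) ** 2 for t in range(n))
    a = y_mean - b * t_mean
    return a, b

# Hypothetical yearly sales (in lakhs) showing roughly linear growth
sales = [52, 55, 61, 64, 68]
a, b = fit_trend(sales)

# Project the trend two years beyond the observed data (t = 5 and t = 6)
forecasts = [a + b * t for t in (5, 6)]
print(f"Trend: y = {a:.2f} + {b:.2f} t; forecasts: {forecasts}")
```

As the theory notes, such a projection is only valid on the assumption that the trend line represents the normal growth or decline of the series.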
Q6-
Ans:
a):
Analysis of variance (ANOVA): It is a collection of statistical models used to analyze the
differences among group means and their associated procedures (such as "variation" among
and between groups).
Assumptions for ANOVA:
Each group sample is drawn from a normally distributed population.
All populations have a common (equal) variance.
All samples are drawn independently, and the observations within each sample are
independent of each other.
b): Solution:
Let H0: There is no significant difference in the means of the three samples.

X1        X2        X3
8         7         12
10        5         9
7         10        13
14        9         12
11        9         14
ΣX1 = 50  ΣX2 = 40  ΣX3 = 60

Grand total T = 50 + 40 + 60 = 150; N = 15, so the grand mean is 150/15 = 10.
The sample means are 10, 8 and 12, and each sample has 5 observations, so:
SSC (between samples) = 5[(10 − 10)2 + (8 − 10)2 + (12 − 10)2] = 5(0 + 4 + 4) = 40
TSS (total) = ΣX2 − T2/N = 1600 − 1500 = 100
SSE (within samples) = TSS − SSC = 100 − 40 = 60
ANOVA Table:

Source of Variation | Sum of Squares | df | Mean Square | F-value
Between             | SSC = 40       | 2  | MSC = 20    | Fcal = 20/5 = 4
Within              | SSE = 60       | 12 | MSE = 5     |
Total               | TSS = 100      | 14 |             |
The F table value for degrees of freedom (2, 12) [v1 = 2, v2 = 12] at the 5% level of
significance is 3.88. Since the F table value is smaller than the calculated F value, we reject
the null hypothesis and conclude that the sample means are not all equal.
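The calculation can be cross-checked with SciPy's one-way ANOVA routine, which reproduces Fcal = 4 for these three samples:

```python
from scipy.stats import f_oneway

# The three samples from the problem statement
x1 = [8, 10, 7, 14, 11]
x2 = [7, 5, 10, 9, 9]
x3 = [12, 9, 13, 12, 14]

f_stat, p_value = f_oneway(x1, x2, x3)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")

# Reject H0 at the 5% level when the p-value falls below 0.05
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```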