Business Statistics

Contents
1. Meaning and Scope
2. Collection of Data
3. Classification and Tabulation
4. Diagrammatic and Graphic Representation
5. Averages
6. Dispersion
7. Skewness and Kurtosis
8. Correlation
9. Linear Regression Analysis
10. Index Numbers
11. Time Series Analysis
12. Theory of Probability
13. Random Variable, Probability Distribution and Mathematical Expectation
14. Theoretical Distributions
15. Sampling Theory and Design of Sample Surveys
16. Interpolation and Extrapolation
Quantitative Decision Making
Learning Objectives
Basic statistics and its application in the day-to-day life of a manager
Various aspects of quantitative techniques and their application in decision making, along with frequently used models of statistical analysis
Understand:
Complexity of managerial decisions
Quantitative techniques
Need for using a quantitative approach in decisions
Role of statistical methods in data analysis
Brief idea of various statistical methods
Know the areas of application of the quantitative approach in business and management.
Introduction
Individual businesses before the Industrial Revolution and their need for information: decisions were based on past experience and intuition.
Marketing of products
Test marketing of products
The manager (also the owner)
Progress of work
Any other fact the owner needed to know
Intuition alone has no place in decision making

Becomes highly questionable when decisions involve the choice among several courses of action each of which can achieve several management objectives simultaneously.
Also in:

Statistical methods used in Marketing, Finance, Production and personnel

Regional planning
Transportation
Public health
Communication
Military
Agriculture
QT: a group of statistical and OR (programming) techniques

QT approach in decision making :
Problems are defined, analyzed and solved in a conscious, rational, systematic and scientific manner, based on:
Data, facts, information and logic (not whims and guesses)
QT provides the decision maker with a scientific method, based on quantitative data, for identifying a course of action that achieves the optimal value of a predetermined objective or goal. Numbers, symbols and mathematical formulae are used to represent models of reality.
Statistics and different senses

Statistical Data
Numerical or quantitative aspects
Statistical Methods
Collect, organize /classify, present, analyze and interpret
Functions of Statistical Methods

Data collection
Organization: segregate and condense
Presentation: in an orderly manner, using graphs and charts
Analysis
Interpretation
examples
Statistics: characteristics of data. Information in quantitative form is commonly referred to as data.

Not all numerical data are statistical. For a numerical description to count as statistics, the data must be:
An aggregate of facts
Affected to a marked extent by a multiplicity of causes (controllable or uncontrollable)
Enumerated or estimated according to a reasonable standard of accuracy
Collected in a systematic manner for a pre-determined purpose
Placed in relation to each other
Numerically expressed
Types of Statistical data

Secondary Primary
OR: uses a mathematical model to represent the situation under study.

Helps to:
Either predict the performance of a system
Or determine the action or control needed to optimize that performance
Classification of Statistical Methods into three categories

Descriptive Statistics
Data collection
Presentation
Inductive statistics
Statistical inference
Estimation
Statistical decision Theory

Analysis of business Decision
Descriptive statistics: used for re-arranging, grouping and summarizing large quantities of numerical data for easy understanding
Examples: changes in a price index or the yield of wheat, shown using different charts and graphs
Includes various types of averages (central tendency), dispersion, trends and index numbers.
Inductive Statistics
The development of criteria that can be used to derive information about the nature of an entire population (or universe) from the nature of a small sample.
Includes:
Probability, probability distributions, sampling and sampling distributions, various methods of testing hypotheses, correlation, regression, factor analysis and time series analysis.
Statistical Decision Theory; 4 different states of decision environment

State of decision and Consequence
Certainty: deterministic
Risk: probabilistic
Uncertainty: unknown
Conflict: influenced by an opponent
Subjective approach (uses probabilities), also known as the Bayesian approach
Models in OR
Based on Purpose:
Descriptive: describe the behavior of a system (behavior of demand for an inventory item)
Explanatory: explain behavior through relationships (wages, promotion policy)
Predictive: predict outcomes (stock prices for any given level of earnings per share)
Prescriptive (normative): provide norms for comparing alternative solutions (allocation)
Based on Degree of Abstraction

Physical, Graphic, Schematic, Analog, Mathematical
Based on Degree of certainty, and risk

Deterministic: linear programming, transportation and assignment models
Probabilistic: simulation models, decision theory
Based on Specified behavior characteristics

Static, Dynamic, Linear, Non-linear
Based on Procedure (method) of solution

Analytical, Simulation
Classification of models helps in understanding the nature and role of models

Abstract or Physical
Static (linear programming) or dynamic
Linear or non-linear
Stable or unstable: unstable (constrained), unstable (explosive)
Transient: steady state, or non-existent
Various Statistical Techniques

Measure of Central tendency
Measure of Dispersion
Correlation
Regression analysis:
Time Series Analysis
Index Numbers
Sampling and Statistical Inference
Measure of Central tendency

Mean:
The common arithmetic average
Divide the sum of the values of the observations by the number of items observed.
Median:
The middle item when the observations are arranged in ascending or descending order. It is not affected by the magnitude of extreme observations.
Divides the observations into two equal parts (for example, 50% of all households have income below the median income).
Mode:
The category that has the maximum number of observations (the value that occurs most frequently)
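As a small illustration of the three measures above, the sketch below uses Python's standard statistics module on a made-up list of household incomes (the figures are hypothetical, not taken from the course material). It is meant only to show how one extreme value pulls the mean upward while leaving the median and mode unchanged.

```python
from statistics import mean, median, mode

# Hypothetical monthly household incomes (in thousands); illustrative only.
incomes = [12, 15, 15, 18, 20, 22, 95]

average = mean(incomes)      # sum of the values divided by the number of items
middle = median(incomes)     # middle item once the values are sorted
most_common = mode(incomes)  # value that occurs most often

print(f"mean = {average:.2f}, median = {middle}, mode = {most_common}")
# prints: mean = 28.14, median = 18, mode = 15
```

Dropping the value 95 from the list changes the mean considerably, which is why the median is often preferred for income data.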
Measure of Dispersion:
Spread of the data away from the central tendency (mean, median or mode):
Range, mean deviation, standard deviation.
Whether the data spread in a symmetrical or asymmetrical pattern: skewness
Peakedness of the frequency distribution: kurtosis
Correlation
Changes in a dependent variable are associated with changes in one or more independent variables.
Example: sales as the dependent variable and advertising budget as the independent variable.
The relationship could be casual (coincidental) or causal.
Regression analysis: determining the causal relationship between two variables

Use of multivariate statistical techniques for determining causal relationships involving two or more variables:
Multiple regression analysis, discriminant analysis, factor analysis
Time Series Analysis

A set of data (arranged in some desired manner) recorded either at successive points in time or over successive periods of time. The changes are considered the resultant of the combined effect of a force.
The force components:
Editing time series data
Secular trend
Periodic changes (cyclical and seasonal variations)
Irregular or random variations
Examples: cost of living, growth of agricultural and food production, seasonal requirements of items, impact of war, strikes
Index Numbers: a relative number representing the net result of change in a group of variables
Stated as a percentage relating the given (current) year to a base year
Examples: production, sales price, volume of employment
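One simple way to turn the idea of an index number into arithmetic is the simple aggregate index: the current-year total expressed as a percentage of the base-year total. The sketch below assumes that form (the slides do not say which index formula is intended), and the function name and prices are hypothetical.

```python
def simple_aggregate_index(current_values, base_values):
    """Current-year total as a percentage of the base-year total."""
    base_total = sum(base_values)
    if base_total == 0:
        raise ValueError("base-year total must be non-zero")
    return sum(current_values) * 100 / base_total

# Hypothetical prices of three commodities (illustrative only).
base_year_prices = [20, 50, 30]
current_year_prices = [25, 55, 40]

price_index = simple_aggregate_index(current_year_prices, base_year_prices)
print(price_index)  # 120.0, i.e. prices are 20% above the base year
```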
Sampling and Statistical Inference

Reasons for sampling
Schemes for drawing samples are classified as:
Random Sampling Schemes
Every element has an equal chance (probability) of being selected
Non-random sampling schemes

Drawing samples based on choice or purpose of selectors
Sampling analysis using various tests: Z (normal distribution), Student's t distribution, F distribution, chi-square (χ²) distribution
Advantages to Management
Definiteness
Condensation
Comparison
Formulation of policies
Formulating and testing hypotheses
Prediction
Application of techniques in Business and Management

Management
Marketing
Production
Finance, accounting and investment
Personnel
Economics
Research and development
Natural science
Marketing
Marketing research information
Building and maintaining an extensive market
Sales forecasting
Production
Production planning, control (PPC) and analysis
Machine performance evaluation
Quality control (QC)
Inventory control
Finance, accounting and Investments

Financial forecasts and budget preparation
Financial investment decisions
Selection of securities
Auditing function
Credit policies, credit risk, delinquent accounts
Personnel
Labour turnover rate
Employment trends
Performance appraisal
Wage rates and incentive plans
Economics
Measurement of Gross National Product and input-output analysis
Determination of business cycles and seasonal fluctuations
Comparison of market price, cost and profit of individual firms
Analysis of population
Operational studies of public utilities
Formulation of appropriate economic policies and evaluation of their effects
Research and Development

Development of new product lines
Optimal use of resources
Evaluation of existing products
Natural science
Diagnosis based on inputs
Efficacy of certain drugs
Study of plant life
Exercise/ Assignments
1. Comment on the statement: "Statistics are numerical statements of facts, but all facts numerically stated are not statistics."
2. Explain the distinction between descriptive and prescriptive models.
1. Presentation topic:
1. Formulate a business problem and analyze it by applying the major phases of statistics
Functions and Progressions
Learning Objectives:
Insight into the different types of functional relationships among business variables
Their applications in various fields of management
Need to identify and define relationships among business variables
Define functional relationships
Various types of functional relationships
Use of graphs to depict functional relationships
Managerial applicability
Progressions and their applications
Introduction
For decision problems which use mathematical tools, the first requirement is to identify or formally define all significant interactions or relationships among primary factors (also called variables). The relationships usually are stated in the form of an equation or inequation.
Study mathematical problems in the context of managerial problems
Definitions
Variables: A variable is something whose magnitude can vary or which can assume various values. Variables are represented by symbols (often the first letter of the name).
Discrete variable: subject to counting (houses, machines)
Continuous variable: subject to measurement (temperature, height)
Constants and Parameters:

A constant remains fixed in the context of a given problem or situation.
An absolute (or numerical) constant retains the same value in all problems.
The absolute (or numerical) value of b is denoted by |b|, regardless of its algebraic sign: |b| = |-b|.
An arbitrary (or parametric) constant, or parameter, retains the same value throughout any particular problem but may assume different values in different problems.
P21 (ex1)
Types of Function
Linear Functions:
The power of the independent variable is 1.
A function with only one independent variable is called a single-variable function. (P21 (1))
A single-variable function can be linear or non-linear.

(p22)
A linear function of one variable can always be graphed in a two-dimensional plane (or space). The graph of such a function is always a straight line. (P22 ex2)
Polynomial functions:
A polynomial function of degree 1 is called a linear function.
A polynomial function of degree 2 is called a quadratic function. (p23-ab)
Absolute value functions (p23 (3))
Inverse function (P23)
Step function: for different values of the independent variable x within an interval, the dependent variable y = f(x) takes a constant value, but it takes different values in different intervals. (p24-5)
Algebraic and transcendental functions
Activity
P 25 activity B -1a&b assignment
Business Application
Linear function (P27-ex3, assignment)
Quadratic function (P27-ex4, assignment)
Activity D (Page 28-b, assignment)
Sequence and Series

If for every positive integer,n, --------related to some number-----sequence
Installment buying, simple and compound interest problems Annuities and present values Mortgage payments
Arithmetic progression (AP)

Arithmetic progression: a sequence whose terms increase or decrease by a constant number, called the common difference of the AP and denoted by d
P29 ex6 assignment
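The AP definition can be checked numerically with the standard formulas for the n-th term, a + (n - 1)d, and for the sum of the first n terms, n(2a + (n - 1)d)/2. These formulas are the usual textbook ones and are not quoted from the referenced pages; the function names and the instalment figures below are hypothetical.

```python
def ap_term(a, d, n):
    """n-th term of an AP with first term a and common difference d."""
    return a + (n - 1) * d

def ap_sum(a, d, n):
    """Sum of the first n terms of the same AP."""
    return n * (2 * a + (n - 1) * d) / 2

# Hypothetical instalment plan: first payment 100, each later payment 10 higher.
twelfth_payment = ap_term(a=100, d=10, n=12)
total_paid = ap_sum(a=100, d=10, n=12)

print(twelfth_payment, total_paid)  # 210 1860.0
```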
Geometric progression (GP)

A geometric progression: a sequence whose terms increase or decrease by a constant ratio, called the common ratio of the GP and denoted by r
P29 ex7 assignment P31 ex 8
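Compound interest is a natural instance of a GP: the balance at the start of each year is the previous balance multiplied by a constant ratio. The sketch below writes that ratio as r and uses the standard formulas for the n-th term, a * r^(n-1), and for the sum of n terms, a(r^n - 1)/(r - 1); the function names, principal and interest rate are hypothetical.

```python
def gp_term(a, r, n):
    """n-th term of a GP with first term a and common ratio r."""
    return a * r ** (n - 1)

def gp_sum(a, r, n):
    """Sum of the first n terms of the same GP."""
    if r == 1:
        return a * n  # every term equals a
    return a * (r ** n - 1) / (r - 1)

# Hypothetical deposit of 1000 at 10% compound interest per year.
principal = 1000.0
ratio = 1 + 0.10

# The balance at the start of year 4 is the 4th term of the GP.
balance_year_4 = gp_term(principal, ratio, 4)
print(round(balance_year_4, 2))  # 1331.0
```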
Concept of Maxima and Minima with managerial applications

Page 55 ex18 assignment
Data Collection and analysis
Contents
Collection of data:
Need for and significance of data collection
Primary and secondary data
Different methods of collecting primary data
Editing primary data; sources of secondary data and their use
Census versus sample
Classification and presentation of collected data
Treatment of data through measures of central tendency, deviations and different measures of variation.
Introduction
The need for data collection
Statistical data is a set of facts expressed in quantitative form. The use of facts expressed as measurable quantities can help a decision maker to arrive at better decisions.
Primary and Secondary Data

Distinguish between primary and secondary data
Methods of collecting Primary Data

Observation
Questionnaire
Personal interview
Mail
Telephone
Designing and preparing a questionnaire
Pre-testing a questionnaire
Editing the primary data
Important points in Designing a questionnaire

Covering letter
Number of questions kept to a minimum (15-40)
Questions should be simple, short and unambiguous
Questions of a sensitive or personal nature should be avoided
Answers should not require calculations
Logical arrangement of questions
Cross-checks and footnotes
Editing Primary Data to ensure:

completeness Consistency Accuracy Homogeneity
Sources of secondary data

Published Sources Unpublished Sources
Precautions in use of secondary Data

Secondary data may suffer from bias, inadequate sample size, errors of definition and computational errors. Hence consider their:
Suitability
Reliability
Adequacy
Census (complete enumeration) and Sample

Advantages and disadvantages of census (Physical destruction)
Exercises/Assignments
1. Distinguish between primary and secondary data. Indicate the situations in which each of these----?
2. Distinguish between census and sampling methods of data collection. Compare their merits and demerits. Why is sampling unavoidable in certain situations?
Presentation of Data
Learning objectives
Understand the need for and significance of presentation of data
Necessity of classifying data and the various types of classification
Construct frequency distributions of discrete and continuous data
Present frequency distributions in the form of bar diagrams, histograms, frequency polygons and ogives
Classification
Discrete frequency distribution
Continuous frequency distribution
Choosing the classes
Cumulative and relative frequencies
Charting data
Introduction
After understanding the various ways of collecting data:
The successful use of the data collected depends on:
The manner in which it is arranged, displayed and summarized.
Data can be presented either in tabular form or through charts.

In tabular form, the data must be classified before it is tabulated. Hence the need to understand classification, tabulation and charting of data.
Classification of data
After the data has been systematically collected and edited, the first step in presenting it is classification.
Classification is the process of arranging the data according to points of similarity and dissimilarity.
Principal objectives of classification

To condense the mass of data in such a way that salient features can be easily noticed
To facilitate comparisons between attributes of variables
To prepare data to be presented in tabular form
To highlight significant features of the data at a glance
Some Common Types of Classification

Geographical Classification
Production of wheat state-wise
Chronological Classification
Sales figures of a company for last six years
Qualitative Classification
Dichotomous Classification
An attribute is divided into two classes, one possessing it and the other not possessing it (for example, on the basis of employment)
Manifold Classification: an attribute is divided into several classes (educational level)
Quantitative Classification: according to characteristics that can be measured (employees by monthly salary)
Discrete: limited to certain numerical values of a variable
Continuous: can take all values of the variable
Examples
Chronological classification Discrete frequency distribution Continuous frequency distribution
P14,15
Construction of a Discrete Frequency distribution

Place all possible values of the variable in ascending order in one column.
Then prepare another column of tally marks to count the number of times each value of the variable is repeated.
To facilitate counting, use blocks of 5 tally marks with a space left between blocks.
The frequency column records the number of tally marks for each value.
p15
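The tallying procedure above can be sketched in a few lines of Python. The data set (number of children in 20 households) and the helper name tally_marks are hypothetical and serve only to illustrate the steps: list the values in ascending order, tally in blocks of five, then record the frequency.

```python
from collections import Counter

def tally_marks(count):
    """Tally marks in blocks of five, with a space between blocks."""
    blocks, rest = divmod(count, 5)
    groups = ["|||||"] * blocks
    if rest:
        groups.append("|" * rest)
    return " ".join(groups)

# Hypothetical data: number of children in 20 households.
children = [0, 2, 1, 3, 2, 1, 0, 2, 4, 1, 2, 3, 1, 2, 0, 1, 2, 3, 2, 1]

counts = Counter(children)
# One row per distinct value, in ascending order of the variable.
frequency_table = [(value, counts[value]) for value in sorted(counts)]

for value, frequency in frequency_table:
    print(f"{value:>5}  {tally_marks(frequency):<10} {frequency}")
```

The frequencies in the last column add up to the number of observations (20 here), which is a quick check that no value was missed.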
Construction of a Continuous Frequency distribution

Class limits: the lowest and highest values of a class, for example 60-69 (lower and upper limits)
Class interval: the width, span or size of a class, for example 20 - 10 = 10
Class frequency: the number of observations falling within a particular class is called the class frequency, or simply the frequency. The total frequency (the sum of all frequencies) indicates the total number of observations in a given frequency distribution.
Class mid-point: the sum of two successive lower limits divided by 2.
Assignments
1. What do you understand by classification of data?
2. Why is classification of data required?
3. Illustrate the difference between qualitative and quantitative data.
Types of class interval: Methods

Exclusive and inclusive (depending on whether the upper limit is excluded or included) (p16)
Open end (p17)
The exclusive method is generally preferred. If the inclusive method is used, a minor adjustment is required to determine the class interval.
Correction factor: (lower limit of the second class - upper limit of the first class) / 2. Deduct the correction factor from each lower limit and add it to each upper limit.
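The correction-factor adjustment can be illustrated with a short sketch. The inclusive classes below are hypothetical; the code only applies the rule stated above to turn them into continuous (exclusive-style) class boundaries.

```python
# Hypothetical inclusive classes: 10-19, 20-29, 30-39.
inclusive_classes = [(10, 19), (20, 29), (30, 39)]

# (lower limit of second class - upper limit of first class) / 2
correction = (inclusive_classes[1][0] - inclusive_classes[0][1]) / 2

# Deduct the correction from every lower limit and add it to every upper limit.
adjusted_classes = [
    (lower - correction, upper + correction)
    for lower, upper in inclusive_classes
]

print(correction)        # 0.5
print(adjusted_classes)  # [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
```

After the adjustment the upper boundary of each class coincides with the lower boundary of the next, and every class has a width of 10.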
Guidelines for choosing the class

The number of classes should be neither too small nor too large (5 to 15).
If possible, class widths should be numerically simple, like 5, 10 or 25 (values like 3, 7, 9 should be avoided).
It is desirable to have classes of equal width (classes with unequal intervals can be formed, as in income distributions).
The starting point of a class should be 0, 5, 10 or a multiple thereof (for example, 3-13 is not allowed).
The class interval should be determined by considering the minimum and maximum values and the number of classes to be formed.
(p18)
Activity
Distinguish between:
1. Discrete and continuous frequency distribution
2. Class limits and class intervals
3. Inclusive and exclusive methods
Cumulative and Relative frequencies

Rather than listing the actual frequency opposite each class, it may be appropriate to list cumulative frequencies, relative frequencies, or both.
Cumulative frequencies: cumulate the frequencies, starting from either the lowest or the highest value. (p18-19)
Relative frequencies: very often the frequencies in a frequency distribution are converted to relative frequencies to show the percentage for each class. The frequency of a class is divided by the total number of observations (the total frequency). To get the percentage for each class, multiply the relative frequency by 100. (p19)
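Both conversions are simple arithmetic, as the sketch below shows. The class frequencies are hypothetical, and the cumulation starts from the lowest class.

```python
from itertools import accumulate

# Hypothetical class frequencies, lowest class first.
frequencies = [5, 8, 12, 10, 5]
total = sum(frequencies)

# Running total of the frequencies, starting from the lowest class.
cumulative = list(accumulate(frequencies))

# Each class frequency divided by the total number of observations.
relative = [f / total for f in frequencies]

# Percentage for each class (relative frequency times 100).
percentages = [f * 100 / total for f in frequencies]

print(cumulative)   # [5, 13, 25, 35, 40]
print(percentages)  # [12.5, 20.0, 30.0, 25.0, 12.5]
```

The last cumulative frequency always equals the total frequency, and the percentages add up to 100.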
Important advantages in looking at Relative frequencies (percentages)

1. Facilitates a comparison of two or more sets of data.
2. Constitutes the basis for understanding the concept of probability.
Activity
Explain the concept of relative frequency
Charting of Data
Popular Methods of Charting frequency distribution

Bar Diagram Histogram Frequency Polygon Ogive or Cumulative frequency curve
Bar diagram
The most popular form of chart
Examples: population, per capita income, sales and profits
A bar is a thick line whose width is shown only to attract the viewer.
A bar diagram may be either vertical or horizontal.
DRAWING A BAR DIAGRAM:
Take the characteristic (or attribute) under consideration on the X-axis and the corresponding value on the Y-axis.
It is desirable to mention the value depicted by each bar at the top of the bar.
The gap between one bar and the next is kept equal, and the bars all have the same width. The only difference is in the length of the bars.
That is why this type of diagram is known as one-dimensional. (P20)
Histograms
One of the most commonly used and easily understood methods of graphic representation of a frequency distribution.
A histogram is a series of rectangles whose areas are in the same proportion as the frequencies of a frequency distribution.
CONSTRUCTING HISTOGRAM:
On the horizontal axis (X-axis) we take the class limits of the variable, and on the vertical axis (Y-axis) we take the frequencies of the class intervals shown on the horizontal axis.
If the class intervals are of equal width, the vertical bars are of equal width. (P20-21)
If the class intervals are unequal, the frequencies have to be adjusted according to the width of each class interval. (P21-22)
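For unequal class intervals, one common way to make the adjustment is to scale each frequency to the width of the narrowest class, so that the area of every rectangle stays proportional to its frequency. The sketch below assumes that convention (the referenced pages may use a different base width); the function name and data are hypothetical.

```python
def adjusted_frequencies(classes, frequencies):
    """Rectangle heights scaled to the narrowest class width, so that
    height * width stays proportional to the original frequency."""
    widths = [upper - lower for lower, upper in classes]
    base_width = min(widths)
    return [f * base_width / w for f, w in zip(frequencies, widths)]

# Hypothetical distribution with unequal class intervals.
classes = [(0, 10), (10, 20), (20, 40), (40, 80)]
frequencies = [8, 12, 10, 8]

heights = adjusted_frequencies(classes, frequencies)
print(heights)  # [8.0, 12.0, 5.0, 2.0]
```

The class 40-80 is four times as wide as the base class, so its frequency of 8 is drawn at a height of 2.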
Activity
Draw a sketch of a histogram and a bar diagram and explain the difference between the two.
Frequency Polygon
A graphical presentation of a frequency distribution. A polygon is a many-sided closed figure.
A frequency polygon is constructed by:
taking the mid-points of the upper horizontal side of each rectangle of the histogram and connecting these mid-points by straight lines. To close the polygon, an additional class with zero frequency is assumed at each end. (p22-23)
The histogram is usually associated with discrete data, and a frequency polygon is appropriate for continuous data (but the distinction is not always followed). The frequency polygon and frequency curve have a special advantage over the histogram when comparing two or more frequency distributions.
Activity
What is the procedure for making a frequency polygon? Illustrate.
Ogives or Cumulative frequency Curve

A graphical presentation of a cumulative frequency distribution .
There are two methods:
Less than ogive:
The upper limits of the various classes are taken on the X-axis, and the frequencies obtained by cumulating the preceding frequencies on the Y-axis. Joining these points gives the less-than ogive.
More than ogive:

The lower limits are taken on the X-axis and the cumulative frequencies on the Y-axis. Joining these points gives the more-than ogive.
The less-than ogive is a rising curve, whereas the more-than ogive is a falling one.
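The points to be plotted for the two ogives can be computed as follows. The classes and frequencies are hypothetical; the sketch only produces the (X, Y) pairs and leaves the actual plotting to any charting tool.

```python
from itertools import accumulate

# Hypothetical continuous frequency distribution.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [5, 8, 12, 10, 5]

upper_limits = [upper for _, upper in classes]
lower_limits = [lower for lower, _ in classes]

# Less-than ogive: cumulate from the lowest class, plot against upper limits.
less_than = list(zip(upper_limits, accumulate(frequencies)))

# More-than ogive: cumulate from the highest class, plot against lower limits.
more_than_cum = list(accumulate(reversed(frequencies)))[::-1]
more_than = list(zip(lower_limits, more_than_cum))

print(less_than)  # [(10, 5), (20, 13), (30, 25), (40, 35), (50, 40)]
print(more_than)  # [(0, 40), (10, 35), (20, 27), (30, 15), (40, 5)]
```

The Y values rise along the less-than points and fall along the more-than points, matching the shapes described above.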
Activity
With the help of an example , explain the concept of less than ogive and more than ogive.
Types of Data
Data refers to known facts or things used as a basis for inference or reckoning.
Types of Data:
Qualitative: concerned with qualities and non-numerical characteristics.
Quantitative: concerned with numerical characteristics.
Discrete: takes only one of a range of distinct values (number of employees).
Continuous: takes any value within a given range (time, length). (P160-161BR)
The Concept of Level of Measurements

Scales of Measurement
Nominal level (classificatory/named) data
Ordinal level (ranking/ordered) data
Interval level (numerical) data
Ratio level (numerical) data: represents the highest level of precision
Nominal level (classificatory/named) data and implications for data handling methodologies
Classification of data: statements of equality or difference (for example, according to the variable occupation)
Although the mode can be used, very few statistics can be applied to data collected in this form.
Ordinal level (ranking/ordered) data and implications for data handling methodologies
Can be classified in terms of equality or difference
Permits you to order individual data and make decisions such as "this score is greater or lesser than another" (employee grades or ranked choices)
Since the arithmetic mean cannot be calculated, the use of many other statistics is also excluded.
Interval level (numerical) data and implications for data handling methodologies
Has the characteristics of both nominal and ordinal scales, but also provides additional information about the degree of difference between individual data items within a set or group.
Most measures of human characteristics have interval properties (intervals between IQ scores or assignment marks).
However, precision in an interval scale is limited. Some statistics, such as the geometric mean, are excluded from use with data collected in this form.
Ratio level (numerical) data, representing the highest level of precision, and implications for data handling methodologies
A mathematical number system (height, weight, time)
A ratio scale allows ratio as well as interval decisions (allowing us to say that something is so many times as big, bright or heavy as another).
Any statistics can be used on data collected in this form.
(Some scales, such as temperature in Centigrade, may appear to have ratio properties but are in fact only interval scales.)
Parametric and non-parametric methods (assumptions about parameters of the data)

Every data-analytic method has a set of assumptions that underlie its use. The t-test (used to compare the means of two samples of data) is one of the most popular parametric methods. (p133-RM)
Non-parametric methods:
Developed with research in the social sciences in mind
Valid for use with nominal or ordinal level data
Suitable for very small samples (n less than 10), though the power of any test weakens with very small samples
Measures of central Tendency

Learning objectives:
Concept and significance of measures of central tendency
Computing the arithmetic mean, weighted arithmetic mean, median, mode, geometric mean and harmonic mean
Computing several quantiles: quartiles, deciles and percentiles
Relationships among the various averages
Significance of measure of central tendency

The objective is to find one representative value that can be used to locate and summarize the entire set of varying values: a central value around which the data tend to cluster.
Examples: average income; an average sales figure that may be compared with another
Properties of a Good measure of central tendency

Easy to understand
Simple to compute
Based on all observations
Uniquely defined
Capable of further algebraic treatment
Not unduly affected by extreme values
Important measures of central tendency commonly used by Business and Industry.

Arithmetic mean, weighted arithmetic mean, median, quantiles, mode, geometric mean, harmonic mean.
Arithmetic Mean (or Mean or Average)

In statistics, the term average refers to any measure of central tendency.
The arithmetic mean is defined as the sum of the numerical values of each and every observation divided by the total number of observations. Example: average monthly salary (ungrouped data).
When observations are classified into a frequency distribution, the mid-point of each class interval is treated as the representative value of that class.
(P-31 .)
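For grouped data, the mid-point rule above leads to the usual calculation: multiply each class mid-point by its frequency, add the products, and divide by the total frequency. The distribution below is hypothetical and only illustrates the arithmetic.

```python
# Hypothetical continuous frequency distribution.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [5, 8, 12, 10, 5]

# The mid-point of each class stands in for every observation in that class.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

weighted_total = sum(f * m for f, m in zip(frequencies, midpoints))
grouped_mean = weighted_total / sum(frequencies)

print(grouped_mean)  # 25.5
```

Because the mid-points only approximate the actual observations, the result is an estimate of the mean of the raw data.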
Mathematical properties of Arithmetic mean

The sum of the deviations of the observations from the AM is always zero.
The sum of the squared deviations of the observations from the mean is a minimum.
The arithmetic means of several sets of data may be combined into a single AM for the combined sets of data.
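These three properties can be verified numerically. The sketch below uses hypothetical observations and group sizes; the combined mean is computed with the standard formula (n1 * mean1 + n2 * mean2) / (n1 + n2), which is the usual way of combining two means and is assumed here rather than quoted from the slides.

```python
from statistics import mean

data = [4, 8, 15, 16, 23, 42]  # hypothetical observations
m = mean(data)                 # 108 / 6 = 18

# Property 1: deviations from the mean sum to zero.
sum_of_deviations = sum(x - m for x in data)

# Property 2: squared deviations are smallest when taken about the mean.
def sum_squared_deviations(values, centre):
    return sum((x - centre) ** 2 for x in values)

about_mean = sum_squared_deviations(data, m)
about_other = sum_squared_deviations(data, m + 1)  # any other centre is worse

# Property 3: combining the means of two groups of different sizes.
def combined_mean(n1, mean1, n2, mean2):
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

print(sum_of_deviations)              # 0
print(about_mean < about_other)       # True
print(combined_mean(10, 50, 30, 70))  # 65.0
```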
AM
Advantages:
Easily computed
Readily understood
Possesses almost all the properties of a good measure of central tendency
Disadvantages
Distorted by extreme values
In an open-end distribution, a mid-point value has to be assumed for the open class
Weighted Arithmetic mean

The arithmetic mean gives equal importance (or weight) to each observation. In some cases the observations do not all have the same importance.
The weighted arithmetic mean is useful in problems relating to the construction of index numbers.

P33,34
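A weighted arithmetic mean multiplies each observation by its weight, adds the products and divides by the sum of the weights. The sketch below assumes this standard definition; the function name, scores and weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical scores on three criteria with unequal importance.
scores = [80, 70, 90]
weights = [3, 2, 5]

print(weighted_mean(scores, weights))  # 83.0
print(sum(scores) / len(scores))       # 80.0 (simple mean, equal weights)
```

The weighted result is higher than the simple mean because the largest score carries the largest weight.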
Median
Divides the distribution into two equal parts: 50% of the observations in the distribution are above the value of the median and 50% are below it.
The median is the value of the middle observation when the series is arranged in ascending or descending order.
P34,35
Mathematical Property of Median

The sum of absolute deviations about the median is a minimum.
Easy to determine and easy to explain
Affected by the number of observations and not by their values, hence less distorted as a representative value than the AM
May be computed for an open-end distribution
Disadvantages:
Less familiar than the AM
As a positional average, its value is not determined by each and every observation.
Not capable of algebraic treatment
Quantiles
Related positional measures of central tendency. The most familiar quantiles are:
Quartiles:
Values that divide the total data into 4 equal parts. Since 3 points divide the distribution into 4 equal parts, we have 3 quartiles: Q1 (25% of observations are smaller and 75% are larger), Q2 and Q3.
Deciles:
Values that divide the total data into 10 equal parts. Since 9 points divide the distribution into 10 equal parts, we have 9 deciles, denoted D1, D2, ..., D9.
Percentiles:
Values that divide the total data into 100 equal parts. Since 99 points divide the distribution into 100 equal parts, we have 99 percentiles, denoted P1, P2, ..., P99. (P36,37)
Locating Quantiles graphically:

To locate the median graphically, draw a less-than ogive (cumulative frequency curve).
Take the variable on the X-axis and the cumulative frequency on the Y-axis.
Locate the N/2-th observation on the Y-axis.
From that point, draw a horizontal line to the cumulative frequency curve.
From where it meets the curve, draw a perpendicular to the X-axis.
The point where the perpendicular meets the X-axis is the median value.
The values of Q1, ..., D1, ..., P1, ... etc. can be found in the same way.
p38
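The graphical procedure has a numerical counterpart: if the less-than ogive is drawn with straight line segments, reading across from a height on the Y-axis and dropping to the X-axis is the same as linear interpolation within the class that contains that height. The sketch below assumes this straight-line ogive; the function name and the distribution are hypothetical.

```python
def grouped_quantile(classes, frequencies, p):
    """Value below which a proportion p of the observations lies,
    found by linear interpolation along a straight-line less-than ogive."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    target = p * sum(frequencies)  # height on the Y-axis, e.g. N/2 for the median
    cumulative = 0
    for (lower, upper), freq in zip(classes, frequencies):
        if freq > 0 and cumulative + freq >= target:
            # Move along this class in proportion to how far into it the target falls.
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("frequencies must contain at least one positive value")

# Hypothetical continuous frequency distribution (N = 40).
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [5, 8, 12, 10, 5]

median_value = grouped_quantile(classes, frequencies, 0.50)
q1 = grouped_quantile(classes, frequencies, 0.25)
q3 = grouped_quantile(classes, frequencies, 0.75)

print(round(median_value, 2), q1, q3)  # 25.83 16.25 35.0
```

Deciles and percentiles follow from the same function with p = 0.1, 0.2, ... or p = 0.01, 0.02, ... respectively.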
MODE
Most commonly observed value in a set of data----P39
Locating the mode graphically

Construct a histogram p40
Relationship among Mean, Median and Mode

A distribution in which the mean, median and mode coincide is known as a symmetrical (bell-shaped) distribution.
If a distribution is skewed (not symmetrical), the mean, median and mode are not equal.
In a moderately skewed distribution, the distance between the mean and the median is approximately one third of the distance between the mean and the mode:
Mode = 3 Median - 2 Mean (p41)
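The formula follows directly from the one-third rule stated above. A short derivation, written out step by step (the relationship is empirical, so each line is an approximation rather than an identity):

```latex
\begin{align*}
\text{Mean} - \text{Median} &\approx \tfrac{1}{3}\,(\text{Mean} - \text{Mode}) \\
3\,\text{Mean} - 3\,\text{Median} &\approx \text{Mean} - \text{Mode} \\
\text{Mode} &\approx 3\,\text{Median} - 2\,\text{Mean}
\end{align*}
```

For instance, with a hypothetical mean of 50 and median of 48, the rule gives Mode = 3(48) - 2(50) = 44.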
Geometric Mean
The geometric mean, like the arithmetic mean, is a calculated average.
Very useful in averaging ratios and percentages, and in determining rates of increase or decrease
Capable of further algebraic treatment
The GM is more difficult to compute and interpret.
It cannot be computed if any observation is zero or negative.
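A typical use of the geometric mean is averaging growth rates. The sketch below uses the standard definition (the n-th root of the product of n positive values), which the slides do not spell out; the function name and the yearly growth figures are hypothetical.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.prod(values) ** (1 / len(values))

# Hypothetical yearly growth of 10%, 20% and 30%, expressed as growth factors.
growth_factors = [1.10, 1.20, 1.30]

average_factor = geometric_mean(growth_factors)
average_growth_percent = (average_factor - 1) * 100

print(f"average yearly growth: {average_growth_percent:.2f}%")
```

Applying the average factor three times in a row reproduces the overall three-year growth, which the arithmetic mean of the rates (20%) would slightly overstate.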
Harmonic Mean
A measure of central tendency for data expressed as rates (km/hr, tonnes/day, km/litre)
Defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations
The harmonic mean, like the arithmetic mean and geometric mean, is computed from each and every observation. It is specially used for averaging rates.
It cannot be computed when one or more observations are zero, or when there are both positive and negative observations. It is rarely used in dealing with business problems.
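The definition above translates directly into code. As a simplification, the sketch below accepts only strictly positive values, which is stricter than the restriction stated in the slides; the function name and the speeds are hypothetical.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("this sketch accepts only strictly positive values")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: the same distance covered at 40 km/hr and back at 60 km/hr.
speeds = [40, 60]

average_speed = harmonic_mean(speeds)
print(round(average_speed, 2))  # 48.0, not the arithmetic mean of 50
```

The harmonic mean is the right average here because more time is spent travelling at the slower speed.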
Measures of variation and skewness

Having understood the various measures used to provide a single representative value for a given set of data, we know that this single value alone cannot adequately describe the data. Hence we need to study two further important characteristics of a distribution:
Variation
Skewness
Measure of Variation( Dispersion)

A measure of variation (dispersion) describes the spread or scattering of the individual values around the central value.
Illustration (p47)
Significance of Measuring variation

1. Determines the reliability of an average by indicating how far the average is representative of the entire data.
2. Determines the nature and cause of variation in order to control the variation itself.
3. Enables comparison of two or more distributions with regard to their variability.
4. Measuring variability is of great importance to advanced statistical analysis (such as sampling or statistical inference).
Properties of a Good measure of variation

A good measure of variation should possess, as far as possible, the same properties as a good measure of central tendency.
Some well-known measures of variation, which provide a numerical index of the variability of the given data, are:
Range
Average or mean deviation
Quartile deviation or semi-interquartile range
Standard deviation
Absolute and Relative measures of variation

Measures of absolute variation are expressed in the units of the original data. When two sets of data are expressed in different units of measurement, their absolute measures of variation are not comparable. In such cases measures of relative variation are used.
Relative measures are also used when comparing two sets of data that have the same unit of measurement but different means.
Range
The range is the difference between the highest (numerically largest) value and the lowest value in a set of data: R = H - L. It is very easy to calculate and gives some idea of the variability of the data. However, the range is a crude measure of variation, as it uses only the two extreme values.
The concept of range is used in statistical quality control (SQC) and in studying variations in the prices of shares, debentures and other commodities that are very sensitive to price changes from one period to another. It is also a good indicator in weather forecasting.
For grouped data, the range may be approximated as the difference between the upper limit of the highest class and the lower limit of the lowest class. The relative measure corresponding to the range is called the coefficient of range (formula: pp. 48-49).
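A minimal sketch of the range and its relative measure follows. The formula used for the coefficient of range, (H - L) / (H + L), is the usual textbook form and is assumed here, since the slide only refers to it by page number. The price figures are hypothetical.

```python
def value_range(values):
    """Absolute measure: highest value minus lowest value (R = H - L)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure, assuming the usual form (H - L) / (H + L)."""
    highest, lowest = max(values), min(values)
    return (highest - lowest) / (highest + lowest)


# Hypothetical closing prices of a share over five days.
prices = [120, 135, 128, 150, 110]
price_range = value_range(prices)
price_coefficient = coefficient_of_range(prices)
print(f"Range: {price_range}")
print(f"Coefficient of range: {price_coefficient:.3f}")
```

Only the two extreme prices enter the calculation, which is why the range is described above as a crude measure.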
Quartile deviation or Semi-interquartile range

The quartile deviation is computed as half the difference between the third quartile and the first quartile: QD = (Q3 - Q1) / 2. The corresponding relative measure is called the coefficient of quartile deviation.
QD is superior to the range because it is based not on the two extreme values but on the middle 50% of the observations. Another advantage of QD is that it is the only measure of variability which can be used for open-end distributions.
Its disadvantage is that it ignores the first 25% and the last 25% of the observations.
P49,50
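The quartile deviation can be sketched as below. The coefficient is assumed to take the usual form (Q3 - Q1) / (Q3 + Q1). Quartiles are located with Python's `statistics.quantiles` in its default mode, which is one of several conventions, so a textbook's hand calculation may give slightly different quartiles. The data are hypothetical.

```python
import statistics


def quartile_deviation(values):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2


def coefficient_of_quartile_deviation(values):
    """Relative measure, assuming the usual form (Q3 - Q1) / (Q3 + Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / (q3 + q1)


# Hypothetical set of seven observations.
observations = [12, 15, 18, 20, 22, 25, 30]
qd = quartile_deviation(observations)
qd_coefficient = coefficient_of_quartile_deviation(observations)
print(f"Quartile deviation: {qd}")
print(f"Coefficient of quartile deviation: {qd_coefficient}")
```

Changing the smallest or largest observation leaves both results unchanged, which shows how the measure ignores the outer quarters of the data.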
Average Deviation or Mean Deviation

The average deviation is an improvement over the previous two measures in that it considers all observations in the given set of data. It is computed as the mean of the deviations from the mean or from the median, with all deviations treated as positive regardless of sign.
Theoretically, there is an advantage in taking the deviations from the median, because the sum of absolute deviations from the median is a minimum. In actual practice, however, the arithmetic mean is more popular.
The relative measure corresponding to the average deviation, called the coefficient of average deviation, is obtained by dividing the average deviation by the particular average (mean or median) used in computing it. (p51)
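The calculation described above can be sketched as follows, taking deviations first from the mean and then from the median. The function names and the data are hypothetical.

```python
import statistics


def mean_deviation(values, center):
    """Mean of the absolute deviations of the values from a chosen average."""
    return sum(abs(v - center) for v in values) / len(values)


def coefficient_of_mean_deviation(values, center):
    """Average deviation divided by the average it was measured from."""
    return mean_deviation(values, center) / center


# Hypothetical data with one unusually large value.
data = [2, 4, 6, 8, 30]
mean = statistics.mean(data)
median = statistics.median(data)

md_about_mean = mean_deviation(data, mean)
md_about_median = mean_deviation(data, median)
print(f"Mean deviation about the mean:   {md_about_mean}")
print(f"Mean deviation about the median: {md_about_median}")
print(f"Coefficient (about the mean):    {coefficient_of_mean_deviation(data, mean):.3f}")
```

For these figures the deviation about the median is the smaller of the two, in line with the minimum property noted above.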
Advantages and disadvantages (of Average Deviation)

Though a good measure of variability, its use is limited. If the aim is only to measure and compare variability among several sets of data, the AD may be used.
Its major disadvantage is its lack of mathematical properties: ignoring the signs of the deviations makes it unsuitable for further algebraic treatment.
Standard Deviation
The standard deviation is the most widely used and most important measure of variation. In computing the average deviation the signs are ignored; the standard deviation overcomes this problem by squaring the deviations, which makes them all positive. The standard deviation is also known as the root mean square deviation. The square of the standard deviation is called the variance.
The standard deviation and the variance become larger as the variability or spread within the data becomes greater. The standard deviation is readily comparable with other standard deviations: the greater the standard deviation, the greater the variability. While other measures have special uses, the standard deviation is the one commonly used to measure variability, and it is the only measure possessing the mathematical properties needed for advanced statistical work. (p53)
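A minimal sketch of the root mean square deviation follows. It divides by the number of observations N (the population form), which is an assumption about the convention intended here; the sample form divides by N - 1 instead. The data are hypothetical.

```python
import math


def population_variance(values):
    """Mean of the squared deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def population_std_dev(values):
    """Root mean square deviation: the square root of the variance."""
    return math.sqrt(population_variance(values))


# Hypothetical observations with an arithmetic mean of 5.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(observations)
std_dev = population_std_dev(observations)
print(f"Variance: {variance}")
print(f"Standard deviation: {std_dev}")
```

Squaring removes the signs of the deviations without discarding them, which is what keeps the measure open to further algebraic treatment.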
Coefficient of Variation (C.V)

The coefficient of variation is a frequently used relative measure of variation. It is simply the ratio of the standard deviation to the mean, expressed as a percentage.
p54
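The ratio can be sketched for two hypothetical series that share a unit of measurement and a standard deviation but have different means. The population standard deviation (`statistics.pstdev`) is used here as an assumption about the intended convention.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean * 100


# Hypothetical weekly sales (units) of two products.
product_a = [40, 50, 60]
product_b = [490, 500, 510]

cv_a = coefficient_of_variation(product_a)
cv_b = coefficient_of_variation(product_b)
print(f"C.V. of product A: {cv_a:.2f}%")
print(f"C.V. of product B: {cv_b:.2f}%")
```

Both series have the same absolute spread, yet the relative measure shows product A to be far more variable in proportion to its mean.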
Skewness
The measures of central tendency and variation do not reveal all the characteristics of a given set of data. Two distributions having the same mean and standard deviation may differ widely in the shape of their distribution.
A distribution of data is either symmetrical or not (asymmetrical, or skewed). Skewness thus refers to a lack of symmetry in the distribution.
One method of detecting skewness is to consider the tail of the distribution.

Symmetrical distribution:
No extreme values in a particular direction, so that low and high values balance each other.
Mean=median=mode
Negatively skewed distribution

When the longer tail of the distribution is towards the lower values, on the left-hand side, the skewness is negative. The mean is decreased by some extremely low values.
Positively skewed Distribution

When the longer tail of the distribution is towards the higher values, on the right-hand side, the skewness is positive. The mean is increased by some unusually high values. (p55)
Relative skewness
In order to compare the skewness of two or more distributions, a coefficient of skewness is computed (Karl Pearson's method, Bowley's method).
In practice the value of the coefficient of skewness, SK, usually lies between -1 and +1.
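The two methods named above can be sketched as follows. The formulas are the usual textbook forms and are assumed here, since the slides do not state them: Pearson's coefficient in its median-based form, 3 (mean - median) / standard deviation, and Bowley's coefficient, (Q3 + Q1 - 2 median) / (Q3 - Q1). The quartile convention is that of `statistics.quantiles`, and the data are hypothetical.

```python
import statistics


def pearson_skewness(values):
    """Karl Pearson's coefficient, assumed form: 3 * (mean - median) / sd."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    return 3 * (mean - median) / statistics.pstdev(values)


def bowley_skewness(values):
    """Bowley's coefficient, assumed form: (Q3 + Q1 - 2 * median) / (Q3 - Q1)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Hypothetical data with a longer tail towards the higher values.
incomes = [10, 12, 14, 16, 22, 30, 45]
sk_pearson = pearson_skewness(incomes)
sk_bowley = bowley_skewness(incomes)
print(f"Pearson's coefficient of skewness: {sk_pearson:.3f}")
print(f"Bowley's coefficient of skewness:  {sk_bowley:.3f}")
```

Both coefficients are positive for these figures, matching the description of a positively skewed distribution. Reversing the data around a central value would flip both signs.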

