Why statistics?
• Decision making is often based on
analysis of data.
• Statistics helps you to make sense of the
data by using tools that summarize,
present and analyze the data.
• The decision maker can also ascertain the
level of confidence in those decisions.
Examples
• How many newspapers should the vendor stock
to maximize revenue?
– Depends on the probability distribution of demand and
expected profit
• Are two or more market segments significantly
different?
– Hypothesis testing
• What proportion of people are happy with the
Sixth-pay commission report?
– Parameter estimation
Sample vs. Population
• Population is the entire group/collection of
individuals/objects/things that we want
information about.
• Sample is part of the population that we actually
examine to gather information.
• Example
– We wish to find the average dividend percentage of
all companies traded at NSE.
• All stocks traded at NSE comprise the population
• The 10% of the stocks selected for gathering information form the
sample
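The distinction can be sketched in Python. This is a minimal illustration only: the dividend percentages below are made-up values, and drawing the 10% by simple random sampling is an assumption, since the slide does not say how the stocks are selected.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: dividend percentages for 500 made-up companies.
population = [round(random.uniform(0, 40), 1) for _ in range(500)]

# Assumed selection method: a simple random sample of 10%, without replacement.
sample_size = len(population) // 10
sample = random.sample(population, sample_size)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # the estimate actually computed

print(f"population size: {len(population)}, sample size: {len(sample)}")
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean is used as an estimate of the population mean; the two generally differ because only part of the population is examined.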
Subdivision within Statistics
Frequency distribution
Delay in minutes   Frequency   Relative frequency
0 - 15                 12            0.286
15 - 30                 8            0.190
30 - 45                 6            0.143
45 - 60                14            0.333
60 or more              2            0.048
Total                  42            1.000
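A frequency table of this kind can be built with a short Python sketch. The delay values are the 42 delays listed in the Delay/Passengers table of this deck; treating each class as including its lower bound and excluding its upper bound is an assumption, and `class_of` is a helper name introduced only for this illustration.

```python
from collections import Counter

# Delay values (minutes) from the Delay/Passengers table in this deck.
delays = [
    53, 40, 46, 0, 22, 5, 44, 12, 12, 25, 13, 50, 45, 23,
    56, 42, 25, 13, 40, 8, 27, 67, 48, 4, 45, 0, 34, 95,
    50, 0, 38, 55, 45, 15, 48, 0, 10, 50, 56, 26, 47, 20,
]

# Assumption: each class includes its lower bound and excludes its upper bound.
labels = ["0-15", "15-30", "30-45", "45-60", "60 or more"]

def class_of(delay):
    """Return the class label; every delay of 60 or more falls in the last class."""
    return labels[min(delay // 15, len(labels) - 1)]

counts = Counter(class_of(d) for d in delays)
total = len(delays)

# One (label, frequency, relative frequency) triple per class.
freq_table = [(label, counts[label], round(counts[label] / total, 3)) for label in labels]

for label, freq, rel in freq_table:
    print(f"{label:<12}{freq:>4}{rel:>8.3f}")
print(f"{'Total':<12}{total:>4}")
```

With these values the class frequencies come out as 12, 8, 6, 14 and 2, for a total of 42.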
Frequency distribution - histogram
[Figure: histogram of Frequency (vertical axis, 0 to 16) against Delay in Minutes (horizontal axis), with classes 0-15, 15-30, 30-45, 45-60 and 60 or more]
Two variable frequency distribution - cross tabulation
Delay in minutes   0-15   15-30   30-45   45-60   60 or more   Total
Govt.                 5       2       5       9            0      21
Private               7       6       1       5            2      21
Total                12       8       6      14            2      42
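The row and column totals of a cross tabulation are sums over its cells. As a small sketch, the Python below takes the cell counts from the table and recomputes the totals; the variable names are illustrative only.

```python
# Cell counts copied from the cross tabulation: one list per row label,
# one entry per delay class.
classes = ["0-15", "15-30", "30-45", "45-60", "60 or more"]
cells = {
    "Govt.":   [5, 2, 5, 9, 0],
    "Private": [7, 6, 1, 5, 2],
}

# Row totals: number of flights for each row label.
row_totals = {label: sum(counts) for label, counts in cells.items()}

# Column totals: number of flights in each delay class, summed over the rows.
col_totals = [sum(column) for column in zip(*cells.values())]

grand_total = sum(row_totals.values())

print(row_totals)   # {'Govt.': 21, 'Private': 21}
print(dict(zip(classes, col_totals)))
print(grand_total)  # 42
```

The column totals reproduce the single-variable frequency distribution of delays.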
Population mean
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Mean – example
• Average delay in flight departure
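As a quick check of the arithmetic, the mean is the sum of the observations divided by their number. The sketch below uses the 42 delay values listed in the Delay/Passengers table of this deck.

```python
# Delay values (minutes) from the Delay/Passengers table in this deck.
delays = [
    53, 40, 46, 0, 22, 5, 44, 12, 12, 25, 13, 50, 45, 23,
    56, 42, 25, 13, 40, 8, 27, 67, 48, 4, 45, 0, 34, 95,
    50, 0, 38, 55, 45, 15, 48, 0, 10, 50, 56, 26, 47, 20,
]

n = len(delays)              # number of observations
total_delay = sum(delays)    # sum of all observations
mean_delay = total_delay / n

print(n, total_delay, round(mean_delay, 4))  # 42 1354 32.2381
```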
• Median is the average of the 21st and 22nd observations
= (34+38)/2
= 36
Observations in ascending order, reading down each column (listing incomplete):
0    25   47
0    25   48
4    26   48
5    27   50
8    34   50
10   38   50
12   40   53
12   40   55
13   42   56
13   44   56
15   45   67
20   45   95
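The median calculation can be reproduced with a short Python sketch. It uses the 42 delay values from the Delay/Passengers table of this deck; with an even number of observations, the median is the average of the two middle values after sorting.

```python
import statistics

# Delay values (minutes) from the Delay/Passengers table in this deck.
delays = [
    53, 40, 46, 0, 22, 5, 44, 12, 12, 25, 13, 50, 45, 23,
    56, 42, 25, 13, 40, 8, 27, 67, 48, 4, 45, 0, 34, 95,
    50, 0, 38, 55, 45, 15, 48, 0, 10, 50, 56, 26, 47, 20,
]

ordered = sorted(delays)
n = len(ordered)

# With n = 42, the middle pair is the 21st and 22nd observations
# (indices 20 and 21 when counting from zero).
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])
median_delay = statistics.median(delays)

print(middle_pair, median_delay)  # (34, 38) 36.0
```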
Mode
• Mode is the most frequently occurring observation
– mode in the example is 0
• The greatest frequency can occur at two
or more different values.
• If the data have exactly two modes, the
data are bimodal.
• If the data have more than two modes, the
data are multimodal.
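A minimal Python sketch of finding the mode or modes, using `statistics.multimode` from the standard library (Python 3.8 or later). The delay values are those in the Delay/Passengers table of this deck; the second list is a made-up example of bimodal data.

```python
import statistics

# Delay values (minutes) from the Delay/Passengers table in this deck.
delays = [
    53, 40, 46, 0, 22, 5, 44, 12, 12, 25, 13, 50, 45, 23,
    56, 42, 25, 13, 40, 8, 27, 67, 48, 4, 45, 0, 34, 95,
    50, 0, 38, 55, 45, 15, 48, 0, 10, 50, 56, 26, 47, 20,
]

# multimode returns every value that shares the highest frequency.
delay_modes = statistics.multimode(delays)
print(delay_modes)  # [0]

# Made-up data in which two values tie for the highest frequency.
bimodal_example = [1, 1, 2, 2, 3]
bimodal_modes = statistics.multimode(bimodal_example)
print(bimodal_modes)  # [1, 2]
```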
Percentiles and Quartiles
• Standard Deviation
= 21.585 minutes
• Coefficient of Variation
= (21.585/32.2381)(100) = 66.95%
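These two figures can be reproduced with a short Python sketch over the 42 delay values from the Delay/Passengers table of this deck. The quoted standard deviation matches the sample formula (divisor n - 1), which is what `statistics.stdev` computes.

```python
import statistics

# Delay values (minutes) from the Delay/Passengers table in this deck.
delays = [
    53, 40, 46, 0, 22, 5, 44, 12, 12, 25, 13, 50, 45, 23,
    56, 42, 25, 13, 40, 8, 27, 67, 48, 4, 45, 0, 34, 95,
    50, 0, 38, 55, 45, 15, 48, 0, 10, 50, 56, 26, 47, 20,
]

mean_delay = statistics.mean(delays)
sample_sd = statistics.stdev(delays)   # sample standard deviation, divisor n - 1

# Coefficient of variation: standard deviation as a percentage of the mean.
cv_percent = sample_sd / mean_delay * 100

print(round(sample_sd, 3))   # 21.585
print(round(cv_percent, 2))  # 66.95
```

The coefficient of variation is unit-free, so it can be used to compare the spread of variables measured on different scales.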
Skewness
– Skewness characterizes the degree of
asymmetry of a distribution around its
mean
• Positively skewed
• Symmetric or unskewed
• Negatively skewed
[Figures: three example distribution curves, labelled Negatively skewed, Symmetric, and Positively Skewed]
Skewness - measure
Skewness of a distribution is measured by
$$\gamma_1 = \frac{\sum (X - \mu)^3}{N \sigma^3}$$
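As a rough illustration, the population measure (the sum of cubed deviations from the mean, divided by N times the cube of the standard deviation) can be sketched in Python. The function name `population_skewness` and the three small data sets are made up for this example.

```python
def population_skewness(values):
    """Sum of cubed deviations from the mean, divided by N * sigma^3."""
    n = len(values)
    mu = sum(values) / n
    variance = sum((x - mu) ** 2 for x in values) / n  # population variance
    # sigma^3 equals variance^1.5; undefined when all values are equal.
    return sum((x - mu) ** 3 for x in values) / (n * variance ** 1.5)

# Made-up data sets illustrating the three cases.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 10]   # long tail to the right
left_tailed = [1, 10, 10, 10]  # long tail to the left

print(population_skewness(symmetric))     # 0.0
print(population_skewness(right_tailed))  # positive
print(population_skewness(left_tailed))   # negative
```

A value of zero indicates symmetry about the mean, a positive value a longer right tail, and a negative value a longer left tail.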
For a given data set you may use
Kurtosis
• Kurtosis characterizes the relative
peakedness or flatness of a symmetric
distribution compared to the normal
distribution
Platykurtic (relatively flat)
Mesokurtic (normal)
Leptokurtic (relatively peaked)
[Figures: three example distribution curves: Platykurtic (flat distribution), Mesokurtic (not too flat and not too peaked), Leptokurtic (peaked distribution)]
Kurtosis - measure
• Kurtosis for a distribution is measured by
$$\gamma_2 = \beta_2 - 3, \quad \text{where} \quad \beta_2 = \frac{\sum (X - \mu)^4}{N \sigma^4}$$
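A minimal Python sketch of this population measure: the fourth-moment ratio (the sum of fourth-power deviations from the mean, divided by N times the fourth power of the standard deviation) minus 3, so that a normal distribution scores zero. The function name `population_excess_kurtosis` and the sample data are made up for this example.

```python
def population_excess_kurtosis(values):
    """Fourth-moment ratio sum((x - mu)^4) / (N * sigma^4), minus 3."""
    n = len(values)
    mu = sum(values) / n
    variance = sum((x - mu) ** 2 for x in values) / n  # population variance
    # sigma^4 equals variance^2; undefined when all values are equal.
    moment_ratio = sum((x - mu) ** 4 for x in values) / (n * variance ** 2)
    return moment_ratio - 3

# Made-up, evenly spread data: flatter than a normal distribution.
flat_example = [1, 2, 3, 4, 5]
flat_kurtosis = population_excess_kurtosis(flat_example)

print(round(flat_kurtosis, 4))  # -1.3
```

Negative values correspond to platykurtic distributions, values near zero to mesokurtic ones, and positive values to leptokurtic ones.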
For a given data set you may use
Association between two variables
Delay Passengers Delay Passengers Delay Passengers
53 65 56 51 50 68
40 61 42 50 0 72
46 53 25 57 38 74
0 65 13 57 55 68
22 45 40 54 45 73
5 58 8 54 15 63
44 68 27 65 48 68
12 65 67 57 0 55
12 56 48 62 10 45
25 50 4 50 50 71
13 70 45 61 56 64
50 73 0 59 26 60
45 63 34 63 47 61
23 56 95 49 20 48
Association between two variables
• Scatter plot
• Covariance
• Correlation Coefficient
Scatter Plot
• Scatter Plots are used to identify any
underlying relationships among pairs of
data sets.
• The plot consists of a scatter of points,
each point representing an observation.
[Figure: scatter plot titled "Delay vs Passengers", with Delay on the vertical axis (0 to 100) and Passengers on the horizontal axis (0 to 80)]
Covariance
• The covariance is a measure of the linear
association between two variables.
• Positive values indicate a positive
relationship.
• Negative values indicate a negative
relationship.
Covariance
• If the data sets are samples, the covariance
is denoted by
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
$s_{xy} = 20.42$ in the Airline example
• If the data sets are populations, the
covariance is denoted by
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$
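The two versions differ only in the divisor: n - 1 for a sample and N for a population. A minimal Python sketch follows; the function name `covariance` and the small data sets are made up for illustration and are not the airline data.

```python
def covariance(xs, ys, sample=True):
    """Sum of products of paired deviations from the means,
    divided by n - 1 (sample) or by N (population)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1 if sample else n)

# Made-up paired data: ys rises with xs, ys_reversed falls as xs rises.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
ys_reversed = [8, 6, 4, 2]

sample_cov = covariance(xs, ys)                    # 10 / 3
population_cov = covariance(xs, ys, sample=False)  # 10 / 4 = 2.5
negative_cov = covariance(xs, ys_reversed)         # -10 / 3

print(round(sample_cov, 4), population_cov, round(negative_cov, 4))
```

The sign of the result shows the direction of the linear relationship; its size depends on the units of both variables.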
Correlation Coefficient