
A PowerPoint Presentation Package to Accompany

Applied Statistics in Business & Economics, 3rd edition
David P. Doane and Lori E. Seward
Prepared by Lloyd R. Jaisingh

McGraw-Hill/Irwin Copyright © 2011 by the McGraw-Hill Companies, Inc. All rights reserved.
Chapter 4
Descriptive Statistics
Chapter Contents

4.1 Numerical Description
4.2 Central Tendency
4.3 Dispersion
4.4 Standardized Data
4.5 Percentiles, Quartiles, and Box Plots
4.6 Correlation and Covariance
4.7 Grouped Data
4.8 Skewness and Kurtosis

Chapter Learning Objectives (LOs)

LO1: Explain the concepts of central tendency, dispersion, and shape.
LO2: Use Excel to obtain descriptive statistics and visual displays.
LO3: Calculate and interpret common measures of central tendency.
LO4: Calculate and interpret common measures of dispersion.
LO5: Transform a data set into standardized values.
LO6: Apply the Empirical Rule and recognize outliers.
LO7: Calculate quartiles and other percentiles.
LO8: Make and interpret box plots.
LO9: Calculate and interpret a correlation coefficient and covariance.
LO10: Calculate the mean and standard deviation from grouped data.
LO11: Explain the concepts of skewness and kurtosis.

LO1 4.1 Numerical Description
LO1: Explain the concepts of central tendency, dispersion, and shape.

Three key characteristics of numerical data: central tendency, dispersion, and shape.

LO2 4.1 Numerical Description
LO2: Use Excel to obtain descriptive statistics and visual displays.

EXCEL Displays for Tables 4.2 and 4.3

LO3 4.2 Central Tendency
LO3: Calculate and interpret common measures of central tendency.
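
As an illustration of this objective, here is a minimal Python sketch of three common measures of central tendency (mean, median, mode). The data values are hypothetical and are not taken from the textbook; the slide itself does not list which measures it covers, so the choice of these three is an assumption.

```python
from statistics import mean, median, multimode

# Hypothetical sample (not from the textbook)
data = [2, 3, 3, 4, 5, 5, 5, 7, 20]

sample_mean = mean(data)        # arithmetic average: sum of values / n
sample_median = median(data)    # middle value of the sorted data
sample_modes = multimode(data)  # most frequent value(s), returned as a list

print(sample_mean, sample_median, sample_modes)
```

In this sample the single large value (20) pulls the mean above the median, while the median and mode are unaffected.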

LO1 4.2 Central Tendency
LO1: Explain the concepts of central tendency, dispersion, and shape.

Shape

• Compare the mean and median, or look at the histogram, to determine the degree of skewness.
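
The mean-versus-median comparison can be sketched in Python as follows. This is only a rough rule of thumb under the assumption that a mean above the median suggests right skew and a mean below it suggests left skew; the function name `skew_direction` is hypothetical, and in real data the two values are rarely exactly equal.

```python
from statistics import mean, median

def skew_direction(data):
    """Rough shape check: compare the mean with the median."""
    m, med = mean(data), median(data)
    if m > med:
        return "skewed right"   # large values pull the mean upward
    if m < med:
        return "skewed left"    # small values pull the mean downward
    return "approximately symmetric"

print(skew_direction([1, 2, 3, 4, 20]))
```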

LO4 4.3 Dispersion
LO4: Calculate and interpret common measures of dispersion.
• Variation is the “spread” of data points about the center of the distribution in a sample.
Consider the following measures of dispersion:
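
The slide's list of measures did not survive extraction. As a hedged sketch, the Python below computes several measures that are commonly used for dispersion (range, sample variance, sample standard deviation, coefficient of variation); the choice of measures and the data are assumptions, not the slide's own content.

```python
from statistics import mean, stdev, variance

# Hypothetical sample (not from the textbook)
data = [4, 8, 6, 5, 3, 10]

data_range = max(data) - min(data)   # largest value minus smallest value
s2 = variance(data)                  # sample variance (divides by n - 1)
s = stdev(data)                      # sample standard deviation
cv = 100 * s / mean(data)            # coefficient of variation, in percent

print(data_range, s2, s, cv)
```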

LO5 4.4 Standardized Data
Chebyshev’s Theorem

• For any population with mean µ and standard deviation σ, the percentage of observations that lie within k standard deviations of the mean must be at least 100[1 – 1/k²].
• For k = 2 standard deviations, 100[1 – 1/2²] = 75%. So, at least 75.0% will lie within µ ± 2σ.
• For k = 3 standard deviations, 100[1 – 1/3²] = 88.9%. So, at least 88.9% will lie within µ ± 3σ.
• Although applicable to any data set, these limits tend to be too wide to be useful.
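
The Chebyshev bound above is simple arithmetic, and a short Python sketch reproduces the two worked values. The function name `chebyshev_lower_bound` is hypothetical; rejecting k ≤ 1 is a design choice made here because the bound is zero or negative in that range and so says nothing.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is uninformative unless k > 1")
    return 100 * (1 - 1 / k**2)

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 1))
```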

LO6 4.4 Standardized Data
LO6: Apply the Empirical Rule and recognize outliers

The Empirical Rule

• Unusual observations are those that lie beyond µ ± 2σ.
• Outliers are observations that lie beyond µ ± 3σ.
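
The two cutoffs on this slide can be applied mechanically. The sketch below labels each observation using the sample mean and sample standard deviation in place of µ and σ, which is an assumption (the slide states the rule in population terms); the function name `classify_by_sigma` and the data are hypothetical.

```python
from statistics import mean, stdev

def classify_by_sigma(data):
    """Label each value by how many standard deviations it lies from the mean."""
    xbar, s = mean(data), stdev(data)   # sample estimates stand in for mu and sigma
    labels = []
    for x in data:
        distance = abs(x - xbar) / s
        if distance > 3:
            labels.append("outlier")    # beyond 3 standard deviations
        elif distance > 2:
            labels.append("unusual")    # beyond 2 but within 3
        else:
            labels.append("typical")
    return labels

# Hypothetical data: twenty values of 10 and one extreme value
labels = classify_by_sigma([10] * 20 + [100])
print(labels[-1])
```

Note that in very small samples no observation can be more than 3 sample standard deviations from the mean, so the outlier label only becomes reachable as n grows.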

LO5 4.4 Standardized Data
LO5: Transform a data set into standardized values.

Defining a Standardized Variable


• A standardized variable (Z) redefines each observation in terms of the number of standard deviations from the mean.

Standardization formula for a population: zi = (xi − µ) / σ

Standardization formula for a sample: zi = (xi − x̄) / s

• A negative z value means the observation is below the mean.
• A positive z value means the observation is above the mean.
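
A minimal Python sketch of the sample standardization formula follows. The function name `standardize` and the data are hypothetical; the check at the end relies on the fact that standardized values always have mean 0 and standard deviation 1.

```python
from statistics import mean, stdev

def standardize(data):
    """Convert each observation to a z-score using the sample mean and sample s."""
    xbar, s = mean(data), stdev(data)
    return [(x - xbar) / s for x in data]

# Hypothetical sample with mean 6
z = standardize([4, 8, 6, 5, 3, 10])

# 4 is below the mean (negative z); 10 is above the mean (positive z)
print([round(v, 2) for v in z])
```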

LO7 4.5 Percentiles, Quartiles, and Box-Plots

LO7: Calculate quartiles and other percentiles

Percentiles
• Percentiles are data that have been divided into 100 groups.
• For example, you score in the 83rd percentile on a standardized test. That means
that 83% of the test-takers scored below you.
• Deciles are data that have been divided into 10 groups.
• Quintiles are data that have been divided into 5 groups.
• Quartiles are data that have been divided into 4 groups.
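
As a sketch of how these cut points might be computed, the Python below uses the standard library's `statistics.quantiles`. Its default interpolation method ("exclusive") is only one of several conventions, so Excel or other software may give slightly different values; the test scores are hypothetical.

```python
from statistics import quantiles

# Hypothetical test scores (11 observations)
scores = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90, 95]

quartiles = quantiles(scores, n=4)    # 3 cut points: Q1, Q2 (median), Q3
quintiles = quantiles(scores, n=5)    # 4 cut points
deciles = quantiles(scores, n=10)     # 9 cut points
percentiles = quantiles(scores, n=100)  # 99 cut points

print(quartiles)
```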

LO8 4.5 Percentiles, Quartiles, and Box-Plots

LO8: Make and interpret box plots.

• A useful tool of exploratory data analysis (EDA).
• Also called a box-and-whisker plot.
• Based on a five-number summary: Xmin, Q1, Q2, Q3, Xmax
• A box plot shows central tendency, dispersion, and shape.

Fences and Unusual Data Values

Values outside the inner fences are unusual, while those outside the outer fences are outliers.
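
The fence definitions did not survive on this slide. The sketch below assumes the common convention of inner fences at 1.5 × IQR and outer fences at 3 × IQR beyond the quartiles; those multipliers, the function names, and the data are assumptions, and the quartiles depend on the interpolation method that `statistics.quantiles` uses by default.

```python
from statistics import quantiles

def fences(data, inner_k=1.5, outer_k=3.0):
    """Return (inner, outer) fence pairs built from the interquartile range."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    inner = (q1 - inner_k * iqr, q3 + inner_k * iqr)
    outer = (q1 - outer_k * iqr, q3 + outer_k * iqr)
    return inner, outer

def classify(x, inner, outer):
    """Apply the slide's rule: outside inner = unusual, outside outer = outlier."""
    if x < outer[0] or x > outer[1]:
        return "outlier"
    if x < inner[0] or x > inner[1]:
        return "unusual"
    return "typical"

# Hypothetical test scores
scores = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90, 95]
inner, outer = fences(scores)
print(inner, outer)
```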

LO8 4.5 Percentiles, Quartiles, and Box-Plots

Box Plots: Midhinge


• The average of the first and third quartiles.

Midhinge = (Q1 + Q3) / 2

• The name “midhinge” derives from the idea that, if the “box” were folded in half, it would resemble a “hinge”.
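
For completeness, the midhinge can be computed in a few lines of Python. The quartiles here come from `statistics.quantiles` with its default method, which is an assumption; other quartile conventions can shift the result slightly. The data are hypothetical.

```python
from statistics import quantiles

def midhinge(data):
    """Average of the first and third quartiles."""
    q1, _, q3 = quantiles(data, n=4)
    return (q1 + q3) / 2

# Hypothetical test scores: Q1 = 62 and Q3 = 85 under the default method
scores = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90, 95]
print(midhinge(scores))
```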

LO9 4.6 Correlation and Covariance

LO9: Calculate and interpret a correlation coefficient and covariance.


Correlation Coefficient
• The sample correlation coefficient r is a statistic that describes the degree of linearity
between paired observations on two quantitative variables X and Y. Note: -1 ≤ r ≤ +1.

The covariance of two random variables X and Y (denoted σXY) measures the degree to which the values of X and Y change together.

[Formulas: population covariance and sample covariance]
LO9 4.6 Correlation and Covariance

LO9: Calculate and interpret a correlation coefficient and covariance.

Covariance

A correlation coefficient is the covariance divided by the product of the standard deviations of X and Y.
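
The relationship between covariance and correlation can be sketched directly in Python. This version uses the sample forms (dividing by n − 1), which is an assumption since the slide's formulas are not shown; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def sample_covariance(x, y):
    """Sum of cross-products of deviations, divided by n - 1."""
    xbar, ybar = mean(x), mean(y)
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (len(x) - 1)

def sample_correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sx = sqrt(sample_covariance(x, x))   # covariance of x with itself is its variance
    sy = sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sx * sy)

# Hypothetical paired data with a perfect positive linear relationship
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(sample_covariance(x, y), sample_correlation(x, y))
```

Unlike the covariance, whose size depends on the units of X and Y, the correlation is unit-free and always falls between −1 and +1.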

LO10 4.7 Grouped Data

LO10: Calculate the mean and standard deviation from grouped data.

Group Mean and Standard Deviation
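
The slide's formulas were not preserved. A common approach, assumed here, treats every observation in a class as if it sat at the class midpoint, weights each midpoint by its frequency, and divides the squared deviations by n − 1. The function name and the frequency table are hypothetical.

```python
from math import sqrt

def grouped_mean_sd(midpoints, freqs):
    """Approximate the mean and sample standard deviation from a frequency table."""
    n = sum(freqs)
    xbar = sum(f * m for m, f in zip(midpoints, freqs)) / n
    var = sum(f * (m - xbar) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)
    return xbar, sqrt(var)

# Hypothetical classes 0-10, 10-20, 20-30 with midpoints 5, 15, 25
xbar, s = grouped_mean_sd([5, 15, 25], [2, 6, 2])
print(xbar, s)
```

Because the raw values are replaced by midpoints, the result is an approximation to what the ungrouped data would give.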

LO11 4.8 Skewness and Kurtosis

LO11: Explain the concepts of skewness and kurtosis.

Skewness
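
The slide's content was not preserved. As a hedged illustration, the sketch below uses the adjusted sample skewness coefficient that Excel's SKEW function computes; whether the slide used this exact formula is an assumption. Values near zero suggest symmetry, positive values right skew, and negative values left skew.

```python
from statistics import mean, stdev

def sample_skewness(data):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum of cubed z-scores."""
    n = len(data)
    xbar, s = mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - xbar) / s) ** 3 for x in data)

print(sample_skewness([1, 2, 3, 4, 20]))   # hypothetical right-skewed sample
```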

LO11 4.8 Skewness and Kurtosis

LO11: Explain the concepts of skewness and kurtosis.

Kurtosis
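
The slide's content was not preserved. As a hedged illustration, the sketch below uses the sample excess kurtosis formula that Excel's KURT function computes; whether the slide used this exact formula is an assumption. Values near zero are consistent with a normal-like peak and tails, negative values with a flatter distribution, and positive values with a sharper peak and heavier tails.

```python
from statistics import mean, stdev

def sample_kurtosis(data):
    """Sample excess kurtosis (requires at least 4 observations)."""
    n = len(data)
    xbar, s = mean(data), stdev(data)
    fourth = sum(((x - xbar) / s) ** 4 for x in data)   # sum of z-scores to the 4th power
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

print(sample_kurtosis([1, 2, 3, 4, 5]))   # hypothetical evenly spread sample
```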
