Definition of Statistics

1. A collection of quantitative data pertaining to
a subject or group, for example blood
pressure statistics.
2. The science that deals with the collection,
tabulation, analysis, interpretation, and
presentation of quantitative data.
Descriptive Statistics
Descriptive statistics are methods for
organizing and summarizing data.
For example, tables or graphs are used to
organize data, and descriptive values such as
the average score are used to summarize
data.
A descriptive value for a population is called a
parameter and a descriptive value for a
sample is called a statistic.

Inferential Statistics
Inferential statistics are methods for using
sample data to make general conclusions about populations.
Because a sample is typically only a part of
the whole population, sample data provide
only limited information about the population.
As a result, sample statistics are generally
imperfect representatives of the
corresponding population parameters.
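
The distinction between a parameter and a statistic can be illustrated with a minimal Python sketch. The population below is made up (1,000 hypothetical scores generated with a fixed seed), and the sample size of 30 is an arbitrary assumption; only the idea that a sample mean estimates, but rarely equals, the population mean comes from the text above.

```python
import random
from statistics import mean

# Hypothetical population: 1,000 made-up scores (illustrative only).
rng = random.Random(42)
population = [rng.randint(40, 100) for _ in range(1000)]

# Parameter: a descriptive value computed from the whole population.
population_mean = mean(population)

# Statistic: the same descriptive value computed from a sample.
sample = rng.sample(population, 30)
sample_mean = mean(sample)

# The gap between the statistic and the parameter it estimates.
sampling_error = sample_mean - population_mean

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
print(f"difference:                  {sampling_error:.2f}")
```

Drawing a different sample would give a different statistic, while the parameter stays fixed.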

Types of Statistics
Two types of statistics:
Descriptive Statistics:
Describes the characteristics of a product or
process using information collected on it
Inferential Statistics (Inductive):
Draws conclusions on unknown process
parameters based on information contained
in a sample.
Uses probability
Accounting
Public accounting firms use statistical sampling procedures
when conducting audits for their clients.
Finance
Financial analysts use a variety of statistical information,
including price-earnings ratios and dividend yields, to guide their
investment recommendations.
Marketing
Electronic point-of-sale scanners at retail checkout counters are
being used to collect data for a variety of marketing research
applications.

Production
A variety of statistical quality control charts are used to
monitor the output of a production process.
Economics
Economists use statistical information in making
forecasts about the future of the economy or some
aspect of it.

Data are the facts and figures that are collected,
summarized, analyzed, and interpreted.
Data can be further classified as being qualitative or
quantitative.
The statistical analysis that is appropriate depends on
whether the data for the variable are qualitative or
quantitative.
In general, there are more alternatives for statistical
analysis when the data are quantitative.

Basic Vocabulary Terms
Qualitative data
Qualitative data are labels or names used to identify
an attribute of each element.
Qualitative data use either the nominal or ordinal
scale of measurement.
Qualitative data can be either numeric or nonnumeric.
The statistical analyses available for qualitative data
are rather limited.

Quantitative data
Quantitative data indicate either how many or how
much.
Quantitative data that measure how many are
discrete.
Quantitative data that measure how much are
continuous because there is no separation
between the possible values for the data.
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful only
with quantitative data.

Population and Sample
The population is the set of all elements of interest in a
particular study.
A sample is a subset of the population.
A variable is a characteristic or condition that can
change or take on different values.
Types of variable: qualitative and quantitative
Types of quantitative variable: discrete and continuous

[Figure: Populations and Samples (labels: Population, Sample)]

Measuring variables
To establish relationships between variables,
researchers must observe the variables and
record their observations. This requires that
the variables be measured.
The process of measuring a variable requires
a set of categories called a scale of
measurement and a process that classifies
each individual into one category.

Types of Measurement scale
1. A nominal scale is an unordered set of categories
identified only by name. Nominal measurements
only permit you to determine whether two
individuals are the same or different. Examples:
religion, color, gender.
2. An ordinal scale is an ordered set of categories.
Ordinal measurements tell you the direction of
difference between two individuals. Example: economic
status.
3. An interval scale is an ordered series of equal-sized
categories. Interval measurements identify the
direction and magnitude of a difference. The
zero point is arbitrary and does not indicate the
absence of the variable. Examples: temperature in
degrees Celsius or Fahrenheit, IQ score.
4. A ratio scale is an interval scale where a value of
zero indicates none of the variable. Ratio
measurements identify the direction and magnitude
of differences and allow ratio comparisons of
measurements. Example: height.

Organizing and Presenting
Data Graphically
Data in raw form are usually not easy to use for
decision making
Some type of organization is needed
Table
Graph
Techniques reviewed here:
Bar charts and pie charts
Pareto diagram
Ordered array
Stem-and-leaf display
Frequency distributions, histograms and polygons
Cumulative distributions and ogives
Contingency tables
Scatter diagrams
Tables and Charts for
Categorical Data
Tabulating data: summary table
Graphing data: bar charts, pie charts, Pareto diagram
The Summary Table
Summarize data by category
Example: Current Investment Portfolio

Investment Type   Amount (in thousands \$)   Percentage (%)
Stocks                    46.5                   42.27
Bonds                     32.0                   29.09
CD                        15.5                   14.09
Savings                   16.0                   14.55
Total                    110.0                  100.0

(Variables are categorical)
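
The percentage column of the summary table can be reproduced with a short Python sketch. The amounts are the ones from the portfolio example; the variable names are illustrative assumptions.

```python
# Amounts in thousands of dollars, taken from the portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(amounts.values())

# Each category's share of the total, as a percentage rounded to 2 decimals.
percentages = {name: round(100 * amount / total, 2)
               for name, amount in amounts.items()}

for name, amount in amounts.items():
    print(f"{name:<8} {amount:>6.1f} {percentages[name]:>7.2f}")
print(f"{'Total':<8} {total:>6.1f}")
```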
Bar and Pie Charts
Bar charts and Pie charts are often used
for qualitative data (categories or nominal
scale)

Height of bar or size of pie slice shows the
frequency or percentage for each
category
Bar Chart Example
[Figure: bar chart "Investor's Portfolio" for the Current Investment Portfolio data; bars for Stocks, Bonds, CD, and Savings; axis: Amount in \$1000's, from 0 to 50]
Pie Chart Example
[Figure: pie chart "Current Investment Portfolio"; slices: Stocks 42%, Bonds 29%, Savings 15%, CD 14%]
Percentages are rounded to the nearest percent
Pareto Diagram
Used to portray categorical data (nominal scale)
A bar chart, where categories are shown in
descending order of frequency
A cumulative polygon is often shown in the
same graph
Used to separate the vital few from the trivial
many
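
As a rough illustration of what a Pareto diagram plots, the following Python sketch sorts the portfolio categories in descending order and computes the cumulative percentage that the cumulative polygon would show. It only prepares the numbers and does not draw a chart; the variable names are assumptions.

```python
# Amounts in thousands of dollars, taken from the portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}
total = sum(amounts.values())

# Bars are drawn in descending order of size.
ordered = sorted(amounts.items(), key=lambda item: item[1], reverse=True)

pareto = []
running = 0.0
for name, amount in ordered:
    running += amount
    pareto.append((name,
                   round(100 * amount / total, 2),    # bar height (%)
                   round(100 * running / total, 2)))  # cumulative polygon (%)

for name, pct, cum_pct in pareto:
    print(f"{name:<8} {pct:>6.2f}% {cum_pct:>7.2f}%")
```

With these data the first two categories (Stocks and Bonds) already account for about 71% of the portfolio.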
Pareto Diagram Example
[Figure: Pareto diagram "Current Investment Portfolio"; bars in descending order: Stocks, Bonds, Savings, CD; axis scales: 0% to 45% for the bars and 0% to 100% for the cumulative polygon]
Tables and Charts for
Numerical Data
Ordered array
Stem-and-leaf display
Frequency distributions and cumulative distributions: histogram, polygon, ogive
The Ordered Array
A sequence of data in rank order:
Shows range (min to max)
Gives some indication of variability within the range
May help identify outliers (unusual observations)
If the data set is large, the ordered array is
less useful
Data in raw form (as collected):

24, 26, 24, 21, 27, 27, 30, 41, 32, 38

Data in ordered array from smallest to largest:

21, 24, 24, 26, 27, 27, 30, 32, 38, 41
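
A minimal Python sketch of the same step, using the raw data above; the variable names are illustrative.

```python
# Data in raw form (as collected).
raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

# Ordered array: the same values from smallest to largest.
ordered = sorted(raw)

# The range runs from the first (min) to the last (max) element.
smallest, largest = ordered[0], ordered[-1]
data_range = largest - smallest

print(ordered)
print(f"min={smallest}, max={largest}, range={data_range}")
```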
Stem-and-Leaf Diagram
A simple way to see distribution details in a
data set

METHOD: Separate the sorted data series
into leading digits (the stem) and
the trailing digits (the leaves)
Example
Here, use the 10s digit for the stem unit:
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

                  Stem   Leaf
21 is shown as     2   |  1
38 is shown as     3   |  8
41 is shown as     4   |  1
Example (continued)
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Completed stem-and-leaf diagram:

Stem   Leaves
 2   |  1 4 4 6 7 7
 3   |  0 2 8
 4   |  1
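
The method can be sketched in Python as follows, assuming two-digit integer data with the 10s digit as the stem. The function name `stem_and_leaf` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by 10s digit (stem); the units digit is the leaf."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)

data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
display = stem_and_leaf(data)

for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```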
Using other stem units
Using the 100s digit as the stem:
Round off the 10s digit to form the leaves

                     Stem   Leaf
613 would become       6  |  1
776 would become       7  |  8
. . .
1224 becomes          12  |  2
Using other stem units (continued)
Using the 100s digit as the stem:

Data:
613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891,
894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224

The completed stem-and-leaf display:

Stem   Leaves
  6  |  1 3 6
  7  |  2 2 5 8
  8  |  3 4 6 6 9 9
  9  |  1 3 3 6 8
 10  |  3 5 6
 11  |  4 7
 12  |  2
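
A hedged Python sketch of the rounded version: each value is first rounded to the nearest 10, then the 100s part becomes the stem and the rounded 10s digit becomes the leaf. The helper name and the round-half-up rule, implemented here as `(value + 5) // 10`, are assumptions chosen so that 955 gives leaf 6, matching the display above.

```python
def rounded_stem_and_leaf(values):
    """Stem = 100s part, leaf = 10s digit after rounding to the nearest 10."""
    stems = {}
    for value in sorted(values):
        tens = (value + 5) // 10          # round to nearest 10, halves up
        stem, leaf = divmod(tens, 10)
        stems.setdefault(stem, []).append(leaf)
    return stems

data = [613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891,
        894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224]
display = rounded_stem_and_leaf(data)

for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```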
What is a Frequency Distribution?
A frequency distribution is a list or a table
containing class groupings (ranges within which
the data fall) ...
and the corresponding frequencies with which
data fall within each grouping or category
Tabulating Numerical Data:
Frequency Distributions
Why Use a Frequency Distribution?
It is a way to summarize numerical data
It condenses the raw data into a more
useful form...
It allows for a quick visual interpretation of
the data
Class Intervals
and Class Boundaries
Each class grouping has the same width
Determine the width of each interval by

$$\text{Width of interval} = \frac{\text{range}}{\text{number of desired class groupings}}$$

Usually at least 5 but no more than 15
groupings
Class boundaries never overlap
Round up the interval width to get desirable
endpoints
Frequency Distribution Example
Example: A manufacturer of insulation randomly
selects 20 winter days and records the daily
high temperature
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute class interval (width): 10 (46/5 then round up)
Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
Compute class midpoints: 15, 25, 35, 45, 55
Count observations & assign to classes
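
The steps above can be sketched in Python. The data and the choice of 5 classes come from the example; rounding the width up with `math.ceil` and starting the first class at a multiple of the width are assumptions that happen to reproduce the boundaries 10 through 60. In practice, picking convenient endpoints is a judgment call.

```python
import math

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
num_classes = 5

data_range = max(temps) - min(temps)           # 58 - 12
width = math.ceil(data_range / num_classes)    # 46 / 5 = 9.2, rounded up

# Start the first class at a multiple of the width at or below the minimum.
start = (min(temps) // width) * width
boundaries = [start + i * width for i in range(num_classes + 1)]
midpoints = [lower + width / 2 for lower in boundaries[:-1]]

# Each class is "lower but less than upper", so classes never overlap.
frequencies = [0] * num_classes
for t in temps:
    frequencies[(t - start) // width] += 1

for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
    print(f"{lower} but less than {upper}: {f}")
```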
Frequency Distribution Example (continued)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3              .15               15
20 but less than 30        6              .30               30
30 but less than 40        5              .25               25
40 but less than 50        4              .20               20
50 but less than 60        2              .10               10
Total                     20             1.00              100
Tabulating Numerical Data:
Cumulative Frequency
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20        3           15                3                      15
20 but less than 30        6           30                9                      45
30 but less than 40        5           25               14                      70
40 but less than 50        4           20               18                      90
50 but less than 60        2           10               20                     100
Total                     20          100
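
The relative, percentage, and cumulative columns all follow from the class frequencies. A minimal Python sketch, with the frequencies taken from the example and illustrative variable names:

```python
from itertools import accumulate

# Class frequencies for 10-20, 20-30, 30-40, 40-50, 50-60.
frequencies = [3, 6, 5, 4, 2]
n = sum(frequencies)

relative = [f / n for f in frequencies]             # share of observations
percentages = [100 * f / n for f in frequencies]    # same share, in percent

cumulative = list(accumulate(frequencies))          # running total of counts
cumulative_pct = [100 * c / n for c in cumulative]  # running total, in percent

print(relative)
print(percentages)
print(cumulative)
print(cumulative_pct)
```

The cumulative percentages are the values an ogive plots against the class boundaries.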
Graphing Numerical Data:
The Histogram
A graph of the data in a frequency distribution
is called a histogram
The class boundaries (or class midpoints)
are shown on the horizontal axis
The vertical axis shows either frequency, relative
frequency, or percentage
Bars of the appropriate heights are used to
represent the number of observations within
each class
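
As a rough, text-only stand-in for a histogram, the Python sketch below draws one bar of `#` characters per class, with bar length equal to the class frequency. It uses the daily high temperature classes from the frequency distribution example and is not a substitute for a proper plotting library.

```python
# Classes and frequencies from the daily high temperature example.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [3, 6, 5, 4, 2]

# One line per class; the bar length is the number of observations.
lines = [f"{lower}-{upper} | {'#' * f} ({f})"
         for (lower, upper), f in zip(classes, frequencies)]

print("\n".join(lines))
```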
Histogram Example
[Figure: "Histogram: Daily High Temperature"; horizontal axis: class midpoints, from 5 to 65; vertical axis: frequency, from 0 to 7; no gaps between bars]

Class                  Class Midpoint   Frequency
10 but less than 20          15             3
20 but less than 30          25             6
30 but less than 40          35             5
40 but less than 50          45             4
50 but less than 60          55             2
Graphing Numerical Data:
The Frequency Polygon
[Figure: "Frequency Polygon: Daily High Temperature"; horizontal axis: class midpoints (15, 25, 35, 45, 55); vertical axis: frequency (3, 6, 5, 4, 2 for the five classes)]
(In a percentage polygon the vertical axis would be defined to
show the percentage of observations per class)
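Graphing Cumulative Frequencies:
The Ogive (Cumulative % Polygon)
[Figure: "Ogive: Daily High Temperature"; horizontal axis: class boundaries (not midpoints), 10 to 60; vertical axis: cumulative percentage]

Class                  Lower class boundary   Cumulative Percentage
Less than 10                    0                       0
10 but less than 20            10                      15
20 but less than 30            20                      45
30 but less than 40            30                      70
40 but less than 50            40                      90
50 but less than 60            50                     100

Note: each cumulative percentage is the share of observations below the
upper boundary of its class, so the ogive plots 15% at 20, 45% at 30,
and so on up to 100% at 60.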
Side-by-Side Chart Example
Sales by quarter for three sales territories:

         1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East      20.4      27.4      59        20.4
West      30.6      38.6      34.6      31.6
North     45.9      46.9      45        43.9

[Figure: side-by-side bar chart of the sales above; bars for East, West, and North grouped by quarter; vertical axis from 0 to 60]