
Basic Statistics: Making Sense of our Data

Session 1

Caring PeopleBuilding Businesses. Building Careers.


Global Reporting & Analytics
By the end of the session, analysts should be able to:

explain the foundational principles and concepts that are common to all
statistical methods,
identify appropriate uses of the statistical methods, including their strengths
and limitations,
and apply each of the concepts to practical situations in order to draw
appropriate conclusions.

Company Confidential
Agenda
I. Introduction: What is Statistics?
A. Terms and Definitions
a. Types of Data
b. Population and Sample
II. Distributions & Summary Statistics
A. Describing the distribution:
a. Shape
1. Symmetry
2. Skewness
3. Modality
b. Center
1. Average/Mean
2. Median
3. Mode
c. Spread
1. Percentiles, Deciles and Quartiles
2. Range
3. Interquartile Range
4. Standard Deviation

What is Statistics?

Statistics is the science of collecting, describing and interpreting data.
Source: Elementary Statistics by Robert Russell Johnson

What types of data do you encounter in your day-to-day work tasks?

Data Types
For the purposes of our training, we will consider three types of data:

Nominal variables (e.g., Role of Evaluator, Agent, TM, etc.) represent named
categories. They are also called categorical variables.

Ordinal variables represent rank-ordered categories. Some examples are the
scales of your survey questionnaire and tenure (30, 60, 90, ++).

Continuous variables are numeric in the usual sense. They allow us to
perform arithmetic operations such as sums and averages (e.g., SPE
Score, CSAT Average).

Data Type Conversion
The tenure (in months) of Makati Analysts:
44, 90, 80, 135, 21, 53, 29, 128, 47, 11, 15, 49, 66, 49, 21, 110, 23,
50, 48, 50, 47, 45

The data can be further categorized into:

Ordinal:
less than 6 months
6 months to less than a year
1 year to less than 2 years
2 years and more

Nominal:
Not tenured
Tenured
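
The conversion from a continuous tenure value to the ordinal bands can be sketched in Python as follows. This is a minimal illustration, not part of the original deck: the function name `tenure_band` is hypothetical, and the band boundaries (6, 12 and 24 months) are read directly from the band labels above. The slide does not say where "Not tenured" ends and "Tenured" begins, so the nominal conversion is left out.

```python
from collections import Counter

# Tenure in months of the Makati analysts, copied from the slide.
tenure_months = [44, 90, 80, 135, 21, 53, 29, 128, 47, 11, 15,
                 49, 66, 49, 21, 110, 23, 50, 48, 50, 47, 45]

# Ordinal bands in rank order, as listed on the slide.
BANDS = [
    "less than 6 months",
    "6 months to less than a year",
    "1 year to less than 2 years",
    "2 years and more",
]


def tenure_band(months):
    """Map a tenure in months to one of the four ordinal bands (hypothetical helper)."""
    if months < 6:
        return BANDS[0]
    if months < 12:
        return BANDS[1]
    if months < 24:
        return BANDS[2]
    return BANDS[3]


# Count how many analysts fall into each band.
band_counts = Counter(tenure_band(m) for m in tenure_months)

for band in BANDS:
    print(f"{band}: {band_counts[band]}")
```

Once the values are banded, sums and averages no longer make sense, but the bands can still be counted and ranked.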

Where your data comes from
There are two possible sources of data.

A population is a well-defined collection of objects. One of the goals of
statistics is to learn about a population.
Ex. Citizens of the Philippines, All Stream Employees

A sample, on the other hand, is a subset of the population.
Ex. All Stream employees in the Philippines
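
As a rough illustration of the population and sample relationship, the sketch below treats the 22 analyst tenures listed earlier in this deck as if they were a whole population and draws a random subset from it. That framing, the sample size of 8 and the seed value are assumptions made only for this example.

```python
import random
from statistics import mean

# Assumption for illustration: the 22 tenures (in months) are the full population.
population = [44, 90, 80, 135, 21, 53, 29, 128, 47, 11, 15,
              49, 66, 49, 21, 110, 23, 50, 48, 50, 47, 45]

# A fixed seed keeps the example repeatable; the value itself is arbitrary.
rng = random.Random(1)

# A sample is a subset of the population, here drawn without replacement.
sample = rng.sample(population, 8)

population_mean = mean(population)
sample_mean = mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

The sample mean will usually differ somewhat from the population mean, which is why the choice of sample matters when drawing conclusions about a population.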

Data Distribution

The pattern of how the data are distributed across the measurement scale
is called the distribution.

Describing a Distribution
We describe a distribution by its:
1. Shape: usually described by
   Symmetry
   Modality
   Outliers
2. Center: the measure of the middle or expected value of the data set
   Mean
   Median
   Mode
3. Spread: also called variation, denotes variability in a distribution
   Percentile, Decile, Quartile
   Range
   Interquartile Range
   Standard Deviation and Variance

Shape: Symmetry

Symmetric
The left and right sides of the center are mirror images of each other.

Skewed
Skewed to the right (positively skewed): long tail to the right
Skewed to the left (negatively skewed): long tail to the left

Shape: Modality

Modality
Refers to the number of peaks in the distribution of a data set.
The mode is the most frequent value in a data set.

Outliers

Outliers
Observations that deviate markedly from the rest of the data
Could result from special causes; may indicate bad data
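
One common way to flag candidate outliers is the 1.5 x IQR rule, sketched below on the CSAT scores listed in this deck. The slide itself does not prescribe a rule, so the 1.5 multiplier and the use of Python's `statistics.quantiles` (with its default quartile method) are assumptions of this example, not the deck's method.

```python
from statistics import quantiles

# CSAT scores as listed in this deck.
csat_scores = [80, 70, 83, 68, 100, 61, 67, 86, 89, 75, 75, 79, 40, 77, 53,
               74, 86, 71, 60, 62, 64, 80, 73, 68, 69, 83, 72, 79, 71, 82,
               86, 100, 100, 80, 88]

# Quartiles split the sorted data into four equal parts.
q1, _, q3 = quantiles(csat_scores, n=4)
iqr = q3 - q1  # interquartile range

# Assumed convention: anything beyond 1.5 * IQR from the quartiles is flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in csat_scores if x < lower_fence or x > upper_fence]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"Fences: {lower_fence} to {upper_fence}")
print(f"Flagged values: {outliers}")
```

A flagged value is only a candidate: it still needs to be checked for a special cause or a data-entry problem before it is excluded.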

Shape of the CSAT Data

CSAT Scores:
80, 70, 83, 68, 100, 61, 67, 86, 89, 75, 75, 79, 40, 77, 53, 74, 86,
71, 60, 62, 64, 80, 73, 68, 69, 83, 72, 79, 71, 82, 86, 100, 100,
80, 88
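
To get a feel for the shape of these scores, a quick text histogram can help. The sketch below groups the scores into bins of width 10; the bin width is an arbitrary choice made for this illustration.

```python
from collections import Counter

# CSAT scores copied from the slide.
csat_scores = [80, 70, 83, 68, 100, 61, 67, 86, 89, 75, 75, 79, 40, 77, 53,
               74, 86, 71, 60, 62, 64, 80, 73, 68, 69, 83, 72, 79, 71, 82,
               86, 100, 100, 80, 88]

# Assign each score to the lower edge of its bin (40, 50, ..., 100).
bin_counts = Counter((score // 10) * 10 for score in csat_scores)

# Print one row of asterisks per bin, including empty bins.
for start in range(40, 101, 10):
    print(f"{start:>3} | {'*' * bin_counts[start]}")
```

Reading the rows from top to bottom shows where the scores pile up, where the tails are, and whether any values sit apart from the rest.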

Measures of Central Tendency
Mean, Median, Mode

Mean
When used without specification, mean refers to the arithmetic
average of a data set.

The mean is where the distribution would balance.
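
As a small worked example, the arithmetic mean of the CSAT scores from this deck can be computed by hand in Python. The check at the end illustrates the "balance point" idea: the deviations above and below the mean cancel out.

```python
# CSAT scores as listed in this deck.
csat_scores = [80, 70, 83, 68, 100, 61, 67, 86, 89, 75, 75, 79, 40, 77, 53,
               74, 86, 71, 60, 62, 64, 80, 73, 68, 69, 83, 72, 79, 71, 82,
               86, 100, 100, 80, 88]

# Arithmetic mean: the sum of the values divided by how many there are.
mean_score = sum(csat_scores) / len(csat_scores)

# Balance point: deviations from the mean add up to zero
# (up to floating-point rounding).
deviations_sum = sum(x - mean_score for x in csat_scores)

print(f"Mean CSAT score: {mean_score:.2f}")
print(f"Sum of deviations from the mean: {deviations_sum:.10f}")
```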

Measures of Central Tendency
Mean, Median, Mode

Median
The median is a different kind of average. It is the middle value of the
data set once the values are sorted: at least half of the values are less
than or equal to it, and at least half are greater than or equal to it.
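
A minimal sketch of how the median can be found: sort the values and take the middle one. Averaging the two middle values when the count is even is the usual convention and is assumed here; the slide does not spell it out.

```python
# CSAT scores as listed in this deck.
csat_scores = [80, 70, 83, 68, 100, 61, 67, 86, 89, 75, 75, 79, 40, 77, 53,
               74, 86, 71, 60, 62, 64, 80, 73, 68, 69, 83, 72, 79, 71, 82,
               86, 100, 100, 80, 88]


def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values (assumed convention).
    return (ordered[mid - 1] + ordered[mid]) / 2


print(f"Median CSAT score: {median(csat_scores)}")
```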

Measures of Central Tendency
Mean, Median, Mode

Mode
The mode is the most frequently occurring value in a data set.
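
A data set can have more than one mode. As an illustration, the sketch below counts how often each CSAT score from this deck occurs and uses `statistics.multimode` from the Python standard library to list every value tied for the highest count.

```python
from collections import Counter
from statistics import multimode

# CSAT scores as listed in this deck.
csat_scores = [80, 70, 83, 68, 100, 61, 67, 86, 89, 75, 75, 79, 40, 77, 53,
               74, 86, 71, 60, 62, 64, 80, 73, 68, 69, 83, 72, 79, 71, 82,
               86, 100, 100, 80, 88]

# Frequency of each distinct score.
frequencies = Counter(csat_scores)

# All values that share the highest frequency, sorted for readability.
modes = sorted(multimode(csat_scores))
top_count = frequencies[modes[0]]

print(f"Modes: {modes} (each occurs {top_count} times)")
```

When several values tie like this, reporting a single mode would hide part of the picture, so it is safer to list all of them.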

Measures of Central Tendency
Mean, Median, Mode

Some guiding principles:
If the mean > median, we have a positively-skewed distribution
If the mean < median, we have a negatively-skewed distribution
If the mean ≈ median, we have a symmetrical distribution

The mean is the preferred measure when a distribution is symmetrical
and does not have outliers.

The median would be the preferred measure of central location in other
instances because it is far less affected by outliers.
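
The guiding principles above can be sketched as a small helper. The function name `skew_direction` and its `tolerance` parameter are hypothetical, introduced only for this illustration, and the three short lists are made-up numbers chosen to show each case. This is a rule of thumb, not a formal test of skewness.

```python
from statistics import mean, median


def skew_direction(values, tolerance=0.0):
    """Apply the mean-versus-median rule of thumb (hypothetical helper).

    `tolerance` is an assumed parameter: gaps no larger than this are
    treated as "mean approximately equal to median".
    """
    gap = mean(values) - median(values)
    if gap > tolerance:
        return "positively skewed"
    if gap < -tolerance:
        return "negatively skewed"
    return "roughly symmetrical"


# Made-up examples: one large value pulls the mean above the median.
print(skew_direction([1, 2, 3, 4, 100]))
# One small value pulls the mean below the median.
print(skew_direction([1, 97, 98, 99, 100]))
# Evenly spread values: mean and median coincide.
print(skew_direction([1, 2, 3, 4, 5]))
```

In the first two examples a single extreme value moves the mean a long way while the median barely shifts, which is why the median is preferred when outliers are present.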
