
Statistics for Business and Economics
Lecture 1
Prof. Hilmar Castro
Fall 2020
Prof. H. Castro - QLU Panama 1
What is Business Statistics?
Business Statistics is a collection of procedures and techniques that are used to convert data into meaningful information in a business environment.


STATISTICS
The branch of mathematics that transforms data into useful information for decision makers.

DATA → (Descriptive Statistics) → INFORMATION → (Inferential Statistics) → KNOWLEDGE

PROBABILISTIC MODEL
The information is processed and used to fit models to reality.
In Today's World

We are constantly bombarded with data that “summarize reality” and with conclusions drawn from those summaries. For example:
• Political polls: After asking 1,000 people, we predict that Trump will win the U.S. elections.
• Economic predictions: “Global growth is projected at –4.9 percent in 2020, 1.9 percentage points below the April 2020 World Economic Outlook (WEO) forecast. The COVID-19 pandemic has had a more negative impact on activity in the first half of 2020 than anticipated, and the recovery is projected to be more gradual than previously forecast.”
• Medical news: The virus, known as SARS-CoV-2, has resulted in more than 27 million infections and 883,000 deaths.
How can we make sense of all this data?
How do we tell valid claims from flawed ones?


What is Statistics?
Statistics is a tool for creating new understanding from a set of numbers.
A way to get information from data.

DATA → (Descriptive Statistics) → INFORMATION → (Inferential Statistics) → KNOWLEDGE

Data: facts, especially numerical facts, collected together for reference or information.
Information: knowledge communicated concerning some particular fact.
What is Statistics?
A business student is anxious about their statistics course, since they’ve heard the course is difficult. The professor provides last term’s final exam marks to the student.
What can be discerned from this list of numbers?

DATA → (Descriptive Statistics) → INFORMATION → (Inferential Statistics) → KNOWLEDGE

Data: list of last term’s marks: 95, 89, 70, 65, 78, 57, ...
Information: new information about the statistics class.
• Class mark average
• Proportion of the class receiving an A
• Most frequent mark
Knowledge: inference based on that information.
• Almost everybody passed Econ 205
• The average grade was B+
• Be happy: do NOT withdraw
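The three descriptive summaries named on this slide can be sketched in a few lines of Python. This is only an illustration: it uses the six marks visible on the slide (the full list is elided there), and the cut-off for an A (90 or above) is an assumption, not something the slide states.

```python
from collections import Counter
from statistics import mean

# Only the six marks visible on the slide; the rest of the list is elided there.
marks = [95, 89, 70, 65, 78, 57]

# Assumption for illustration: a mark of 90 or above counts as an A.
A_THRESHOLD = 90

# Class mark average.
average_mark = mean(marks)

# Proportion of the class receiving an A.
share_of_a = sum(m >= A_THRESHOLD for m in marks) / len(marks)

# Most frequent mark(s): every mark tied for the highest count.
counts = Counter(marks)
highest_count = max(counts.values())
most_frequent = sorted(m for m, c in counts.items() if c == highest_count)

print(f"Class mark average: {average_mark:.2f}")
print(f"Proportion receiving an A: {share_of_a:.2%}")
print(f"Most frequent mark(s): {most_frequent}")
```

In this six-mark excerpt every mark appears exactly once, so all of them tie as "most frequent"; with the full class list a single mode would be more likely.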
Why Study Statistics?
Decision Makers Use Statistics To:

* Present and describe business data and information properly

* Draw conclusions about large populations, using information collected from smaller samples (political surveys, accounting audits, quality processes, etc.)

* Make reliable forecasts about a business activity (inflation, sales, economic growth, etc.)

* Improve business processes (quality assurance, etc.)


Key Statistical Concepts
DATA
Facts and figures collected about a phenomenon or individual, then stored, summarized, and analyzed for interpretation and decision making.
ELEMENT
The phenomenon or individual of interest.
VARIABLE
A characteristic associated with an element.
Ex: sales, percentage growth, earnings, weight, statistics class grades
OBSERVATIONS
The different values taken by the variables.
NOTE: The values of a variable have no meaning in a statistical context unless the way they were measured or collected has been described in detail.
Data Types and Measurement Levels
Quantitative Data
Measurements whose values are inherently numerical.

Qualitative Data
Data whose measurement scale is inherently categorical.

Time-Series Data
A set of consecutive data values
observed at successive points in
time.

Cross-Sectional Data
A set of data values observed at a
fixed point in time.



Data Types and Measurement Levels

Data
• Categorical (Qualitative)
  - Ordinal: data elements can be rank-ordered on the basis of some relationship among them, with the assigned values indicating this order.
    Ex: grades at QLU (A, A-, B, C, D, F)
  - Nominal: codes are assigned to categories.
    Ex: marital status, political party, eye color
• Interval / Ratio (Quantitative)
  - Discrete
    Ex: number of children, defective items, SAT scores
  - Continuous
    Ex: weight, voltage, income


Example 1
1-55 The following information can be found in the Murphy Oil Corporation Annual Report to
Shareholders. For each variable, indicate the level of data measurement.
a. List of Principal Offices (e.g., El Dorado, Calgary, Houston) → Qualitative, Nominal
b. Income (in millions of dollars) from Continuing Operations → Quantitative, Continuous
c. List of Principal Subsidiaries (e.g., Murphy Oil USA, Inc., Murphy Exploration & Production Company) → Qualitative, Nominal
d. Number of branded retail outlets → Quantitative, Discrete
e. Petroleum products sold, in barrels per day → Quantitative, Discrete
f. Major Exploration and Production Areas (e.g., Malaysia, Congo, Ecuador) → Qualitative, Nominal
g. Capital Expenditures measured in millions of dollars → Quantitative, Continuous


Example 2
1-58 Recently the manager of the call center for a large Internet bank asked his
staff to collect data on a random sample of the bank’s customers. Data on the
following variables were collected and placed in a file called Bank Call Center :

A small portion of the data is as follows:



Example 2 (cont)
[Data table not recovered. Column annotations on the slide: Qualitative, Nominal (three columns); Quantitative, Continuous (two columns).]

a. Would you classify these data as time-series or cross-sectional? Explain.


b. Which of the variables are quantitative and which are qualitative?
c. For each of the six variables, indicate the level of data measurement.



Statistical Inference
Procedures that allow a decision maker to reach a conclusion about a set of data
based on a subset of that data.
POPULATION
All the items or individuals about which you want to draw a conclusion. Usually
very large or infinite.
• All voters in a country
• All sales receipts in December

SAMPLE
A portion of a population selected for analysis. Potentially very large, but smaller than the population.
• 1,000 voters selected at random for interview
• Every 25th receipt selected for audit from December’s sales
Key Statistical Concepts

Census vs. Survey

If you could interview the entire population of Panama’s voters (a census), you would know who will be Panama’s next president.
If you interview only a sample of voters (a survey), you can predict who it will be.
WHY SAMPLE:
• It takes less time and costs less
• Sufficiently accurate statistical results can be obtained from samples
Key Statistical Concepts
PARAMETER
A numerical measure that describes a characteristic of a population. Usually its value cannot be known.
STATISTIC
A numerical measure that describes a characteristic of a sample. Its value is known.

Parameter: average age of voters in Panama
Statistic: average age of interviewed voters in Panama
Parameter vs Statistic

Population: the voting population of Panama (electoral roll), 2,713,698 voters.
814,109 votes for Cortizo (30% of voters = parameter)

Sample: a sample of 5,000 voters is selected, and the percentage of votes in favor of Cortizo is counted from the selected sample.
1,500 votes for Cortizo = 30% (sample proportion = statistic)

The sample proportion is used to estimate the population proportion.
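The arithmetic behind the two 30% figures can be checked directly. The short Python sketch below uses only the numbers given on this slide; the variable names are chosen here for illustration.

```python
# Figures taken from the slide.
population_size = 2_713_698   # voters on the electoral roll
population_votes = 814_109    # votes for Cortizo in the whole population
sample_size = 5_000           # voters selected for the sample
sample_votes = 1_500          # votes for Cortizo within the sample

# Parameter: a proportion computed over the whole population.
parameter = population_votes / population_size

# Statistic: the same proportion computed over the sample only.
statistic = sample_votes / sample_size

print(f"Parameter (population proportion): {parameter:.4f}")
print(f"Statistic (sample proportion):     {statistic:.4f}")
```

In practice only the statistic can be computed, because the population count is not available before the election; the parameter appears here only because the slide supplies it.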


Descriptive Statistics
Methods of organizing, summarizing, and presenting data in a convenient and informative way.

• Collecting: e.g., surveys, observation, experiments
• Presenting: graphical techniques, e.g., tables and graphs
• Describing data: numerical techniques, e.g., the sample mean

$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$

Population parameter: average mark 7.9 (B+)
Sample statistic: average mark 8.33 (B+)
Inferential Statistics
Making statements about a population by examining sample results.
• Estimating: inference from the sample statistic (known) to the population parameter (unknown, but can be estimated from sample evidence)
• Hypothesis testing: use sample information to test the validity of the estimates
Inferential Statistics
• Estimating:
Ex: Estimate the population average mark using the sample average mark.
Population parameter: average mark 7.9 (B+)
Sample statistic: average mark 8.33 (B+)
The estimate of last term's average mark is 8.33.

• Hypothesis testing:
Use sample data to test the claim that the population average mark is B+.
Based on that, we can encourage the new student NOT to be afraid.
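To see how a sample mean acts as an estimate of a population mean, here is a small simulation sketch in Python. The population is invented (500 hypothetical marks on a 0 to 10 scale, not the course's real data), and the sample size of 50 and the seed are arbitrary choices made for illustration.

```python
import random
from statistics import mean

random.seed(2020)  # fixed seed so the sketch is reproducible

# Hypothetical population: 500 marks between 5.0 and 10.0 (not real course data).
population = [round(random.uniform(5.0, 10.0), 1) for _ in range(500)]

# The parameter is normally unknown; it is computable here only because
# the population is simulated.
parameter = mean(population)

# Draw a simple random sample and use its mean as the estimate.
sample = random.sample(population, 50)
estimate = mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {estimate:.2f}")
print(f"Estimation error:            {estimate - parameter:+.2f}")
```

The estimate will usually be close to the parameter but not equal to it, which is the same situation as the 8.33 versus 7.9 example on the slide.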


Examples

According to an article in the Idaho Statesman, a poll taken the day before elections in Germany showed Chancellor Gerhard Schroeder behind his challenger, Angela Merkel, by 6 to 8 percentage points. Is this a statistic or a parameter? Explain.

According to the U.S. Bureau of Labor Statistics, the annual percentage increase in U.S. college tuition and fees in 2005 was 6.0%, in 2009 it was 4.0%, and in 2014 it was 9.5%. Are these percentages statistics or parameters? Explain.


Procedures for Collecting Data
WHERE TO GET THE DATA?
• Existing sources: primary, secondary
• Experiments
• Surveys
Procedures for Collecting Data
HOW TO COLLECT THE DATA?
Data acquisition mechanism: sampling, design of experiments, etc. Error control.

HOW TO MAKE INFERENCES OR TEST HYPOTHESES?
--> ETHICAL GUIDELINES


Data Collection Methods
Primary Sources: the data collector is the one using the data for analysis.
• Experiments (experimental design)
• Telephone or written questionnaires and surveys (survey design)
• Direct observation and personal interviews (objectivity)

Secondary Sources: the person performing the data analysis is not the data collector.
• Census data
• Data printed or published on the internet (data source)
Survey Design Steps
• Define the issue: What are the purpose and objectives of the survey?
• Define the population of interest
• Develop survey questions
  ✓ Make questions clear and unambiguous
  ✓ Use universally accepted definitions
  ✓ Limit the number of questions
• Pre-test the survey (pilot)
  ✓ Pilot test with a small group of participants
  ✓ Assess clarity and length
• Determine the sample size and sampling method
• Select the sample and administer the survey


Be Careful With …
• Define the issue
• Interviewer bias
• Nonresponse bias
• Selection bias
• Observer bias
• Measurement errors
• Internal validity: factors intrinsic to the experiment
• External validity: Can the experiment be reproduced outside the original environment?


Data Collection Methods - Examples
1-21. Suppose a survey is conducted using a telephone survey method. The survey is conducted from 9 A.M. to 11 A.M. on Tuesday. Indicate what potential problems the data collectors might encounter.

1-22. For each of the following situations, indicate what type of data collection method you would recommend and discuss why you have made that recommendation:
a. collecting data on the percentage of bike riders who wear helmets
b. collecting data on the price of regular unleaded gasoline at gas stations in your state
c. collecting data on customer satisfaction with the service provided by a major U.S. airline

1-27. Assume you have used an online service such as Orbitz or Travelocity to make an airline reservation. The following day you receive an e-mail containing a questionnaire asking you to rate the quality of the experience. Discuss both the advantages and disadvantages of using this form of questionnaire delivery.


Sampling Techniques

Inference: sample statistic (known) → population parameter (unknown, but can be estimated from sample evidence)

A sample is the portion of a population that has been selected for analysis.

Statistical sampling focuses on collecting a small, representative group of the larger population.


Sampling Techniques

The sampling process begins by defining the frame. The frame is a listing of the items that make up the population; it works like a list of the population. Samples are drawn from frames.

Population → Sampling frame → Sample


Simple Random Sampling

• Used when nothing is known in advance about how the characteristic under study is distributed in the population, and biases are to be avoided
• It is expensive
• Requires a very broad sampling frame
• Uses random number generation algorithms
• Difficult to implement
It is like throwing dice.
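A simple random sample can be sketched with Python's standard random number generator. The frame below (receipt numbers 1 to 1,000) and the function name are hypothetical, chosen only to illustrate the idea that every item in the frame has the same chance of selection.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n items from the frame without replacement, each equally likely."""
    rng = random.Random(seed)   # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: receipt numbers 1..1000.
frame = list(range(1, 1001))

sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Note that the whole frame must exist as a list before any draw can be made, which is one reason the slide calls this method expensive.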


Stratified Random Sampling
• Divide the population into subgroups (called strata), usually with common characteristics
• Select a simple random sample from each subgroup
• Combine the samples from the subgroups into one
• Difficult to implement
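The three steps above can be sketched as follows in Python. The frame of taxpayers and tax brackets is invented for illustration, and taking the same fraction from every stratum (proportional allocation) is an assumption made here; the slide does not say how many items to take from each stratum.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, fraction, seed=None):
    """Group items into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)

    # Step 1: divide the population into strata.
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)

    # Steps 2 and 3: sample within each stratum and combine the results.
    combined = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # assumed allocation rule
        combined.extend(rng.sample(members, k))
    return combined

# Hypothetical frame: (taxpayer id, tax bracket) pairs.
frame = (
    [(i, "low") for i in range(60)]
    + [(i, "mid") for i in range(60, 90)]
    + [(i, "high") for i in range(90, 100)]
)

sample = stratified_sample(frame, stratum_of=lambda t: t[1], fraction=0.1, seed=1)
per_stratum = {b: sum(1 for _, s in sample if s == b) for b in ("low", "mid", "high")}
print(per_stratum)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the same size would not guarantee for the small "high" bracket.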


Systematic Random Sampling
• Decide on a sample size n
• Divide the population frame of N individuals into n groups of k individuals each: k = N/n
• Randomly select one individual from the 1st group
• Then select every kth individual after it: the individual in the same position in the 2nd group, the 3rd group, and so on up to the nth group

Ex: N = 64, n = 8, k = 8
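The slide's example (N = 64, n = 8, so k = N/n = 8) can be sketched in Python as below. The function name and the numbering of individuals from 1 to 64 are assumptions for illustration, and the sketch assumes N is an exact multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first group, then take every kth item."""
    rng = random.Random(seed)
    k = len(frame) // n          # group size, k = N / n
    start = rng.randrange(k)     # random position inside the 1st group
    return [frame[start + i * k] for i in range(n)]

# N = 64 individuals, numbered 1..64 for illustration.
frame = list(range(1, 65))

sample = systematic_sample(frame, n=8, seed=1)
print(sample)
```

Only one random number is needed, which makes the method cheap, but a frame with a periodic pattern of length k would bias the result.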


Cluster Sampling

• Divide the population into several “clusters” (groups), each “representative” of the population (mini-populations)
• Select a simple random sample of clusters
• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique

✓ Differences between cluster and stratified sampling
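A minimal Python sketch of cluster sampling follows. The clusters (dealerships grouped by city) are hypothetical, and the sketch takes the first option on the slide: it keeps every item in each selected cluster.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep all of their items."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: dealerships grouped by city.
clusters = {
    "City A": ["A1", "A2", "A3"],
    "City B": ["B1", "B2"],
    "City C": ["C1", "C2", "C3", "C4"],
    "City D": ["D1", "D2"],
}

selected = cluster_sample(clusters, n_clusters=2, seed=1)
items = [item for members in selected.values() for item in members]
print(sorted(selected), items)
```

The contrast with stratified sampling is visible in the code: here the random draw is over groups and most groups are left out entirely, whereas a stratified sample draws items from inside every group.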


Sampling Techniques
1-43. Give the name of the kind of sampling that was most likely used in each
of the following cases:
a) a Wall Street Journal poll of 2,000 people to determine the president’s
approval rating
b) a poll taken of each of the General Motors (GM) dealerships in Ohio in
December 2008 to determine an estimate of the average number of 2008-
model Chevrolets not yet sold by GM dealerships in the United States
c) a quality assurance procedure within a Frito-Lay manufacturing plant that
tests every 1,000th bag of Fritos Corn Chips produced to make sure the bag
is sealed properly.
d) a sampling technique in which a random sample from each of the tax
brackets is obtained by the Internal Revenue Service to audit tax returns
Types of Data – Types of Variable
INTERVAL DATA

• Real numbers, e.g., heights, weights, prices, etc.
• Also referred to as quantitative or numerical.
• Arithmetic operations can be performed on interval data, so it is meaningful to talk about 2 × Height, or Price + $1, and so on.

➢ Discrete data: gaps exist between possible values, e.g., number of children in a family
➢ Continuous data: no gaps exist between possible values, e.g., annual income of a family
Types of Data – Types of Variable
ORDINAL DATA
• The values of ordinal data are categories that have an order, a ranking. It is very common to assign a number to each category. E.g., course rating system:
poor = 1, fair = 2, good = 3, very good = 4, excellent = 5

• While it’s still not meaningful to do arithmetic on this data (e.g., does 2 × fair = very good?!), we can say things like:
excellent > poor or fair < very good
That is, order is maintained no matter what numeric values are assigned to each category.
Only counts of the number of items in a category are allowed.
Types of Data – Types of Variable
NOMINAL DATA

The values of nominal data are categories too, but they don’t have an order.
E.g., responses to questions about marital status:
Single = 1, Married = 2, Divorced = 3, Widowed = 4

Because the numbers are arbitrary, arithmetic operations don’t make any sense (e.g., does Widowed ÷ 2 = Married?!).
Only counts of the number of items in a category are allowed.
More examples: gender, religious preference, etc.
Types of Data - Types of Variable

Can you do math?
• Yes → Numerical data. Can you count?
    - Yes → Discrete
    - No → Continuous
• No → Categorical data. Ordered?
    - Yes → Ordinal data
    - No → Nominal data
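The decision flow on this slide can be written as a small Python function. The function and parameter names are hypothetical; each argument is simply the yes/no answer to one of the slide's questions, and the example variables are the ones used elsewhere in this lecture.

```python
def classify_variable(can_do_math, can_count=False, is_ordered=False):
    """Follow the slide's questions to name the type of a variable."""
    if can_do_math:
        # Numerical data: the next question is "Can you count?"
        return "discrete" if can_count else "continuous"
    # Categorical data: the next question is "Ordered?"
    return "ordinal" if is_ordered else "nominal"

# Examples taken from the lecture's own slides.
print(classify_variable(can_do_math=True, can_count=True))     # number of children
print(classify_variable(can_do_math=True, can_count=False))    # annual income
print(classify_variable(can_do_math=False, is_ordered=True))   # course rating
print(classify_variable(can_do_math=False, is_ordered=False))  # marital status
```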
Types Of Data - Time
The data collected may also be classified as:
• Time-series data
• Cross-sectional data


Types Of Data - Variable

1-56. You have collected the following information on 15 different real estate investment trusts (REITs). Identify whether the data are cross-sectional or time-series.
a) income distribution by region in 2008
b) per share (diluted) funds from operations (FFO) for the
years 2002 to 2008
c) number of properties owned as of December 31, 2008
d) the overall percentage of leased space for the 119
properties in service as of December 31, 2008
e) dividends per share for the years 2002–2008



Data Measurement Levels
The kind of values that a variable can take also gives us an idea of the complexity of the analysis that can be done.


Nominal Level of Measurement

Categorical Variable           Categories
Personal computer ownership    Yes / No
Type of stocks owned           Growth / Value / Other
Internet provider              Microsoft Network / AOL

• Values are arbitrary numbers that represent categories.
• Only calculations based on the frequencies of occurrence are valid.
• The data have no order.
• Nominal data are very common.
• We can generate nominal data by assigning numbers to the categories.


Ordinal Level

Categorical Variable              Ordered Categories
Student class designation         Freshman, Sophomore, Junior, Senior
Product satisfaction              Satisfied, Neutral, Unsatisfied
Faculty rank                      Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings    AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student grades                    A, B, C, D, F

An ordinal scale classifies data into distinct categories in which ranking is implied.
• Values must represent the ranked order of the data.
• Calculations based on an ordering process are valid.
• Data may be treated as nominal but not as interval.


Interval Level
An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point. (A true zero means "nothing" in whatever scale we use.)

True zero point: no money = $0 = 0 pesos.
If you don’t have money, you have 0 money in whatever currency you can think of.

No true zero point: 0 ºF ≠ 0 ºC (0 ºF = -17.78 ºC). QLU grades vs. Universidad de Panamá grades?

With an interval scale we can do sophisticated analysis, but comparisons may be difficult because the scale does not preserve ratios:

Yesterday 35 ºF (1.67 ºC), today 70 ºF (21.11 ºC)
Is today twice as warm as yesterday?
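The temperature question can be checked numerically. The Python sketch below converts the slide's two readings with the standard Fahrenheit-to-Celsius formula and compares the ratios and differences on both scales; the variable names are chosen here for illustration.

```python
def fahrenheit_to_celsius(f):
    """Standard conversion: C = (F - 32) * 5 / 9."""
    return (f - 32) * 5 / 9

yesterday_f, today_f = 35, 70
yesterday_c = fahrenheit_to_celsius(yesterday_f)
today_c = fahrenheit_to_celsius(today_f)

# Ratios depend on the scale, so "twice as warm" is not meaningful.
ratio_f = today_f / yesterday_f
ratio_c = today_c / yesterday_c

# Differences are meaningful: they convert by a fixed factor of 5/9.
diff_f = today_f - yesterday_f
diff_c = today_c - yesterday_c

print(f"Ratio in Fahrenheit: {ratio_f:.2f}")
print(f"Ratio in Celsius:    {ratio_c:.2f}")
print(f"Difference: {diff_f} F = {diff_c:.2f} C")
print(f"0 F = {fahrenheit_to_celsius(0):.2f} C")
```

The ratio is 2 in Fahrenheit but about 12.67 in Celsius, so the answer to the slide's question is no: the same two days give different ratios on different scales, while the difference between them is consistent.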


Ratio Level
A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point (ratios are preserved).
