ECB-652
Note: Materials in these slides have been collected and adapted from different internet sources and books.
Statistics
• Statistics
– The science of collecting, organizing, presenting,
analysing, and interpreting data to assist in making more
effective decisions
• Statistical analysis
– Used to manipulate, summarize, and investigate data so
that useful decision-making information results
A Taxonomy of Statistics
Statistical Methods
• Descriptive statistics
– Methods of organizing, summarizing, and presenting data
in an informative way
• Inferential statistics
– The methods used to determine something about a
population on the basis of a sample
– Population – The entire set of individuals or objects of
interest or the measurements obtained from all individuals
or objects of interest
– Sample – A portion, or part, of the population of interest
Population and Sample
Inferential statistics
• Estimation
– e.g., Estimate the
population mean weight
using the sample mean
weight
• Hypothesis testing
– e.g., Test the claim that
the population mean
weight is 70 kg
Inference is the process of drawing conclusions or making decisions
about a population based on sample results
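The two branches of inference above can be sketched in a few lines. This is a minimal illustration, assuming a small hypothetical sample of weights (the values are made up) and assuming a one-sample t statistic as the test; the slides do not say which test is intended.

```python
import math

# Hypothetical sample of body weights in kg (illustrative values only).
sample_weights = [68.0, 72.5, 70.0, 65.5, 74.0, 69.0, 71.0, 66.0]
n = len(sample_weights)

# Estimation: the sample mean is the point estimate of the population mean.
sample_mean = sum(sample_weights) / n

# Sample standard deviation (divides by n - 1).
sample_var = sum((w - sample_mean) ** 2 for w in sample_weights) / (n - 1)
sample_sd = math.sqrt(sample_var)

# Hypothesis testing: t statistic for the claim that the
# population mean weight is 70 kg (assumed one-sample t test).
claimed_mean = 70.0
t_statistic = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))

print(f"sample mean = {sample_mean:.2f} kg")
print(f"t statistic = {t_statistic:.3f} with {n - 1} degrees of freedom")
```

A t statistic close to zero means the sample gives little evidence against the claimed mean; the decision itself would require comparing it with a critical value from the t distribution.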
Statistical data
The collection of data that are relevant to the problem
being studied is commonly the most difficult, expensive,
and time-consuming part of the entire research project.
Statistical data are usually obtained by counting or
measuring items.
Primary data are collected specifically for the analysis desired
Secondary data have already been compiled and are available for
statistical analysis
A variable is an item of interest that can take on many
different numerical values.
A constant has a fixed numerical value.
Data
Variables
• Qualitative
– Gender, marital status
– Brand of PC, hair color
• Quantitative
– Children in a family, students in a class
– Amount of income tax paid, weight of a student
Data in Economics
• Time Series
– Data collected on the same variable over a period of time
– e.g., GDP of India
• Cross-sectional Data
– Data collected on the same variable for more than one unit at a
particular point in time
– e.g., GDP of India, Pakistan, and the USA in 2010
• Panel Data
– Data collected on more than one unit over a period of time
– e.g., GDP of India, Pakistan, and the USA from 1990 to 2010
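The difference between the three data types is mostly one of shape. The sketch below assumes plain Python dictionaries and uses placeholder numbers, not real GDP figures; the variable names are hypothetical.

```python
# Placeholder numbers only; these are NOT real GDP figures.

# Time series: one unit (India), several periods.
time_series = {2008: 1.0, 2009: 1.1, 2010: 1.2}

# Cross-sectional data: several units, one period (2010).
cross_section = {"India": 1.2, "Pakistan": 0.3, "USA": 9.0}

# Panel data: several units AND several periods, keyed by (country, year).
panel = {
    ("India", 2009): 1.1, ("India", 2010): 1.2,
    ("Pakistan", 2009): 0.2, ("Pakistan", 2010): 0.3,
    ("USA", 2009): 8.5, ("USA", 2010): 9.0,
}

# Fixing the year in a panel gives a cross-section.
cross_2010 = {c: v for (c, year), v in panel.items() if year == 2010}

# Fixing the unit in a panel gives a time series.
india_series = {year: v for (c, year), v in panel.items() if c == "India"}

print(cross_2010)
print(india_series)
```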
Statistical Description of Data
Consider a data set of 26 children of ages 1-6 years. Then the frequency
distribution of variable ‘age’ can be tabulated as follows:
Age 1 2 3 4 5 6
Frequency 5 3 7 5 4 2
Grouped Frequency Distribution of Age:
Age Group 1-2 3-4 5-6
Frequency 8 12 6
Cumulative Frequency
Age 1 2 3 4 5 6
Frequency 5 3 7 5 4 2
Cumulative Frequency 5 8 15 20 24 26
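The three tables above can be reproduced from the raw observations. This is a small sketch using only the Python standard library; the raw ages are rebuilt from the frequency table, and the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

# Rebuild the 26 raw observations from the frequency table.
table = {1: 5, 2: 3, 3: 7, 4: 5, 5: 4, 6: 2}
ages = [age for age, count in table.items() for _ in range(count)]

# Frequency distribution: how often each age occurs.
frequency = dict(sorted(Counter(ages).items()))

# Grouped frequency distribution with class width 2 (1-2, 3-4, 5-6).
grouped = Counter()
for age in ages:
    lower = age if age % 2 == 1 else age - 1  # lower bound of the class
    grouped[f"{lower}-{lower + 1}"] += 1

# Cumulative frequency: running total of the frequencies.
cumulative = dict(zip(frequency, accumulate(frequency.values())))

print(frequency)
print(dict(grouped))
print(cumulative)
```

The last cumulative frequency always equals the number of observations, which is a quick check on the table.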
• Collect data
– e.g., Survey
• Present data
– e.g., Tables and graphs
• Summarize data
– e.g., Sample mean = $\frac{\sum X_i}{n}$
Data Presentation
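Treatment Group   Frequency   Proportion        Percent (%)
1                 15          (15/60) = 0.250   25.0
2                 25          (25/60) = 0.417   41.7
3                 20          (20/60) = 0.333   33.3
Total             60          1.00              100

[Figure: bar chart of Number of Subjects (vertical axis, 0 to 30) by Treatment Group (horizontal axis, groups 1 to 3)]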
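The proportion and percent columns of the treatment-group table follow directly from the frequencies. A minimal sketch, taking the three frequencies from the slide and using illustrative variable names:

```python
# Frequencies per treatment group, as given in the table.
frequencies = {1: 15, 2: 25, 3: 20}
total = sum(frequencies.values())

rows = []
for group, freq in frequencies.items():
    proportion = freq / total      # share of all subjects
    percent = 100 * proportion     # same share, expressed in percent
    rows.append((group, freq, round(proportion, 3), round(percent, 1)))

for group, freq, proportion, percent in rows:
    print(f"{group:>5} {freq:>9} {proportion:>10.3f} {percent:>7.1f}")
print(f"Total {total:>9}")
```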
Data Presentation –Categorical Variable
• Pie Chart: Lists the categories and presents
the percent or count of individuals who fall in
each category.
[Figure: pie chart of subjects in treatment group: group 1, 25%; group 2, 42%; group 3, 33%]

Treatment Group   Frequency   Proportion        Percent (%)
1                 15          (15/60) = 0.250   25.0
2                 25          (25/60) = 0.417   41.7
3                 20          (20/60) = 0.333   33.3
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
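As a worked instance of the mean formula, the sketch below applies it to the age data from the frequency table earlier in the slides (26 children aged 1 to 6). Multiplying each age by its frequency is just a shortcut for summing all 26 observations.

```python
# Age -> frequency, taken from the frequency table of the 26 children.
table = {1: 5, 2: 3, 3: 7, 4: 5, 5: 4, 6: 2}

n = sum(table.values())                                      # number of observations
total_age = sum(age * count for age, count in table.items())  # sum of all x_i

mean_age = total_age / n
print(f"n = {n}, sum = {total_age}, mean = {mean_age:.2f}")
```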
Methods of Center Measurement
• Equation: Y = a + bx
– b is the gradient, slope or regression coefficient
– a is the intercept of the line on the Y axis, or regression
constant
– Y is a value for the outcome
– x is a value for the predictor
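One common way to obtain a and b from data is ordinary least squares; the slides do not name the fitting method, so that choice is an assumption here, and the data points are hypothetical.

```python
# Hypothetical data: predictor x and outcome y (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products around the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates (assumed method): slope b and intercept a.
b = s_xy / s_xx
a = y_bar - b * x_bar

def predict(x_value):
    """Value of the outcome Y on the fitted line Y = a + b*x."""
    return a + b * x_value

print(f"Y = {a:.2f} + {b:.2f} x")
print(f"predicted Y at x = 6: {predict(6.0):.2f}")
```

The fitted line always passes through the point of means, so `predict(x_bar)` returns `y_bar`.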
Covariance Vs Correlation
• Covariance is a measure of the extent to which two random variables
change in tandem. Correlation is a measure of how strongly two random
variables are related.
• Covariance is an unscaled measure of joint variation, whereas
correlation is the scaled form of covariance.
• The value of correlation lies between -1 and +1, whereas the value of
covariance lies between -∞ and +∞.
• Covariance is affected by a change in scale: if all values of one
variable are multiplied by a constant and all values of the other variable
are multiplied by the same or a different constant, the covariance
changes. Correlation, by contrast, is not influenced by a change in
scale (for positive constants).
• Correlation is dimensionless, i.e. it is a unit-free measure of the
relationship between variables. Covariance, by contrast, is expressed in
the product of the units of the two variables.
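The effect of scale can be checked numerically. The sketch below assumes sample (n - 1) versions of covariance and correlation and uses hypothetical height and weight values; converting heights from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import math

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance scaled by the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical measurements (illustrative values only).
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52.0, 58.0, 69.0, 75.0, 86.0]

# Change of scale: metres -> centimetres (positive constant 100).
heights_cm = [100 * h for h in heights_m]

cov_m = covariance(heights_m, weights_kg)    # units: m * kg
cov_cm = covariance(heights_cm, weights_kg)  # units: cm * kg
corr_m = correlation(heights_m, weights_kg)  # unit-free
corr_cm = correlation(heights_cm, weights_kg)

print(f"covariance:  {cov_m:.3f} (m*kg) vs {cov_cm:.3f} (cm*kg)")
print(f"correlation: {corr_m:.4f} vs {corr_cm:.4f}")
```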
Regression and Correlation
Basis for Comparison: Correlation vs. Regression
• Meaning
– Correlation: Correlation is a statistical measure which determines
co-relationship or association of two variables.
– Regression: Regression describes how an independent variable is
numerically related to the dependent variable.
• Usage
– Correlation: To represent linear relationship between two variables.
– Regression: To fit a best line and estimate one variable on the basis
of another variable.
• Dependent and Independent variables
– Correlation: No difference
– Regression: Both variables are different.
• Indicates
– Correlation: Correlation coefficient indicates the extent to which
two variables move together.
– Regression: Regression indicates the impact of a unit change in the
known variable (x) on the estimated variable (y).
• Objective
– Correlation: To find a numerical value expressing the relationship
between variables.
– Regression: To estimate values of random variable on the basis of the
values of fixed variable
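The point about dependent and independent variables can be illustrated numerically. In this sketch (hypothetical data, least-squares slopes assumed), swapping x and y leaves the correlation coefficient unchanged but gives a different regression slope.

```python
import math

# Hypothetical paired observations (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 7.0, 8.0, 12.0]

def centered_sums(us, vs):
    """Return (S_uu, S_vv, S_uv): sums of squares and cross-products about the means."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    s_uu = sum((u - mu) ** 2 for u in us)
    s_vv = sum((v - mv) ** 2 for v in vs)
    s_uv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    return s_uu, s_vv, s_uv

s_xx, s_yy, s_xy = centered_sums(x, y)

# Correlation treats both variables alike: r(x, y) equals r(y, x).
r_xy = s_xy / math.sqrt(s_xx * s_yy)
s_yy2, s_xx2, s_yx = centered_sums(y, x)
r_yx = s_yx / math.sqrt(s_yy2 * s_xx2)

# Regression does not: the slope depends on which variable is the outcome.
slope_y_on_x = s_xy / s_xx  # change in y per unit change in x
slope_x_on_y = s_xy / s_yy  # change in x per unit change in y

print(f"r(x, y) = {r_xy:.4f}, r(y, x) = {r_yx:.4f}")
print(f"slope of y on x = {slope_y_on_x:.4f}")
print(f"slope of x on y = {slope_x_on_y:.4f}")
```

The two slopes are linked to the correlation: their product equals the square of r.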