OVERVIEW
Correlation Analysis
Dependent and Independent Variables
The Correlation Coefficient
Testing the Significance of the Correlation Coefficient
Regression Analysis
Least Squares Principle
Testing the Significance of the Slope
The Standard Error of Estimate
The Coefficient of Determination
ANOVA Table in Regression Analysis
CORRELATION ANALYSIS
Correlation analysis refers to a group of techniques used to measure the relationship between two variables.
Scatter diagram
Correlation coefficient
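DEPENDENT AND INDEPENDENT VARIABLES
The Dependent Variable is the variable being predicted or estimated.
The Independent Variable is the variable that provides the basis for the estimate; it is the predictor variable.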
CORRELATION ANALYSIS
The sales manager of Copier Sales of America has a large sales force throughout the United States and Canada and wants to determine whether there is a relationship between the number of sales calls made in a month and the number of copiers sold that month. The manager selects a random sample of 15 representatives and determines the number of sales calls each representative made last month and the number of copiers sold. Determine if the number of sales calls and copiers sold are correlated.
To report the relationship between the two variables, the usual first step is to plot the data in a scatter diagram.
We refer to the number of sales calls as the independent variable and the number of copiers sold as the dependent variable.
THE CORRELATION COEFFICIENT
The Coefficient of Correlation (r) is a measure of the strength of the linear relationship between two variables.
The sample correlation coefficient is identified by the lowercase letter r.
It shows the direction and strength of the linear relationship.
It ranges from -1 to +1, inclusive.
A value near 0 indicates there is little linear relationship between the variables.
A value near +1 indicates a direct or positive linear relationship between the variables.
A value near -1 indicates an inverse or negative linear relationship between the variables.
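Correlation Coefficient:
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$
where $s_x$ and $s_y$ are the sample standard deviations of x and y.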
Using the Copier Sales of America data, compute the correlation coefficient.
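The computation can be sketched in Python as follows. This is a minimal illustration rather than the deck's own worked solution: the data values are copied from the sales calls and copiers sold table that appears later in the deck, and the function name `correlation` is a hypothetical helper.

```python
import math

# Sales calls (x) and copiers sold (y) for the 15 sampled representatives,
# copied from the data table later in this deck.
calls = [96, 40, 104, 128, 164, 76, 72, 80, 36, 84, 180, 132, 120, 44, 84]
sold = [41, 41, 51, 60, 61, 29, 39, 50, 28, 43, 70, 56, 45, 31, 30]


def correlation(x, y):
    """Sample correlation coefficient r (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation(calls, sold)
print(round(r, 3))  # 0.865
```

Dividing the sum of cross-products by the square root of the product of the two sums of squares is algebraically the same as dividing by $(n - 1) s_x s_y$, since the $(n - 1)$ factors cancel.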
TESTING THE SIGNIFICANCE OF THE CORRELATION COEFFICIENT
$H_0: \rho = 0$ (the correlation in the population is 0)
$H_1: \rho \neq 0$ (the correlation in the population is not 0)
Using the copier sales example, can we conclude that the correlation in the population is different from 0? Use a 0.05 significance level.
Step 3: Determine the appropriate test statistic.
The test statistic follows the t distribution with n - 2 degrees of freedom:
$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$
Step 5: Compute the value of t and make a decision.
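The worked numbers for this step did not survive in the slide text, so the following is a hedged sketch of the calculation, not the deck's own figures. It assumes r = 0.865 and n = 15 from the copier sales example and a two-tailed critical value of 2.160 taken from a standard t table (alpha = 0.05, 13 degrees of freedom); the function name is hypothetical.

```python
import math


def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.865, 15      # values reported for the copier sales sample
t_critical = 2.160    # assumed: two-tailed t-table value, alpha = 0.05, df = 13

t = t_for_correlation(r, n)
reject_h0 = abs(t) > t_critical
print(round(t, 2), reject_h0)  # 6.22 True
```

Because the computed t is well beyond the critical value, the null hypothesis of zero population correlation is rejected at the 0.05 level.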
Step 6: Interpret the result.
The data indicate that there is a significant correlation between the number of sales calls and copiers sold. We can also observe that the correlation coefficient is 0.865, which indicates a strong, positive relationship. In other words, more sales calls are strongly related to more copier sales. Note that this statistical analysis does not provide any evidence of a causal relationship. Another type of study is needed to test that hypothesis.
REGRESSION ANALYSIS
In regression analysis we use the independent variable (x) to estimate the dependent variable (y).
The relationship between the variables is linear.
The least squares criterion is used to determine the equation.
REGRESSION EQUATION
An equation that expresses the linear relationship between two variables.
General Form of Linear Regression Equation:
$\hat{y} = a + bx$
where
$\hat{y}$ is the estimated value of the y variable for a selected x value.
a is the y-intercept. It is the estimated value of y when x = 0.
b is the slope of the line, or the average change in $\hat{y}$ for each change of one unit in the independent variable x.
x is any value of the independent variable that is selected.
In regression analysis, our objective is to use the data to position a line that best represents the relationship between the two variables.
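LEAST SQUARES PRINCIPLE
The least squares principle is used to obtain a and b. It chooses the line that minimizes the sum of the squared vertical distances between the observed y values and the estimated values $\hat{y}$.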
COMPUTING THE SLOPE OF THE LINE AND THE Y-INTERCEPT
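The formulas on this slide were lost in extraction. As a reference, the standard least squares results for a line $\hat{y} = a + bx$ are shown here; the assumption is that the slide used one of these equivalent forms, with $r$ the correlation coefficient and $s_x$, $s_y$ the sample standard deviations.

```latex
b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
  = r \, \frac{s_y}{s_x},
\qquad
a = \bar{y} - b \, \bar{x}
```

The intercept formula implies that the fitted line always passes through the point of means $(\bar{x}, \bar{y})$.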
REGRESSION EQUATION
Recall the example involving Copier Sales of America. The sales manager gathered information on the number of sales calls made and the number of copiers sold for a random sample of 15 sales representatives. Use the least squares method to determine a linear equation to express the relationship between the two variables.
What is the expected number of copiers sold by a representative who made 100 calls?
Step 1: Find the slope (b) of the line.
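Hence the regression equation is $\hat{y} = 19.9632 + 0.2608x$.
When $x = 100$, $\hat{y} = 19.9632 + 0.2608(100) = 46.0432$, so a representative who makes 100 calls is expected to sell about 46 copiers.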
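A least squares fit on the raw data can be sketched as below. The data are copied from the table later in the deck, and `least_squares` is a hypothetical helper. The fitted values in that table (for example 30.395 at x = 40) appear to use coefficients rounded at an intermediate step, so the unrounded results here differ slightly in the later digits.

```python
# Sales calls (x) and copiers sold (y), copied from the deck's data table.
calls = [96, 40, 104, 128, 164, 76, 72, 80, 36, 84, 180, 132, 120, 44, 84]
sold = [41, 41, 51, 60, 61, 29, 39, 50, 28, 43, 70, 56, 45, 31, 30]


def least_squares(x, y):
    """Return (a, b) for the line y_hat = a + b * x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-products over sum of squared x deviations.
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    # Intercept: forces the line through the point of means.
    a = y_bar - b * x_bar
    return a, b


a, b = least_squares(calls, sold)
y_hat_100 = a + b * 100  # estimated copiers sold for 100 calls
print(round(a, 2), round(b, 4), round(y_hat_100, 2))  # 19.98 0.2606 46.04
```

Either set of coefficients leads to the same practical answer: about 46 copiers for a representative who makes 100 calls.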
TESTING THE SIGNIFICANCE OF THE SLOPE
$H_0: \beta = 0$ (the slope of the linear model is 0)
$H_1: \beta \neq 0$ (the slope of the linear model is not 0)
Using the previous result from the copier sales example, and assuming the standard error of the slope is 0.042, can we conclude that the slope of the regression line is greater than zero at the 0.05 significance level?
Step 1: State the null and alternate hypotheses.
$H_0: \beta \leq 0$
$H_1: \beta > 0$
Step 4: Formulate a decision rule.
df = n - 2 = 15 - 2 = 13
Reject $H_0$ if t > 1.771
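The computation step for this test is missing from the slide text. A minimal sketch of it, under stated assumptions, is shown here: the slope b = 0.2608 is the value implied by the fitted values in the deck's table (an assumption, since the slide's own figure is lost), the standard error 0.042 and the critical value 1.771 come from the example, and `t_for_slope` is a hypothetical helper.

```python
def t_for_slope(b, se_b, hypothesized=0.0):
    """t statistic for a test on the regression slope, df = n - 2."""
    return (b - hypothesized) / se_b


b = 0.2608          # assumed: slope implied by the deck's fitted values
se_b = 0.042        # standard error of the slope, given in the example
t_critical = 1.771  # one-tailed, alpha = 0.05, df = 13 (decision rule)

t = t_for_slope(b, se_b)
print(round(t, 2), t > t_critical)  # 6.21 True
```

Since the computed t exceeds 1.771, the decision rule rejects the null hypothesis.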
Step 6: Interpret the result.
Based on the sample evidence, we can conclude that the slope of the regression line is greater than zero. The independent variable, number of sales calls, is useful in estimating copier sales.
THE STANDARD ERROR OF ESTIMATE
The standard error of estimate measures the scatter, or dispersion, of the observed values around the line of regression for a given value of x.
Formulas used to compute the standard error:
$s_{y \cdot x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}$
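Recall the example involving Copier Sales of America. The sales manager determined the least squares regression equation to be $\hat{y} = 19.9632 + 0.2608x$, which gives the estimated values and squared deviations in the following table.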
Sales Calls ($x$)   Copiers Sold ($y$)   $\hat{y}$   $(y - \hat{y})^2$
 96                 41                   45.000       16.000
 40                 41                   30.395      112.462
104                 51                   47.086       15.316
128                 60                   53.346       44.281
164                 61                   62.734        3.008
 76                 29                   39.784      116.295
 72                 39                   38.741        0.067
 80                 50                   40.827       84.140
 36                 28                   29.352        1.828
 84                 43                   41.870        1.276
180                 70                   66.907        9.565
132                 56                   54.389        2.596
120                 45                   51.259       39.178
 44                 31                   31.438        0.192
 84                 30                   41.870      140.906
Total                                                587.111
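The standard error of estimate is computed as:
$s_{y \cdot x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\frac{587.111}{15 - 2}} = 6.720$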
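The same calculation can be checked with a few lines of Python. This is a small sketch that takes the total of the squared deviations (587.111) and the sample size (15) from the table above; the function name is a hypothetical helper.

```python
import math


def standard_error_of_estimate(sse, n):
    """Square root of SSE divided by n - 2 degrees of freedom."""
    return math.sqrt(sse / (n - 2))


sse = 587.111  # total of the (y - y_hat)^2 column in the table
n = 15         # number of sampled representatives

s_yx = standard_error_of_estimate(sse, n)
print(round(s_yx, 3))  # 6.72
```

The divisor is n - 2 rather than n - 1 because two quantities, the intercept and the slope, are estimated from the sample.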
COEFFICIENT OF DETERMINATION
The coefficient of determination ($r^2$) is the proportion of the total variation in the dependent variable (y) that is explained or accounted for by the variation in the independent variable (x). It is the square of the coefficient of correlation.
Determine the coefficient of determination for the Copier Sales of America example.
$r = 0.865$, so the coefficient of determination is $r^2 = (0.865)^2 = 0.748$.
ANOVA TABLE IN REGRESSION ANALYSIS
Regression analysis is usually conducted using regression software, and the output includes an ANOVA table with the following sums of squares:
Regression Sum of Squares = SSR = $\sum (\hat{y} - \bar{y})^2$ = 1738.89
Residual or Error Sum of Squares = SSE = $\sum (y - \hat{y})^2$ = 587.11
Total Sum of Squares = SS Total = $\sum (y - \bar{y})^2$ = 2326.00
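These three sums of squares can be reproduced from the raw data with a short sketch. The data are copied from the deck's table, the line is fitted by least squares inside the function, and `anova_sums` is a hypothetical helper rather than the output of any particular regression package.

```python
# Sales calls (x) and copiers sold (y), copied from the deck's data table.
calls = [96, 40, 104, 128, 164, 76, 72, 80, 36, 84, 180, 132, 120, 44, 84]
sold = [41, 41, 51, 60, 61, 29, 39, 50, 28, 43, 70, 56, 45, 31, 30]


def anova_sums(x, y):
    """Return (SSR, SSE, SS Total) for a simple least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    ss_total = sum((yi - y_bar) ** 2 for yi in y)          # total
    return ssr, sse, ss_total


ssr, sse, ss_total = anova_sums(calls, sold)
r_squared = ssr / ss_total  # coefficient of determination
print(round(ssr, 2), round(sse, 2), round(ss_total, 2), round(r_squared, 3))
# 1738.89 587.11 2326.0 0.748
```

The ratio SSR / SS Total gives the same coefficient of determination as squaring the correlation coefficient, and SSR + SSE equals SS Total.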