Regression Analysis
Chapter 11
Learning Objectives
1. Describe the linear regression model
2. State the regression modeling steps
3. Explain the least squares method
4. Compute regression coefficients
5. Predict the response variable
6. Interpret computer output
1/8/2016
Curve Fitting
The goal is to establish a relationship that makes it possible to predict one or more variables in terms of others.
Predicting the average value of one variable in terms of known values of another variable is called the problem of regression.
Curve Fitting
In the procedure of curve fitting we face three kinds of problems:
1. We must decide what kind of curve (equation) we want to use.
2. We must find the particular equation that is best in some sense.
3. We must investigate certain questions regarding the merits of that equation and of predictions made from it.
Models
1. Representation of some phenomenon
2. Mathematical model is a mathematical expression of some phenomenon
3. Often describe relationships between variables
4. Types
   Deterministic models
   Probabilistic models
Deterministic Models
1. Conjecture exact relationships
2. No prediction error
3. Example: Force is exactly mass times acceleration
   F = ma
Probabilistic Models
1. Conjecture 2 components
   Deterministic
   Random error
Y = deterministic component + random error
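The split into a deterministic component and a random error can be illustrated with a small simulation. This is only a sketch: the line 2 + 0.5X, the error standard deviation and the function names are assumptions chosen for illustration and do not come from the slides.

```python
import random

def deterministic_component(x, intercept=2.0, slope=0.5):
    """Exact part of the model (illustrative coefficients, not from the slides)."""
    return intercept + slope * x

def probabilistic_model(x, rng, noise_sd=1.0):
    """Y = deterministic component + random error."""
    random_error = rng.gauss(0.0, noise_sd)
    return deterministic_component(x) + random_error

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [1, 2, 3, 4, 5]
ys = [probabilistic_model(x, rng) for x in xs]

# Recover the random error as the gap between Y and the deterministic part
errors = [y - deterministic_component(x) for x, y in zip(xs, ys)]
for x, y, e in zip(xs, ys, errors):
    print(f"X={x}  Y={y:.2f}  random error={e:+.2f}")
```

With `noise_sd=0` the same code reduces to a deterministic model, since every Y then falls exactly on the line.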
Types of Probabilistic Models
Probabilistic models
   Regression models
   Correlation models
   Other models
Regression Models
1. Answer "What is the relationship between the variables?"
2. Equation used
   1 numerical dependent (response) variable: what is to be predicted
   1 or more numerical or categorical independent (explanatory) variables
4. Evaluate model
5. Use model for prediction & estimation
Model Specification
Is Based on Theory
Thinking Challenge:
Which Is More Logical?
[Figure: four candidate plots of Sales versus Advertising]
R. Ali | Regression Analysis
Types of Regression Models
Regression models
   Simple (1 explanatory variable)
      Linear
      Nonlinear
   Multiple (2+ explanatory variables)
      Linear
      Nonlinear
Linear Equations
Y = mX + b
m = slope = (change in Y) / (change in X)
b = Y-intercept
High school knowledge
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$Y_i$: dependent (response) variable
$X_i$: independent (explanatory) variable
$\beta_0$: population Y-intercept
$\beta_1$: population slope
$\varepsilon_i$: random error
Population (unknown relationship): $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Sample: $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$
Population Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_i$ = random error
$E(Y) = \beta_0 + \beta_1 X_i$
[Figure: observed values of Y plotted against X around the line $E(Y) = \beta_0 + \beta_1 X_i$; the vertical distance from an observed value to the line is $\varepsilon_i$]
Sample Linear
Regression Model
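$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$, where $\hat{\varepsilon}_i$ = random error
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
[Figure: observed values and an unsampled observation plotted against X around the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$]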
Estimating Parameters:
Least Squares Method
Scattergram
1. Plot of all $(X_i, Y_i)$ pairs
2. Suggests how well model will fit
[Figure: scattergram of Y versus X, both axes running from 0 to 60]
Thinking Challenge
How would you draw a line through the points? How do you determine which line fits best?
[Figure: scattergram of Y versus X, both axes running from 0 to 60]
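Least squares (LS) minimizes the sum of squared errors:
$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$
For four observations, LS minimizes $\sum_{i=1}^{4} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \hat{\varepsilon}_3^2 + \hat{\varepsilon}_4^2$.
[Figure: four points around the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ with residuals $\hat{\varepsilon}_1, \dots, \hat{\varepsilon}_4$; for example, $Y_2 = \hat{\beta}_0 + \hat{\beta}_1 X_2 + \hat{\varepsilon}_2$]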
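One way to see what "fits best" means is to score a few candidate lines by their sum of squared errors. The sketch below uses the advertising and sales pairs from the chapter's worked example; the two competing lines and the helper name `sse` are assumptions added for comparison.

```python
def sse(x, y, intercept, slope):
    """Sum of squared errors between observed y and the line's predictions."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Advertising (X) and Sales (Y) pairs from the chapter's worked example
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

# (intercept, slope) for each candidate; the first two are illustrative only
candidates = {
    "flat line at the mean": (2.0, 0.0),
    "line through the origin": (0.0, 0.7),
    "least squares line": (-0.1, 0.7),
}

results = {name: sse(x, y, b0, b1) for name, (b0, b1) in candidates.items()}
for name, value in results.items():
    print(f"{name}: SSE = {value:.2f}")
```

Of these three candidates, the least squares line has the smallest sum of squared errors, which is the criterion the method minimizes.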
Coefficient Equations
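Prediction equation:
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
Sample slope:
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}$
Sample Y-intercept:
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$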
Computation Table
X_i       Y_i       X_i^2       Y_i^2       X_i Y_i
X_1       Y_1       X_1^2       Y_1^2       X_1 Y_1
X_2       Y_2       X_2^2       Y_2^2       X_2 Y_2
...       ...       ...         ...         ...
X_n       Y_n       X_n^2       Y_n^2       X_n Y_n
ΣX_i      ΣY_i      ΣX_i^2      ΣY_i^2      ΣX_i Y_i
Interpretation of Coefficients
1. Slope ($\hat{\beta}_1$)
   Estimated Y changes by $\hat{\beta}_1$ for each 1-unit increase in X
   If $\hat{\beta}_1 = 2$, then Sales (Y) is expected to increase by 2 for each 1-unit increase in Advertising (X)
2. Y-intercept ($\hat{\beta}_0$)
   Average value of Y when X = 0
   If $\hat{\beta}_0 = 4$, then average Sales (Y) is expected to be 4 when Advertising (X) is 0
[Figure: scattergram of Sales versus Advertising]
Parameter Estimation
Solution Table
X_i     Y_i     X_i^2     Y_i^2     X_i Y_i
1       1       1         1         1
2       1       4         1         2
3       2       9         4         6
4       2       16        4         8
5       4       25        16        20
Column totals:
15      10      55        26        37
Parameter Estimation
Solution
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \dfrac{37 - \dfrac{(15)(10)}{5}}{55 - \dfrac{(15)^2}{5}} = 0.70$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 2 - (0.70)(3) = -0.10$
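The hand calculation can be checked with a few lines of Python. This is a minimal sketch of the computational formulas for the sample slope and Y-intercept; the function name `fit_line` is an assumption, and the data are the five advertising and sales pairs used in this example.

```python
def fit_line(x, y):
    """Return (intercept, slope) from the computational least squares formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: (sum XY - (sum X)(sum Y)/n) / (sum X^2 - (sum X)^2/n)
    slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: mean of Y minus slope times mean of X
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope

ad = [1, 2, 3, 4, 5]       # Advertising (X)
sales = [1, 1, 2, 2, 4]    # Sales (Y)

b0, b1 = fit_line(ad, sales)
print(f"slope = {b1:.2f}, intercept = {b0:.2f}")
```

The same function applies to any list of paired observations, provided the X values are not all identical (otherwise the denominator is zero).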
Coefficient Interpretation
Solution
1. Slope ($\hat{\beta}_1$)
   Sales Volume (Y) is expected to increase by 0.7 units for each $1 increase in Advertising (X)
2. Y-intercept ($\hat{\beta}_0$)
   Average value of Sales Volume (Y) is -0.10 units when Advertising (X) is 0
   Difficult to explain to Marketing Manager
   Expect some sales without Advertising
Parameter Estimation
Thinking Challenge
You're an economist for the county cooperative. You gather the following data:
Fertilizer (lb.)    Yield (lb.)
4                   3.0
6                   5.5
10                  6.5
12                  9.0
What is the relationship between fertilizer & crop yield?
Scattergram
Crop Yield vs. Fertilizer*
[Figure: scattergram of Yield (lb.) versus Fertilizer (lb.)]
Parameter Estimation
Solution Table*
X_i     Y_i      X_i^2     Y_i^2      X_i Y_i
4       3.0      16        9.00       12
6       5.5      36        30.25      33
10      6.5      100       42.25      65
12      9.0      144       81.00      108
Column totals:
32      24.0     296       162.50     218
Parameter Estimation
Solution*
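$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \dfrac{218 - \dfrac{(32)(24)}{4}}{296 - \dfrac{(32)^2}{4}} = 0.65$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 6 - (0.65)(8) = 0.80$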
Coefficient Interpretation
Solution*
1. Slope ($\hat{\beta}_1$)
   Crop Yield (Y) is expected to increase by 0.65 lb. for each 1 lb. increase in Fertilizer (X)
2. Y-intercept ($\hat{\beta}_0$)
   Average Crop Yield (Y) is expected to be 0.8 lb. when no Fertilizer (X) is used
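Once the coefficients are known, predicting the response is a direct substitution into the prediction equation. The sketch below uses the fertilizer coefficients from the solution above (intercept 0.80, slope 0.65); the function name `predict` and the chosen fertilizer amounts are assumptions for illustration.

```python
def predict(x, intercept, slope):
    """Point prediction from a fitted line: y-hat = intercept + slope * x."""
    return intercept + slope * x

# Coefficients from the fertilizer solution: intercept 0.80, slope 0.65
b0, b1 = 0.80, 0.65

# Fertilizer amounts inside the observed range of 4 to 12 lb.
predictions = {lb: predict(lb, b0, b1) for lb in (4, 8, 12)}
for lb, yield_lb in predictions.items():
    print(f"{lb} lb. fertilizer -> predicted yield {yield_lb:.2f} lb.")
```

Predictions are most trustworthy for X values within the range of the data used to fit the line; the 0.8 lb. intercept at X = 0 is an extrapolation below the smallest observed fertilizer amount.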
Exercise 10.1
Why do we generally prefer a probabilistic model to a deterministic model? Give examples for which the two types of models might be appropriate.
Most variables do not have an exact relationship, so a probabilistic model is usually the better description.
If you are trying to determine how much you will pay to rent a car, a deterministic model would be appropriate. You would pay a fixed amount plus so much per mile, so the number of miles you drive would determine how much you pay.
If you are trying to determine a person's weight based on his or her height, a probabilistic model is appropriate. You can't determine weight exactly from height. There would be a deterministic component and a random error.
Exercise 10.4
Columns 3 and 4 are for the preliminary computations to find the LS line for the given pairs of x and y values. After the LS line has been obtained, columns 5, 6, and 7 are used to compare the observed and predicted values of y and to calculate the SSE.
a. Complete columns 3 and 4 of the table. Calculate the totals for columns 1-4.
b. Find SSxy.
c. Find SSxx.
d. Find $\hat{\beta}_1$.
e. Find $\bar{x}$ and $\bar{y}$.
f. Find $\hat{\beta}_0$.
g. Find the least squares line and write it at the top of column 5.
h. Complete columns 5, 6, and 7 of the table.
Exercise 10.4
(1)     (2)     (3)       (4)         (5)            (6)              (7)
x_i     y_i     x_i^2     x_i y_i     \hat{y} =      (y - \hat{y})    (y - \hat{y})^2
Measures of Variation
in Regression
1. Total sum of squares ($SS_{yy}$)
   Measures variation of observed $Y_i$ around the mean $\bar{Y}$
Variation Measures
Total sum of squares: $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$
Unexplained sum of squares: $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
Explained sum of squares: $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
[Figure: one observation $(X_i, Y_i)$, the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and the mean $\bar{Y}$, with the three deviations marked]
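The three variation measures can be computed directly from their definitions. The sketch below assumes the advertising and sales data and the estimates -0.1 and 0.7 from the chapter's worked example; the variable names are illustrative.

```python
x = [1, 2, 3, 4, 5]    # Advertising
y = [1, 1, 2, 2, 4]    # Sales
b0, b1 = -0.1, 0.7     # least squares estimates from the worked example

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]   # predicted values on the fitted line

# Variation of observed Y around its mean
ss_total = sum((yi - y_bar) ** 2 for yi in y)
# Variation of observed Y around the fitted line
ss_unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# Variation of the fitted values around the mean
ss_explained = sum((yh - y_bar) ** 2 for yh in y_hat)

print(f"total = {ss_total:.2f}")
print(f"unexplained = {ss_unexplained:.2f}")
print(f"explained = {ss_explained:.2f}")
```

For a least squares line with an intercept, the explained and unexplained sums of squares add up to the total, which this example reproduces up to floating-point rounding.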
Coefficient of Determination
1. Proportion of variation explained by relationship between X & Y
$0 \le r^2 \le 1$
$r^2 = \dfrac{\text{Explained variation}}{\text{Total variation}} = \dfrac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$
Coefficient of
Determination Examples
[Figure: four scatterplots of Y versus X with $r^2 = 1$, $r^2 = 1$, $r^2 = .8$ and $r^2 = 0$]
Coefficient of
Determination Example
You're a marketing analyst for Hasbro Toys. You find $\hat{\beta}_0 = -0.1$ & $\hat{\beta}_1 = 0.7$.
Ad $    Sales (Units)
1       1
2       1
3       2
4       2
5       4
Interpret a coefficient of determination of 0.8167.
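The value 0.8167 can be reproduced from the data. The sketch below fits the line and applies r² = (total variation - unexplained variation) / total variation; the function name `r_squared` is an assumption introduced here.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple least squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    # Total variation of Y around its mean
    total = sum((yi - y_bar) ** 2 for yi in y)
    # Unexplained variation of Y around the fitted line
    unexplained = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return (total - unexplained) / total

ad = [1, 2, 3, 4, 5]       # Ad $
sales = [1, 1, 2, 2, 4]    # Sales (Units)

r2 = r_squared(ad, sales)
print(f"r^2 = {r2:.4f}")
```

A value of about 0.82 means that roughly 82% of the sample variation in sales around its mean is explained by the linear relationship with advertising.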
Correlation Models
1. Answer "How strong is the linear relationship between 2 variables?"
2. Coefficient of correlation used
   Population coefficient denoted $\rho$ (rho)
   Values range from -1 to +1
   Measures degree of association
Sample Coefficient
of Correlation
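1. Pearson product moment coefficient of correlation, r
$r = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$
In magnitude, $|r| = \sqrt{\text{Coefficient of Determination}}$; r takes the sign of the sample slope $\hat{\beta}_1$.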
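The sample coefficient of correlation can be computed straight from its definition. The sketch below assumes the advertising and sales data from the earlier example; the function name `pearson_r` is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson product moment coefficient of correlation."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # Numerator: sum of cross-products of deviations from the means
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares
    denominator = math.sqrt(
        sum((xi - x_bar) ** 2 for xi in x) * sum((yi - y_bar) ** 2 for yi in y)
    )
    return numerator / denominator

ad = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]

r = pearson_r(ad, sales)
print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")
```

Squaring r for these data gives the coefficient of determination of 0.8167 quoted in the Hasbro example, and r is positive because the fitted slope is positive.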
Coefficient of Correlation
Equivalently: [equation missing in source]
Coefficient of Correlation Values
-1.0: perfect negative correlation
-.5
0: no correlation
+.5
+1.0: perfect positive correlation
Moving from 0 toward -1.0: increasing degree of negative correlation
Moving from 0 toward +1.0: increasing degree of positive correlation
Coefficient of Correlation
Examples
[Figure: four scatterplots of Y versus X with $r = 1$, $r = -1$, $r = .89$ and $r = 0$]