Lecture 3 Regression Analysis

Prepared by Suwardo
Regression Analysis
Mathematics and Statistics
INTRODUCTION TO CHAPTER 11
NOTION
SIMPLE LINEAR REGRESSION MODEL
Model
Estimating Model Parameters
Error and Coefficient of Determination
Prediction
REGRESSION WITH TRANSFORMED VARIABLES
MULTIPLE LINEAR REGRESSION ANALYSIS
Determine the random relationship between
Y (dependent variable, response; a random variable) and
X (independent variable, regressor; not a random variable)
on the basis of n observations $(x_1, y_1), \ldots, (x_n, y_n)$.
The model parameters are estimated by the Least Squares Method (LSM).
From the model we can get predictions for Y or for E(Y).
Use the Analysis of Variance (ANOVA) to test hypotheses about the parameters and the goodness of fit of the model.
Example 1:
Data on age (years) and price (US$100) of Nissan Zs
(from Asian Import Magazine)
Age 5 4 6 5 5 5 6 6 2 7 7
Price 85 103 70 82 89 98 66 95 169 70 48

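ESTIMATING MODEL PARAMETERS
Using the Least Squares Method, $\alpha$ and $\beta$ are estimated by $a$ and $b$, and we get the Estimated Regression Line
$$\hat{y} = a + bx$$
The estimates are chosen so that
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2 \to \min!$$
Then
$$b = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad a = \bar{y} - b\,\bar{x},$$
where
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right), \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$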

Example 1 (Continued):

  i    x_i    y_i    x_i^2    x_i y_i
  1      5     85       25        425
  2      4    103       16        412
  3      6     70       36        420
  4      5     82       25        410
  5      5     89       25        445
  6      5     98       25        490
  7      6     66       36        396
  8      6     95       36        570
  9      2    169        4        338
 10      7     70       49        490
 11      7     48       49        336
Sum     58    975      326       4732

$$S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2 = 326 - (58)^2/11 = 20.1818$$
$$S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right) = 4732 - (58)(975)/11 = -408.9091$$
$$b = S_{xy}/S_{xx} = -20.26$$
$$a = \bar{y} - b\,\bar{x} = (975/11) - (-20.26)(58/11) = 195.47$$
Estimated Regression Line
$$\hat{y} = 195.47 - 20.26x$$
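The hand computation for Example 1 can be checked with a few lines of Python. This is a minimal sketch using only the data from the slides; the function name `least_squares_line` is an illustrative choice, not something the lecture defines.

```python
# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]


def least_squares_line(x, y):
    """Return (a, b, s_xx, s_xy) for the fitted line y_hat = a + b*x.

    Uses the computational forms Sxx = sum(x^2) - (sum x)^2 / n and
    Sxy = sum(x*y) - (sum x)(sum y) / n from the slides.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    b = s_xy / s_xx                    # slope estimate
    a = sum_y / n - b * sum_x / n      # intercept estimate: y_bar - b * x_bar
    return a, b, s_xx, s_xy


a, b, s_xx, s_xy = least_squares_line(age, price)
print(f"Sxx = {s_xx:.4f}, Sxy = {s_xy:.4f}")
print(f"y_hat = {a:.2f} + ({b:.2f}) x")
```

The negative slope says that, in this sample, each additional year of age lowers the fitted price by about 20.26 units of US$100.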

ERROR AND ESTIMATING $\sigma^2$
Sum of Squares of Errors (SSE):
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2 = S_{yy} - \frac{(S_{xy})^2}{S_{xx}}$$
Estimating $\sigma^2$ by
$$S^2 = \frac{SSE}{n-2}$$
This is an unbiased estimator for $\sigma^2$.
The smaller the SSE, the more successful the Linear Regression Model is in explaining y.

Table of computations for SSE and the estimate of $\sigma^2$, using $\hat{y} = 195.47 - 20.26x$:

  i    x_i    y_i     ŷ_i     (y_i - ŷ_i)^2
  1      5     85    94.16         83.9
  2      4    103   114.42        130.5
  3      6     70    73.90         15.2
  4      5     82    94.16        147.9
  5      5     89    94.16         26.6
  6      5     98    94.16         14.7
  7      6     66    73.90         62.4
  8      6     95    73.90        445.2
  9      2    169   154.95        197.5
 10      7     70    53.64        267.7
 11      7     48    53.64         31.8
Sum     58    975                1423.5

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 1423.52$$
Estimate $\sigma^2$ by
$$S^2 = \frac{SSE}{n-2} = 1423.52/9 = 158.17$$
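The fitted values and squared errors in this table can be reproduced programmatically. The sketch below assumes only the Example 1 data; variable names such as `fitted` and `squared_errors` are illustrative. Because it keeps full precision for a and b, the last digit may differ slightly from the rounded slide values.

```python
# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]

n = len(age)
x_bar = sum(age) / n
y_bar = sum(price) / n

# Least squares estimates b = Sxy / Sxx and a = y_bar - b * x_bar.
s_xx = sum((x - x_bar) ** 2 for x in age)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, price))
b = s_xy / s_xx
a = y_bar - b * x_bar

# Fitted values and squared errors, one per observation.
fitted = [a + b * x for x in age]
squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(price, fitted)]

sse = sum(squared_errors)   # sum of squares of errors
s2 = sse / (n - 2)          # unbiased estimate of sigma^2

for i, (x, y, y_hat, e2) in enumerate(zip(age, price, fitted, squared_errors), 1):
    print(f"{i:2d} {x:3d} {y:4d} {y_hat:8.2f} {e2:8.1f}")
print(f"SSE = {sse:.2f}, S^2 = {s2:.2f}")
```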

COEFFICIENT OF DETERMINATION
Total Sum of Squares (SST):
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
SST = SSE + SSR
Regression Sum of Squares (the amount of variation in y explained by the model):
$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
SSR = SST - SSE
Coefficient of Determination (R-Sq):
$$R^2 = r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}, \qquad 0 \le R^2 \le 1$$
The greater $r^2$, the more successful the Linear Model is.

Table of computations for SST and $r^2$:

  i    x_i    y_i     ŷ_i     (y_i - ŷ_i)^2    y_i^2
  1      5     85    94.16         83.9          7225
  2      4    103   114.42        130.5         10609
  3      6     70    73.90         15.2          4900
  4      5     82    94.16        147.9          6724
  5      5     89    94.16         26.6          7921
  6      5     98    94.16         14.7          9604
  7      6     66    73.90         62.4          4356
  8      6     95    73.90        445.2          9025
  9      2    169   154.95        197.5         28561
 10      7     70    53.64        267.7          4900
 11      7     48    53.64         31.8          2304
Sum     58    975                1423.5         96129

$$SST = S_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2 = 96129 - (975)^2/11 = 9708.54$$
$$r^2 = 1 - SSE/SST = 1 - 1423.52/9708.54 = 0.8534$$
High value!
SSE another way:
$$SSE = S_{yy} - (S_{xy})^2/S_{xx} = 9708.54 - (-408.9091)^2/20.1818 = 1423.52$$
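As a numerical check of the decomposition SST = SSE + SSR, the sketch below computes each sum of squares directly from its definition for the Example 1 data and then forms $r^2$. All variable names are illustrative.

```python
# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]

n = len(age)
x_bar = sum(age) / n
y_bar = sum(price) / n

s_xx = sum((x - x_bar) ** 2 for x in age)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, price))
b = s_xy / s_xx
a = y_bar - b * x_bar
fitted = [a + b * x for x in age]

# Each sum of squares computed from its own definition.
sst = sum((y - y_bar) ** 2 for y in price)                     # total
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(price, fitted)) # error
ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)            # regression

r_squared = 1 - sse / sst

# Shortcut from the slides: SSE = Syy - Sxy^2 / Sxx.
sse_shortcut = sst - s_xy ** 2 / s_xx

print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SSR = {ssr:.2f}")
print(f"r^2 = {r_squared:.4f}")
```

Computing SSR and SSE separately, rather than one from the other, is what makes the check `SST = SSE + SSR` meaningful.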
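Confidence Interval on the Slope
A $100(1-\alpha)\%$ CI for the parameter $\beta$ in the linear regression is
$$b - \Delta < \beta < b + \Delta, \qquad \text{where } \Delta = t_{\alpha/2,\,n-2}\,\frac{S}{\sqrt{S_{xx}}}$$
Hypothesis Testing on the Significance of Regression (on the Slope)
Test the hypothesis
$H_0: \beta = 0$ (there is no relationship between x and Y)
$H_1: \beta \ne 0$ (the straight-line model is adequate)
Test Statistic:
$$T = \frac{b - \beta}{S/\sqrt{S_{xx}}}$$
which has a t distribution with $n-2$ degrees of freedom; under $H_0$, $\beta = 0$ and $T = b/(S/\sqrt{S_{xx}})$.
Critical Region: $|T| > t_{\alpha/2,\,n-2}$.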
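The test on the slope can be sketched for the Example 1 data as follows. The critical value 2.262 is the tabulated $t_{0.025,\,9}$ used in the lecture's worked example (significance level 5%); it is hard-coded here as an assumption rather than computed, so the snippet needs only the standard library.

```python
import math

# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]

T_CRIT = 2.262  # t_{0.025, 9} from a t table (assumed, alpha = 0.05)

n = len(age)
x_bar = sum(age) / n
y_bar = sum(price) / n
s_xx = sum((x - x_bar) ** 2 for x in age)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, price))
b = s_xy / s_xx
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(age, price))
s = math.sqrt(sse / (n - 2))          # estimate of sigma

se_b = s / math.sqrt(s_xx)            # standard error of the slope
t_stat = b / se_b                     # test statistic under H0: beta = 0
reject_h0 = abs(t_stat) > T_CRIT      # two-sided critical region

print(f"SE(b) = {se_b:.3f}, T = {t_stat:.2f}, reject H0: {reject_h0}")
```

The standard error and T value agree with the `SE Coef` and `T` columns for `age` in the Minitab output shown in the lecture.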
Estimated Regression Line: $\hat{y} = 195.47 - 20.26x$
A 95% Confidence Interval on the Slope:
$$\alpha = 0.05; \qquad t_{\alpha/2,\,n-2} = t_{0.025,\,9} = 2.262$$
$$S_{xx} = 20.1818; \qquad S^2 = 158.17$$
$$\Delta = t_{0.025,\,9}\,\frac{S}{\sqrt{S_{xx}}} = t_{0.025,\,9}\sqrt{\frac{S^2}{S_{xx}}} = 6.332$$
$$b - \Delta = -26.592; \qquad b + \Delta = -13.928$$
$$-26.592 < \beta < -13.928$$
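The interval for the slope can be wrapped in a small helper. This is a sketch: `slope_confidence_interval` is a hypothetical name, and the caller supplies the critical value from a t table (here 2.262, as in the lecture's example).

```python
import math

# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]


def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) bounds b -/+ t_crit * S / sqrt(Sxx)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    delta = t_crit * s / math.sqrt(s_xx)   # half-width of the interval
    return b - delta, b + delta


lower, upper = slope_confidence_interval(age, price, t_crit=2.262)
print(f"95% CI for beta: ({lower:.3f}, {upper:.3f})")
```

The interval lies entirely below zero, which is consistent with rejecting $H_0: \beta = 0$ at the 5% level.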
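PREDICTION OF $E(Y_0) = E(Y \mid x_0)$
$E(Y_0) = E(Y \mid x_0) = \alpha + \beta x_0$ can be estimated by $a + b x_0$.
The $(1-\alpha)$ Confidence Interval for $E(Y \mid x_0)$:
$$[\hat{y}_0 - \Delta,\ \hat{y}_0 + \Delta], \qquad \text{where } \hat{y}_0 = a + b x_0,$$
$$\Delta = t_{\alpha/2,\,n-2}\; S\,\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}, \qquad S^2 = \frac{SSE}{n-2}$$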

An estimate for the mean price of a 3-year-old Nissan Z:
$$\hat{y} = 195.47 - 20.26x, \qquad \hat{y}_0 = 195.47 - 20.26(3) = 134.69$$
90% Confidence Interval:
$$S^2 = \frac{SSE}{n-2} = \frac{1423.52}{9} = 158.17, \qquad S = 12.58$$
$$\Delta = t_{\alpha/2,\,n-2}\; S\,\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} = t_{0.05,\,9}\,(12.58)\sqrt{\frac{1}{11} + \frac{(3 - 5.2727)^2}{20.1818}} = t_{0.05,\,9}\,(7.4088) = 1.833(7.4088) = 13.58$$
$$[134.69 - 13.58,\ 134.69 + 13.58] = [121.11,\ 148.27]$$
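The interval for the mean response at $x_0 = 3$ can be reproduced with the sketch below. `mean_response_interval` is a hypothetical helper name, and 1.833 is the tabulated $t_{0.05,\,9}$ from the lecture, passed in as an assumption. The code keeps full precision for a, b and S, so its bounds can differ from the slide values in the second decimal place.

```python
import math

# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]


def mean_response_interval(x, y, x0, t_crit):
    """Return (y0_hat, lower, upper) for the CI on E(Y | x0)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    y0_hat = a + b * x0
    # Half-width: t * S * sqrt(1/n + (x0 - x_bar)^2 / Sxx)
    delta = t_crit * s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)
    return y0_hat, y0_hat - delta, y0_hat + delta


y0_hat, lower, upper = mean_response_interval(age, price, x0=3, t_crit=1.833)
print(f"y0_hat = {y0_hat:.2f}, 90% CI for E(Y|x0=3): [{lower:.2f}, {upper:.2f}]")
```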
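PREDICTION OF $Y_0 = Y(x_0)$
A value of $Y_0 = Y(x_0)$ can be estimated by $a + b x_0$.
The $(1-\alpha)$ Confidence Interval (prediction interval) for $Y_0 = Y(x_0)$:
$$[\hat{y}_0 - \Delta,\ \hat{y}_0 + \Delta], \qquad \text{where } \hat{y}_0 = a + b x_0,$$
$$\Delta = t_{\alpha/2,\,n-2}\; S\,\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}, \qquad S^2 = \frac{SSE}{n-2}$$
The extra 1 under the square root accounts for the variability of a single observation around its mean, so this interval is always wider than the interval for $E(Y \mid x_0)$.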
An estimate for the price of a 3-year-old Nissan Z:
$$\hat{y} = 195.47 - 20.26x, \qquad \hat{y}_0 = 195.47 - 20.26(3) = 134.69$$
90% Confidence Interval:
$$S^2 = \frac{SSE}{n-2} = \frac{1423.52}{9} = 158.17, \qquad S = 12.58$$
$$\Delta = t_{\alpha/2,\,n-2}\; S\,\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} = t_{0.05,\,9}\,(12.58)\sqrt{1 + \frac{1}{11} + \frac{(3 - 5.2727)^2}{20.1818}} = t_{0.05,\,9}\,(14.5995) = 1.833(14.5995) = 26.76$$
$$[134.69 - 26.76,\ 134.69 + 26.76] = [107.93,\ 161.45]$$
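To see how the interval for a single price compares with the interval for the mean price, the sketch below computes both half-widths at $x_0 = 3$. The helper name `interval_half_widths` is illustrative, and the critical value 1.833 is the tabulated $t_{0.05,\,9}$ from the lecture, taken as given.

```python
import math

# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]


def interval_half_widths(x, y, x0, t_crit):
    """Return (y0_hat, delta_mean, delta_single) at the point x0.

    delta_mean   : half-width of the CI for E(Y | x0)
    delta_single : half-width of the interval for one new value Y0
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    delta_mean = t_crit * s * math.sqrt(leverage)
    delta_single = t_crit * s * math.sqrt(1 + leverage)
    return a + b * x0, delta_mean, delta_single


y0_hat, delta_mean, delta_single = interval_half_widths(age, price, 3, 1.833)
lower, upper = y0_hat - delta_single, y0_hat + delta_single
print(f"mean response:  {y0_hat:.2f} +/- {delta_mean:.2f}")
print(f"single value:   {y0_hat:.2f} +/- {delta_single:.2f}")
```

The single-value half-width is roughly twice the mean-response half-width here, because the variance of one observation dominates the uncertainty in the fitted line.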
TABLE: ANALYSIS OF VARIANCE

Source of     Sum of     Degrees of   Mean                 Computed F
Variance      Squares    Freedom      Square
Regression    SSR        1            MSR = SSR/1          F = SSR/S^2
Error         SSE        n-2          S^2 = SSE/(n-2)
Total         SST        n-1

Test: $H_0: \beta = 0$ (Y does not depend on x)
      $H_1: \beta \ne 0$ (Linear Model is fitted)
Critical Region: $F > f_\alpha(1, n-2)$

Source of Sum of Degrees of Mean Computed F
Variance Squares freedom Square

Regression 8285.02 1 8285.02 52.380
Error 1423.52 9 158.17
Total 9708.54 10

Testing $H_0: \beta = 0$ against $H_1: \beta \ne 0$,
taking the significance level $\alpha = 1\% = 0.01$:
$$f_\alpha(1, n-2) = f_{0.01}(1, 9) = 10.56$$
$$F = 52.380 > 10.56$$
Decision: Reject $H_0$, accept $H_1$.
The linear model is fitted.
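The ANOVA quantities for Example 1 can be assembled in a few lines. This is a sketch: the critical value 10.56 is the tabulated $f_{0.01}(1, 9)$ quoted in the lecture and is hard-coded as an assumption, and the variable names are illustrative.

```python
# Example 1 data from the slides: age (years) and price (US$100).
age = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
price = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]

F_CRIT = 10.56  # f_{0.01}(1, 9) from an F table (assumed, alpha = 0.01)

n = len(age)
x_bar = sum(age) / n
y_bar = sum(price) / n
s_xx = sum((x - x_bar) ** 2 for x in age)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, price))
b = s_xy / s_xx
a = y_bar - b * x_bar
fitted = [a + b * x for x in age]

ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)             # df = 1
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(price, fitted))  # df = n - 2
sst = sum((y - y_bar) ** 2 for y in price)                      # df = n - 1

msr = ssr / 1
mse = sse / (n - 2)   # this is S^2
f_stat = msr / mse
reject_h0 = f_stat > F_CRIT

print(f"{'Source':<12}{'SS':>10}{'DF':>4}{'MS':>10}{'F':>8}")
print(f"{'Regression':<12}{ssr:>10.2f}{1:>4}{msr:>10.2f}{f_stat:>8.2f}")
print(f"{'Error':<12}{sse:>10.2f}{n - 2:>4}{mse:>10.2f}")
print(f"{'Total':<12}{sst:>10.2f}{n - 1:>4}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

In simple linear regression the F statistic equals the square of the T statistic for the slope, so the two tests lead to the same decision.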

The regression equation is price = 195 - 20.3 age

Predictor Coef SE Coef T P
Constant 195.47 15.24 12.83 0.000
age -20.261 2.800 -7.24 0.000

S = 12.58 R-Sq = 85.3%

Analysis of Variance

Source DF SS MS F P
Regression 1 8285.0 8285.0 52.38 0.000
Residual Error 9 1423.5 158.2
Total 10 9708.5

Statistical output from Minitab
REGRESSION WITH TRANSFORMED VARIABLES
By a transformation of x and/or y, certain non-linear regression functions become linear regression functions. Then we can apply all the previous methods and results.
Examples:
$$y = a\,e^{bx} \;\Rightarrow\; y^* = \ln(a) + b\,x, \quad \text{by } y^* = \ln(y)$$
$$y = a\,x^{b} \;\Rightarrow\; y^* = \ln(a) + b\,x^*, \quad \text{by } y^* = \ln(y),\ x^* = \ln(x)$$
$$y = a + b\,h(x) \;\Rightarrow\; y = a + b\,x^*, \quad \text{by } x^* = h(x)$$
$$y = \frac{x}{ax + b} \;\Rightarrow\; \frac{1}{y} = \frac{ax + b}{x} \;\Rightarrow\; y^* = a + b\,x^*, \quad \text{by } y^* = \frac{1}{y},\ x^* = \frac{1}{x}$$
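To illustrate the exponential case $y = a\,e^{bx}$, the sketch below fits a straight line to $(x, \ln y)$ and transforms the intercept back. The data are synthetic, generated exactly from $a = 2$ and $b = 0.5$ purely for illustration (they are not from the lecture), so the fit should recover those values.

```python
import math

# Synthetic, noise-free data from y = 2 * exp(0.5 * x); illustrative only.
TRUE_A, TRUE_B = 2.0, 0.5
x = [1, 2, 3, 4, 5]
y = [TRUE_A * math.exp(TRUE_B * xi) for xi in x]


def fit_line(u, v):
    """Least squares fit v_hat = intercept + slope * u."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    slope = s_uv / s_uu
    return v_bar - slope * u_bar, slope


# Transform: y* = ln(y), so that y* = ln(a) + b * x is linear in x.
y_star = [math.log(yi) for yi in y]
ln_a_hat, b_hat = fit_line(x, y_star)
a_hat = math.exp(ln_a_hat)   # back-transform the intercept

print(f"a_hat = {a_hat:.4f}, b_hat = {b_hat:.4f}")
```

With real, noisy data the same steps apply, but least squares is then minimizing errors on the log scale rather than on the original scale of y.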
MULTIPLE LINEAR REGRESSION ANALYSIS
Model: $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$, $\varepsilon \sim N(0, \sigma^2)$
ANOVA Table: Accept the Model if $F > f_\alpha(k,\ n-k-1)$

Source of     Sum of     Degrees of     Mean                        Computed F
Variance      Squares    freedom        Square
Regression    SSR        k              MSR = SSR/k                 F = MSR/MSE
Error         SSE        n - (k + 1)    MSE = SSE/(n - (k + 1))
Total         SST        n - 1
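The lecture does not show how the coefficients of the multiple model are computed. One common approach, sketched below under that caveat, solves the normal equations $(X^T X)\,\hat{\beta} = X^T y$ and then fills in the ANOVA table. The data set (two regressors, six observations) is hypothetical and exists only to make the example runnable; `solve` is a small illustrative Gaussian elimination helper.

```python
# Hypothetical data with k = 2 regressors; not from the lecture.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [9.1, 7.8, 19.1, 18.0, 28.9, 28.1]


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    m = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * m
    for r in range(m - 1, -1, -1):                 # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, m))
        z[r] = (M[r][m] - tail) / M[r][r]
    return z


n, k = len(y), 2
X = [[1.0, u, v] for u, v in zip(x1, x2)]          # column of ones = intercept
p = k + 1

# Normal equations: (X^T X) beta = X^T y
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
beta = solve(XtX, Xty)

fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
y_bar = sum(y) / n
ssr = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
sst = sum((yi - y_bar) ** 2 for yi in y)

msr = ssr / k
mse = sse / (n - (k + 1))
f_stat = msr / mse   # compare with f_alpha(k, n - k - 1) from an F table

print("beta =", [round(bj, 3) for bj in beta])
print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, SST = {sst:.3f}, F = {f_stat:.1f}")
```

For larger problems a numerical library would normally be used instead of hand-written elimination; the pure-Python version is kept here only so the sketch has no dependencies.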
SUMMARY OF CHAPTER 11
NOTION
SIMPLE LINEAR REGRESSION MODEL
Model
Estimating Model Parameters
Error and Coefficient of Determination
Prediction
REGRESSION WITH TRANSFORMED VARIABLES
MULTIPLE LINEAR REGRESSION ANALYSIS
