
Outline

Least Squares Methods


Estimation: Least Squares
Interpretation of estimators
Properties of OLS estimators
Variance of Y, b, and a
Hypothesis Test of b and a
ANOVA table
Goodness-of-Fit and R2


Linear regression model

[Figure: graph of an example regression line Y = 2 + .5X, with x running from about -1 to 3]


Terminology
Dependent variable (DV) = response variable = left-hand side (LHS) variable
Independent variables (IV) = explanatory variables = right-hand side (RHS) variables = regressors (excluding a or b0)
a (b0) is an estimator of parameter α (β0)
b (b1) is an estimator of parameter β (β1)
a and b are the intercept and slope


Least Squares Method

How do we draw such a line based on the observed data points?
Suppose an imaginary line y = a + bX
Imagine the vertical distance (or error) between the line and a data point: e = Y − E(Y)
This error (or gap) is the deviation of the data point from the imaginary line, the regression line
What are the best values of a and b?
The a and b that minimize the sum of such errors (deviations of individual data points from the line)

Least Squares Method

[Figure: scatter of data points (x1, x2, x3) with vertical errors e1, e2, e3 measured from the regression line E(Y) = a + bX]

Least Squares Method


Deviations themselves do not have good properties for computation (positive and negative deviations cancel out)
That is why we use squared deviations, just as in the variance
Let us get the a and b that minimize the sum of squared deviations rather than the sum of deviations
This method is called least squares


Least Squares Method

The least squares method minimizes the sum of squares of errors (deviations of individual data points from the regression line)
Such a and b are called least squares estimators (estimators of parameters α and β)
The process of getting parameter estimators (e.g., a and b) is called estimation: "regress Y on X"
The least squares method is the estimation method of ordinary least squares (OLS)

Ordinary Least Squares


Ordinary least squares (OLS) = Linear regression model = Classical linear regression model

Linear relationship between Y and the Xs
Constant slopes (coefficients of the Xs)
Least squares method
Xs are fixed; Y is conditional on the Xs
Error is not related to the Xs
Constant variance of errors


Least Squares Method 1


The regression model and its estimate:

$$Y = \alpha + \beta X + \varepsilon$$
$$E(Y) = \hat{Y} = a + bX$$
$$\varepsilon = Y - \hat{Y} = Y - (a + bX) = Y - a - bX$$

The sum of squared errors:

$$\sum \varepsilon^2 = \sum (Y - \hat{Y})^2 = \sum (Y - a - bX)^2$$
$$(Y - a - bX)^2 = Y^2 + a^2 + b^2 X^2 - 2aY - 2bXY + 2abX$$

Least squares solves

$$\min \sum \varepsilon^2 = \min \sum (Y - a - bX)^2$$

How do we get the a and b that minimize the sum of squared errors?
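As a quick check that the algebra on the following slides matches a direct numerical minimization, here is a minimal Python sketch (my own illustration, not part of the original slides) that minimizes the sum of squared errors with scipy.optimize.minimize, using the Example 10-5 data introduced later:

```python
import numpy as np
from scipy.optimize import minimize

# Example 10-5 data (used later in these slides)
x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)

def sse(params):
    """Sum of squared errors for a candidate intercept a and slope b."""
    a, b = params
    return np.sum((y - a - b * x) ** 2)

# Start from an arbitrary guess and let the optimizer search for a and b
result = minimize(sse, x0=[0.0, 0.0])
a_hat, b_hat = result.x
print(a_hat, b_hat)   # approximately 81.05 and 0.964, matching the closed-form solution
```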

Least Squares Method 2


Linear algebraic solution: compute a and b so that the partial derivatives of the sum of squared errors with respect to a and b are equal to zero.

Partial derivative with respect to a:

$$\frac{\partial \sum (Y - a - bX)^2}{\partial a} = -2\sum (Y - a - bX) = 0$$
$$\sum Y - na - b\sum X = 0$$
$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = \bar{Y} - b\bar{X}$$

Least Squares Method 3


Take the partial derivative with respect to b and plug in the a obtained above, a = Ȳ − bX̄.

$$\frac{\partial \sum (Y - a - bX)^2}{\partial b} = -2\sum X(Y - a - bX) = 0$$
$$\sum XY - a\sum X - b\sum X^2 = 0$$

Substituting a = Ȳ − bX̄:

$$\sum XY - (\bar{Y} - b\bar{X})\sum X - b\sum X^2 = 0$$
$$\sum XY - \frac{\sum Y}{n}\sum X + b\,\frac{\sum X}{n}\sum X - b\sum X^2 = 0$$
$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$


Least Squares Method 4


The least squares method is an algebraic solution that minimizes the sum of squares of errors (the variance component of the error).

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{SP_{xy}}{SS_x}$$

$$a = \frac{\sum Y - b\sum X}{n} = \bar{Y} - b\bar{X}$$

An equivalent but not recommended formula for a:

$$a = \frac{\sum Y \sum X^2 - \sum X \sum XY}{n\sum X^2 - \left(\sum X\right)^2}$$

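Here is a minimal Python sketch of these closed-form estimators (my own illustration, not part of the original slides), computing b with both the raw-sum and the deviation-sum formulas and a as Ȳ − bX̄:

```python
import numpy as np

def ols_slope_intercept(x, y):
    """Simple-regression OLS estimates of intercept a and slope b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Raw-sum formula: b = (n*SumXY - SumX*SumY) / (n*SumX^2 - (SumX)^2)
    b_raw = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / \
            (n * np.sum(x ** 2) - np.sum(x) ** 2)

    # Deviation-sum formula: b = SP_xy / SS_x (algebraically identical)
    sp_xy = np.sum((x - x.mean()) * (y - y.mean()))
    ss_x = np.sum((x - x.mean()) ** 2)
    b_dev = sp_xy / ss_x

    a = y.mean() - b_dev * x.mean()    # a = Ybar - b*Xbar
    return a, b_dev, b_raw

# Example 10-5 data, used on the following slides
a, b, b_check = ols_slope_intercept([43, 48, 56, 61, 67, 70],
                                    [128, 120, 135, 143, 141, 152])
print(a, b, b_check)   # about 81.048, 0.9644, 0.9644
```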

OLS: Example 10-5 (1)


No    x     y      x-xbar  y-ybar  (x-xbar)(y-ybar)  (x-xbar)^2
1     43    128    -14.5   -8.5    123.25            210.25
2     48    120    -9.5    -16.5   156.75            90.25
3     56    135    -1.5    -1.5    2.25              2.25
4     61    143    3.5     6.5     22.75             12.25
5     67    141    9.5     4.5     42.75             90.25
6     70    152    12.5    15.5    193.75            156.25
Mean  57.5  136.5
Sum   345   819                    541.5             561.5

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{SP_{xy}}{SS_x} = \frac{541.5}{561.5} = .9644$$

$$a = \bar{Y} - b\bar{X} = 136.5 - .9644 \times 57.5 = 81.0481$$



OLS: Example 10-5 (2), NO!


No    x     y      xy      x^2
1     43    128    5504    1849
2     48    120    5760    2304
3     56    135    7560    3136
4     61    143    8723    3721
5     67    141    9447    4489
6     70    152    10640   4900
Mean  57.5  136.5
Sum   345   819    47634   20399

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} = \frac{6 \times 47634 - 345 \times 819}{6 \times 20399 - 345^2} = .964$$

$$a = \frac{\sum Y \sum X^2 - \sum X \sum XY}{n\sum X^2 - \left(\sum X\right)^2} = \frac{819 \times 20399 - 345 \times 47634}{6 \times 20399 - 345^2} = 81.048$$


OLS: Example 10-5 (3)

[Figure: scatter plot of the data with fitted values and the regression line Ŷ = 81.048 + .964X; x ranges from about 40 to 70 and y from about 120 to 150]

What Are a and b ?


a is an estimator of its parameter α
a is the intercept, the point where the regression line meets the y axis
b is an estimator of its parameter β
b is the slope of the regression line
b is constant regardless of the values of X
b is usually more important than a, since the slope is what researchers want to know


How to interpret b?
For a unit increase in x, the expected change in y is b, holding other things (variables) constant.
For a unit increase in x, we expect y to increase by b, holding other variables constant.
In this example: for a unit increase in x, we expect y to increase by .964, holding other variables constant.


Properties of OLS estimators

The outcome of the least squares method is the OLS parameter estimators a and b.
OLS estimators are linear
OLS estimators are unbiased (their expected value equals the parameter)
OLS estimators are efficient (smallest variance)
Gauss-Markov Theorem: among linear unbiased estimators, the least squares (OLS) estimator has minimum variance.
BLUE (best linear unbiased estimator)

Hypothesis Test of a and b

How reliable are the a and b we compute?

The t-test (a Wald test in general) can answer this
The test statistic is a standardized effect size (effect size / standard error)
The effect size is a − 0 and b − 0, where 0 is the hypothesized value; H0: α = 0, H0: β = 0
Degrees of freedom is N − K, where K is the number of regressors + 1 (the intercept counts)
How do we compute the standard error (standard deviation of the estimator)?

Variance of b (1)

b is a random variable that changes across samples.
b is a weighted sum (a linear combination) of the random variables Yᵢ:

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum (X - \bar{X})^2} = \frac{\sum (X - \bar{X})Y}{\sum (X - \bar{X})^2} = \sum w_i Y_i$$

$$\sum w_i Y_i = w_1 Y_1 + w_2 Y_2 + \dots + w_n Y_n, \qquad w_i = \frac{X_i - \bar{X}}{\sum (X_i - \bar{X})^2}$$

where the numerator identity follows from

$$\sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \bar{Y}\sum X - \bar{X}\sum Y + n\bar{X}\bar{Y} = \sum XY - n\bar{X}\bar{Y}$$
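A small numeric check of this identity (my own sketch, using the Example 10-5 data): applying the weights wᵢ to the observed y values reproduces the slope.

```python
import numpy as np

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)

# Weights w_i = (x_i - xbar) / SS_x
w = (x - x.mean()) / np.sum((x - x.mean()) ** 2)

b_from_weights = np.sum(w * y)      # b as a weighted sum of the Y_i
b_direct = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
print(b_from_weights, b_direct)     # both about 0.9644
```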

Variance of b (2)

The variance of Y (i.e., of the error) is σ², and Var(kY) = k²Var(Y) = k²σ² for a constant k.

$$b = \sum w_i Y_i, \qquad w_i = \frac{X_i - \bar{X}}{\sum (X_i - \bar{X})^2}$$

$$Var(b) = Var\left(\sum w_i Y_i\right) = w_1^2 Var(Y_1) + w_2^2 Var(Y_2) + \dots + w_n^2 Var(Y_n) = \sigma^2 \sum w_i^2$$

$$Var(b) = \sigma^2 \sum \left(\frac{X_i - \bar{X}}{\sum (X_i - \bar{X})^2}\right)^2 = \frac{\sigma^2 \sum (X_i - \bar{X})^2}{\left[\sum (X_i - \bar{X})^2\right]^2} = \frac{\sigma^2}{\sum (X_i - \bar{X})^2}$$

Variance of a

a = Ȳ − bX̄
Var(b) = σ²/SS_x, where SS_x = Σ(X − X̄)²
Var(ΣY) = Var(Y₁) + Var(Y₂) + … + Var(Yₙ) = nσ²

$$Var(a) = Var(\bar{Y} - b\bar{X}) = Var(\bar{Y}) + \bar{X}^2\,Var(b) - 2Cov(\bar{Y}, b\bar{X})$$

The covariance term is zero under the OLS assumptions, and

$$Var(\bar{Y}) = Var\left(\frac{1}{n}\sum Y_i\right) = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n}$$

so

$$Var(a) = \frac{\sigma^2}{n} + \bar{X}^2\,\frac{\sigma^2}{\sum (X_i - \bar{X})^2} = \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}\right]$$

Now, how do we compute the variance of Y, σ²?

Variance of Y or error

The estimated variance of Y is based on the residuals (errors), Y − Ŷ
The hat denotes an estimator of the parameter
Ŷ is the predicted value of Y (from a + bX); plug in x, given a and b, to get Ŷ
Since a regression model estimates K parameters (a and b in simple regression), the degrees of freedom is N − K
The numerator is SSE in the ANOVA table

$$s_e^2 = \frac{\sum (Y_i - \hat{Y}_i)^2}{N - K} = \frac{SSE}{N - K} = MSE$$

Illustration (1)
No    x     y      x-xbar  y-ybar  (x-xbar)(y-ybar)  (x-xbar)^2  yhat     (y-yhat)^2
1     43    128    -14.5   -8.5    123.25            210.25      122.52   30.07
2     48    120    -9.5    -16.5   156.75            90.25       127.34   53.85
3     56    135    -1.5    -1.5    2.25              2.25        135.05   0.00
4     61    143    3.5     6.5     22.75             12.25       139.88   9.76
5     67    141    9.5     4.5     42.75             90.25       145.66   21.73
6     70    152    12.5    15.5    193.75            156.25      148.55   11.87
Mean  57.5  136.5
Sum   345   819                    541.5             561.5                127.2876

$$SSE = \sum (Y_i - \hat{Y}_i)^2 = 127.2876, \qquad MSE = s_e^2 = \frac{SSE}{N - K} = \frac{127.2876}{6 - 2} = 31.8219$$

$$Var(b) = \frac{s_e^2}{\sum (X_i - \bar{X})^2} = \frac{31.8219}{561.5} = .0567, \qquad SE(b) = \sqrt{.0567} = .2381$$

$$Var(a) = s_e^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}\right] = 31.8219\left[\frac{1}{6} + \frac{57.5^2}{561.5}\right] = 13.8809^2, \qquad SE(a) = 13.8809$$
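A short Python sketch (my own, assuming the data above) that reproduces these quantities:

```python
import numpy as np

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2                       # K = 2 estimated parameters (a and b)

ss_x = np.sum((x - x.mean()) ** 2)
b = np.sum((x - x.mean()) * (y - y.mean())) / ss_x
a = y.mean() - b * x.mean()

y_hat = a + b * x
sse = np.sum((y - y_hat) ** 2)         # 127.2876
mse = sse / (n - k)                    # 31.8219, the estimate of sigma^2

se_b = np.sqrt(mse / ss_x)                              # about 0.2381
se_a = np.sqrt(mse * (1 / n + x.mean() ** 2 / ss_x))    # about 13.8809
print(sse, mse, se_b, se_a)
```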

Illustration (2): Test b

How do we test whether β is zero (no effect)?

Like y, α and β follow a normal distribution; a and b follow the t distribution
b = .9644, SE(b) = .2381, df = N − K = 6 − 2 = 4
Hypothesis Testing

1. H0: β = 0 (no effect), Ha: β ≠ 0 (two-tailed)
2. Significance level = .05, CV = 2.776, df = 6 − 2 = 4
3. TS = (.9644 − 0)/.2381 = 4.0510 ~ t(N − K)
4. TS (4.051) > CV (2.776), Reject H0
5. β (not b) is not zero. There is a significant impact of X on Y

95% confidence interval for β:

$$\hat{\beta} \pm t_{\alpha/2}\, se(\hat{\beta}) = \hat{\beta} \pm t_{\alpha/2}\, \frac{s_e}{\sqrt{\sum (X_i - \bar{X})^2}} = .9644 \pm 2.776 \times .2381 = (.303,\ 1.625)$$

Illustration (3): Test a

How do we test whether α is zero?

Like y, α and β follow a normal distribution; a and b follow the t distribution
a = 81.0481, SE(a) = 13.8809, df = N − K = 6 − 2 = 4
Hypothesis Testing

1. H0: α = 0, Ha: α ≠ 0 (two-tailed)
2. Significance level = .05, CV = 2.776
3. TS = (81.0481 − 0)/13.8809 = 5.8388 ~ t(N − K)
4. TS (5.839) > CV (2.776), Reject H0
5. α (not a) is not zero. The intercept is discernible from zero (significant intercept).

95% confidence interval for α:

$$\hat{\alpha} \pm t_{\alpha/2}\, se(\hat{\alpha}) = \hat{\alpha} \pm t_{\alpha/2}\, s_e\sqrt{\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}} = 81.0481 \pm 2.776 \times 13.8809 = (42.51,\ 119.58)$$
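A minimal sketch of both t-tests (my own illustration; the critical value and p-values come from scipy.stats.t):

```python
from scipy import stats

b, se_b = 0.9644, 0.2381
a, se_a = 81.0481, 13.8809
df = 6 - 2                                     # N - K

cv = stats.t.ppf(1 - 0.05 / 2, df)             # two-tailed critical value, about 2.776

for name, est, se in [("b", b, se_b), ("a", a, se_a)]:
    ts = (est - 0) / se                        # test statistic against H0: parameter = 0
    p = 2 * (1 - stats.t.cdf(abs(ts), df))     # two-tailed p-value
    lo, hi = est - cv * se, est + cv * se      # 95% confidence interval
    print(f"{name}: t = {ts:.3f}, p = {p:.4f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```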

Questions

How do we test H0: β₀ (α) = β₁ = β₂ = … = 0?

Remember that the t-test compares only two group means, while ANOVA compares more than two group means simultaneously.
The same logic applies in linear regression.
Construct the ANOVA table by partitioning the variance of Y; the F test examines the above H0.
The ANOVA table provides key information about a regression model.

Partitioning Variance of Y (1)


[Figure: scatter plot of the data with the fitted line Ŷ = 81.048 + .964X and the horizontal mean line Ȳ = 136.5, showing the deviation of an observation Yᵢ from both lines; x from about 40 to 70, y from about 120 to 150]

Partitioning Variance of Y (2)


$$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$$
$$\text{Total} = \text{Model} + \text{Residual (Error)}$$

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$
$$SST = SSM + SSE$$

$$SSM = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2, \qquad SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

$$s_e^2 = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{N - K} = \frac{SSE}{N - K} = MSE$$

$$SST = SS_y = \sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - n\bar{Y}^2$$

Partitioning Variance of Y (3)


Ŷ = 81.048 + .964X

No    x     y      yhat     (y-ybar)^2  (yhat-ybar)^2  (y-yhat)^2
1     43    128    122.52   72.25       195.54         30.07
2     48    120    127.34   272.25      83.94          53.85
3     56    135    135.05   2.25        2.09           0.00
4     61    143    139.88   42.25       11.39          9.76
5     67    141    145.66   20.25       83.94          21.73
6     70    152    148.55   240.25      145.32         11.87
Mean  57.5  136.5
Sum   345   819             649.5000    522.2124       127.2876
                            (SST)       (SSM)          (SSE)

e.g., 122.52 = 81.048 + .964 × 43 and 148.55 = 81.048 + .964 × 70
SST = SSM + SSE: 649.5 = 522.2 + 127.3

ANOVA Table
H0: all parameters are zero, β0 = β1 = 0
Ha: at least one parameter is not zero
CV is 12.22 for F(1,4); TS > CV, reject H0

Sources    Sum of Squares  DF    Mean Squares      F
Model      SSM             K-1   MSM = SSM/(K-1)   MSM/MSE
Residual   SSE             N-K   MSE = SSE/(N-K)
Total      SST             N-1

Sources    Sum of Squares  DF    Mean Squares      F
Model      522.2124        1     522.2124          16.41047
Residual   127.2876        4     31.8219
Total      649.5000        5

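A short Python sketch (my own) that rebuilds these ANOVA quantities from the example data and runs the F test:

```python
import numpy as np
from scipy import stats

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

sst = np.sum((y - y.mean()) ** 2)       # 649.5000
ssm = np.sum((y_hat - y.mean()) ** 2)   # 522.2124
sse = np.sum((y - y_hat) ** 2)          # 127.2876

msm = ssm / (k - 1)
mse = sse / (n - k)
f_stat = msm / mse                      # about 16.41
p_value = 1 - stats.f.cdf(f_stat, k - 1, n - k)
print(sst, ssm, sse, f_stat, p_value)
```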

R2 and Goodness-of-fit

Goodness-of-fit measures evaluate how well a regression model fits the data
The smaller the SSE, the better the model fits
The F test examines whether all parameters are zero (a large F and a small p-value indicate good fit)
R² (coefficient of determination) is SSM/SST; it measures how much of the overall variance of Y the model explains
R² = SSM/SST = 522.2/649.5 = .80
A large R² means the model fits the data well

Myth and Misunderstanding in R2

R² is the Karl Pearson correlation coefficient squared: r² = .8967² = .80
If a regression model includes many regressors, R² is less useful, if not useless
Adding any regressor always increases R², regardless of the relevance of the regressor
Adjusted R² gives a penalty for adding regressors: Adj. R² = 1 − [(N−1)/(N−K)](1−R²)
R² is not a panacea, although its interpretation is intuitive; if the intercept is omitted, R² is incorrect
Check the specification, F, SSE, and individual parameter estimators to evaluate your model; a model with a smaller R² can be better in some cases
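For instance, applying that adjustment to this example (N = 6, K = 2, R² ≈ .804; my own arithmetic, not shown on the original slide):

$$\text{Adj. } R^2 = 1 - \frac{N - 1}{N - K}\,(1 - R^2) = 1 - \frac{5}{4}\,(1 - .804) = .755$$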

Interpolation and Extrapolation

Confidence interval of E(Y|X), where x is within the range of the data x: interpolation
Confidence interval of Y|X, where x is beyond the range of the data x: extrapolation
Extrapolation involves a penalty and a danger, which widens the confidence interval; it is less reliable

$$\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_x}}$$

$$\hat{y} \pm t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_x}}$$
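A small Python sketch (my own, using the example data) of both intervals; the (x − x̄)² term shows how the interval widens as x moves away from the mean of the observed x values:

```python
import numpy as np
from scipy import stats

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2

ss_x = np.sum((x - x.mean()) ** 2)
b = np.sum((x - x.mean()) * (y - y.mean())) / ss_x
a = y.mean() - b * x.mean()
s_e = np.sqrt(np.sum((y - (a + b * x)) ** 2) / (n - k))
t_crit = stats.t.ppf(0.975, n - k)

def intervals(x0):
    """95% CI for the mean response and prediction interval for a new Y at x0."""
    y0 = a + b * x0
    half_mean = t_crit * s_e * np.sqrt(1 / n + (x0 - x.mean()) ** 2 / ss_x)
    half_pred = t_crit * s_e * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / ss_x)
    return (y0 - half_mean, y0 + half_mean), (y0 - half_pred, y0 + half_pred)

print(intervals(57.5))   # x at the mean of the data: narrowest intervals
print(intervals(90.0))   # extrapolation beyond the data: much wider intervals
```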
