
1 Clarity on Terminology

1.1 Setup
Let the observed sample set be {(x1 , y1 ), (x2 , y2 ), · · · , (xN , yN )}.

• X, Y are random variables that can take on any value (xi , yi ) within the range of the sample set.

• If θ̂ is an estimator of the estimand θ, then θ̂(x) is the estimate of θ at x [1]

• One variable is the independent (regressor, predictor) variable, typically X, and the other is the dependent (regressand, predicted) variable, typically Y

• We predict Y ; we do not estimate it. Prediction is different from estimation [2]

Because these sums occur frequently, let us define, for brevity,

Sxy = Syx = Σ_{i=1}^{N} (xi − x̄)(yi − ȳ)        constant   (1)

Sxx = Σ_{i=1}^{N} (xi − x̄)²        constant   (2)

Syy = Σ_{i=1}^{N} (yi − ȳ)²        constant   (3)

Do not confuse them with the sample standard deviation estimator S.
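These sums are easy to compute directly. A minimal Python sketch, using a made-up toy sample (the data below is illustrative, not from the notes):

```python
# Sums of squares and cross-products for a toy sample (illustrative data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sxy = sum of (xi - x_bar)(yi - y_bar); Sxx and Syy analogously.
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
S_yy = sum((y - y_bar) ** 2 for y in ys)

print(S_xy, S_xx, S_yy)  # ≈ 19.9, 10.0, 39.708
```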

1.2 Population Regression Function, PRF


Given a population (X, Y ), we hypothesize that the underlying population has a regression line as follows. The conditional expectation is

E(Y |x) = β0 + β1 x        PRF   (4)

The above equation is called the Population Regression Function, PRF. Including the error ε, the prediction of the dependent variable would be

Y = E(Y |x) + ε        Prediction   (5)

which is called the simple linear regression model for the population. E(Y |x) is often hypothetical because we would not know β0 , β1 unless we knew the population. We do not care about the distribution of Y (µY , σY²) here, as regression is always one-sided [3]. Regressing X on Y goes the other way around, but that is a similar story.

• RV (Parameters): ε(0, σ²), X(µ_X , σ²_X), Y |x(µ_{Y|x} , σ²_{Y|x})

[1] https://en.wikipedia.org/wiki/Estimator
[2] https://stats.stackexchange.com/a/17789/202481
[3] unless we standardize the dataset, which leads to symmetry and the correlation coefficient

• Other Main Parameters: β0 , β1

• All Parameters are constants (and typically unknown for the population)

• Distribution: ε is assumed to have the normal distribution N (0, σ²)

1.3 Sample Regression Function, SRF


1.3.1 Point Estimates from a single SRF
Given a sample set (X, Y ), we estimate that the underlying population has a regression line as follows.

Ŷ = β̂0 + β̂1 x        SRF, Estimator of E(Y |x), not of Y    (6)


ε̂ = Y − Ŷ Estimator of RV ε (7)

For a given sample point (xi , yi ) from the sample set (X, Y ), the fitted value and residual are

ŷi = Ŷ (xi ) = b0 + b1 xi        Fitted value, Estimate of E(Y |x) at xi    (8)

ε̂i = yi − ŷi        Residual, Estimate of RV ε at (xi , yi )    (9)

Using OLS,

β̂1 = Σ_{(x,y)} (x − X̄)(y − Ȳ) / Σ_x (x − X̄)²        Slope RV, Estimator of β1    (10)

β̂0 = Ȳ − β̂1 X̄        y-intercept RV, Estimator of β0    (11)

b1 = Σ_i (xi − x̄)(yi − ȳ) / Σ_i (xi − x̄)² = Sxy / Sxx        Slope constant, Estimate of β1    (12)

b0 = ȳ − b1 x̄        y-intercept constant, Estimate of β0    (13)

• β̂0 , β̂1 are estimators of β0 , β1 for any sample set; b0 , b1 are estimates of β0 , β1 for a given sample set

• Estimator(Estimate): ε̂(0, s²), X̂(x̄, s²_X), Ŷ (ŷ = ȳ, s²_{Y|x} = s²), β̂1 (b1 ), β̂0 (b0 )

• All Estimators are Random Variables; all Estimates are constants

• Distribution: ε̂ is assumed to have the normal distribution N (0, s²)
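Equations (12) and (13) can be sketched in Python on a made-up toy sample (illustrative data, not from the notes):

```python
# OLS point estimates b1 = Sxy/Sxx and b0 = y_bar - b1*x_bar (toy data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)

b1 = S_xy / S_xx          # slope estimate, eq. (12)
b0 = y_bar - b1 * x_bar   # intercept estimate, eq. (13)

print(b1, b0)  # ≈ 1.99, 0.05
```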

Error sum of squares or residual sum of squares, SSE:

SSE = Σ_{i=1}^{N} (yi − ŷi)² = Σ_{i=1}^{N} [yi − (b0 + b1 xi)]²        constant   (14)

Variance estimation of σ²:

S² = σ̂² = Σ_y (y − Ŷ )² / (n − 2)        RV, Variance Estimator of RV ε   (15)

where n − 2 is the degrees of freedom, because β̂0 , β̂1 must be calculated (in other words, β0 , β1 must be estimated) before the summation.

s² = Σ_{i=1}^{N} (yi − ŷi)² / (n − 2) = SSE / (n − 2)        constant, Variance Estimate of RV ε   (16)
• S² is the estimator of σ² for any sample set; s² is the estimate of σ² for a given sample set

• S² is an unbiased estimator of σ² (while S is not an unbiased estimator of σ).
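Equations (14)–(16) in code, again on a made-up toy sample (illustrative only):

```python
# SSE and the variance estimate s^2 = SSE/(n-2) for OLS (toy data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * x for x in xs]                      # fitted values, eq. (8)
SSE = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # eq. (14)
s2 = SSE / (n - 2)                                     # eq. (16)

print(SSE, s2)  # ≈ 0.107, 0.0357
```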

Total sum of squares, SST:

SST = Syy = Σ_{i=1}^{N} (yi − ȳ)²        constant   (17)

Coefficient of determination, r_d²: (written r_d² to differentiate it from Pearson's correlation coefficient r)

r_d² = 1 − SSE / SST        constant   (18)

0 ≤ r_d² ≤ 1   (19)

r² = r_d² , where r is the sample correlation coefficient   (20)
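The identity in (20) can be checked numerically; a minimal sketch on the same style of toy data (illustrative, not from the notes):

```python
# Coefficient of determination r_d^2 = 1 - SSE/SST, and the identity
# r^2 = r_d^2 for simple linear regression (toy data).
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
S_yy = sum((y - y_bar) ** 2 for y in ys)
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar

SSE = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
SST = S_yy
rd2 = 1 - SSE / SST                   # eq. (18)
r = S_xy / math.sqrt(S_xx * S_yy)     # sample correlation coefficient

print(rd2, r ** 2)  # both ≈ 0.9973, as eq. (20) says
```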

1.3.2 Estimates from Multiple SRFs


Here, we note that β̂1 is a random variable and ask: when we have multiple estimates, what are the point and interval estimates of the resulting distribution?

Estimand     β1      E(β̂1) = µ_β̂1      V (β̂1) = σ²_β̂1
Estimator    β̂1     µ̂_β̂1              σ̂²_β̂1
Estimate     b1                          s²_β̂1

Note that in the above table, for columns 2 and 3, the estimand is a parameter of the estimator β̂1 itself, not of β1 . That is, we are interested in the mean and variance of the estimator β̂1 .

Assumption: X is fixed for all sample sets, so only the corresponding Y is an RV.

β̂1 = Σ_{(x,y)} (x − X̄)(y − Ȳ) / Σ_x (x − X̄)² = Σ_y c y        Slope RV, Estimator of β1    (21)

c = (x − X̄) / Σ_x (x − X̄)²        constant   (22)

b1 = Σ_{i=1}^{N} (xi − x̄)(yi − ȳ) / Σ_{i=1}^{N} (xi − x̄)² = Σ_{i=1}^{N} ci yi        Slope constant, Estimate of β1    (23)

ci = (xi − x̄) / Σ_{i=1}^{N} (xi − x̄)²        constant   (24)

Because each Yi is normal (as the underlying ε is normal), β̂1 should also be normal.
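This sampling-distribution view can be checked by simulation: fix X, draw many samples of Y from a hypothetical population, and look at the empirical mean and variance of the resulting slopes. A minimal sketch, where β0, β1, σ and the design points are arbitrary assumptions:

```python
# Simulated sampling distribution of the slope estimator (multiple SRFs).
# beta0, beta1, sigma and xs are illustrative assumptions, not from the notes.
import random

random.seed(0)
beta0, beta1, sigma = 0.0, 2.0, 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # X fixed across samples, as assumed above
x_bar = sum(xs) / len(xs)
S_xx = sum((x - x_bar) ** 2 for x in xs)

slopes = []
for _ in range(20000):
    # One SRF: new Y values from the population model Y = b0 + b1*x + eps.
    ys = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in xs]
    y_bar = sum(ys) / len(ys)
    S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slopes.append(S_xy / S_xx)   # b1 for this sample

mean_b1 = sum(slopes) / len(slopes)
var_b1 = sum((b - mean_b1) ** 2 for b in slopes) / len(slopes)
# Expect mean_b1 ≈ beta1 = 2.0 and var_b1 ≈ sigma**2 / S_xx = 0.025,
# matching eqs. (25) and (26) below.
print(mean_b1, var_b1)
```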

Mean of β̂1 :

µ_β̂1 = E(β̂1) = β1        constant   (25)

Variance of β̂1 :

σ²_β̂1 = V (β̂1) = σ² / Σ_x (x − x̄)²        constant   (26)

S²_β̂1 = σ̂²_β̂1 = S² / Σ_x (x − x̄)²        RV, Variance Estimator of σ²_β̂1   (27)

s²_β̂1 = s² / Σ_{i=1}^{N} (xi − x̄)² = s² / Sxx        constant, Variance Estimate of σ²_β̂1   (28)

• S²_β̂1 is the estimator of σ²_β̂1 for any resulting sampling distribution of β̂1 (multiple SRFs)

• s²_β̂1 is the estimate of σ²_β̂1 for a given sampling distribution of β̂1 (multiple SRFs)

From here, confidence intervals and hypothesis-testing procedures for β1 can be built (the immediate next step would be to show that the standardized β̂1 has a t distribution with n − 2 degrees of freedom).
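A sketch of that next step: the standard error of the slope via eq. (28) and a 95% confidence interval for β1. The critical value 3.182 (two-sided 95%, df = n − 2 = 3) is hard-coded from a t table, and the toy data is illustrative, not from the notes:

```python
# Standard error of b1 and a 95% t confidence interval for beta1 (toy data).
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar

SSE = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s2 = SSE / (n - 2)
se_b1 = math.sqrt(s2 / S_xx)   # square root of eq. (28)

t_crit = 3.182                 # t_{0.025} for df = 3, from a t table
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print(b1, se_b1, ci)           # slope ≈ 1.99, se ≈ 0.0597
```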

1.4 Correlation Coefficient


Given a sample set (X, Y ), the sample correlation coefficient is given by

r = Sxy / √(Sxx Syy)
