
1 Clarity on Terminology

1.1 Setup
Let the observed sample set be {(x1 , y1 ), (x2 , y2 ), · · · , (xN , yN )}.

• X, Y are random variables that can take on any value (xi , yi ) within the range of the sample set.

• If θ̂ is an estimator of the estimand θ, then θ̂(x) is the estimate of θ at x [1]

• One variable is the independent (regressor, predictor) variable, typically X, and the other is the dependent (regressand, predicted) variable, typically Y

• We predict Y ; we do not estimate it. Prediction is different from estimation [2]

Because these sums occur frequently, let us define, for brevity,

Sxy = Syx = Σ_{i=1}^{N} (xi − x̄)(yi − ȳ)        constant   (1)

Sxx = Σ_{i=1}^{N} (xi − x̄)²        constant   (2)

Syy = Σ_{i=1}^{N} (yi − ȳ)²        constant   (3)

Do not confuse them with the sample standard deviation estimator S.
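These sums are easy to compute directly. A minimal Python sketch, using a made-up toy sample (the data below is illustrative, not from the notes):

```python
# Sums of squares and cross-products for a toy sample (illustrative data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sxy = sum of (xi - x_bar)(yi - y_bar); Sxx and Syy analogously.
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
S_yy = sum((y - y_bar) ** 2 for y in ys)

print(S_xy, S_xx, S_yy)  # ≈ 19.9, 10.0, 39.708
```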

1.2 Population Regression Function, PRF


Given a population (X, Y ), we hypothesize that the underlying population has a regression line as follows. The conditional expectation is

E(Y |x) = β0 + β1 x        PRF   (4)

The above equation is called the Population Regression Function, PRF. Including the error ε, the prediction of the dependent variable would be

Y = E(Y |x) + ε        Prediction   (5)

which is called the simple linear regression model for the population. E(Y |x) is often hypothetical because we would not know β0 , β1 unless we knew the population. We do not care about the distribution of Y (µY , σY²) here, as regression is always one-sided [3]. Regressing X on Y goes the other way around, but that is a similar story.

• RV (Parameters): ε(0, σ²), X(µ_X , σ²_X), Y |x(µ_{Y|x} , σ²_{Y|x})

[1] https://en.wikipedia.org/wiki/Estimator
[2] https://stats.stackexchange.com/a/17789/202481
[3] unless we standardize the dataset, which leads to symmetry and the correlation coefficient

• Other Main Parameters: β0 , β1

• All Parameters are constants (and typically unknown for the population)

• Distribution: ε is assumed to have the normal distribution N (0, σ²)

1.3 Sample Regression Function, SRF


1.3.1 Point Estimates from a single SRF
Given a sample set (X, Y ), we estimate that the underlying population has a regression line as follows.

Ŷ = β̂0 + β̂1 x        SRF, Estimator of E(Y |x), not of Y    (6)


ε̂ = Y − Ŷ Estimator of RV ε (7)

For a given sample point (xi , yi ) from the sample set (X, Y ), the fitted value and residual are

ŷi = Ŷ (xi ) = b0 + b1 xi        Fitted value, Estimate of E(Y |x) at xi    (8)

ε̂i = yi − ŷi        Residual, Estimate of RV ε at (xi , yi )    (9)

Using OLS,

β̂1 = Σ_{(x,y)} (x − X̄)(y − Ȳ) / Σ_x (x − X̄)²        Slope RV, Estimator of β1    (10)

β̂0 = Ȳ − β̂1 X̄        y-intercept RV, Estimator of β0    (11)

b1 = Σ_i (xi − x̄)(yi − ȳ) / Σ_i (xi − x̄)² = Sxy / Sxx        Slope constant, Estimate of β1    (12)

b0 = ȳ − b1 x̄        y-intercept constant, Estimate of β0    (13)

• β̂0 , β̂1 are estimators of β0 , β1 for any sample set; b0 , b1 are estimates of β0 , β1 for a given sample set

• Estimator(Estimate): ε̂(0, s²), X̂(x̄, s²_X), Ŷ (ŷ = ȳ, s²_{Y|x} = s²), β̂1 (b1 ), β̂0 (b0 )

• All Estimators are Random Variables; all Estimates are constants

• Distribution: ε̂ is assumed to have the normal distribution N (0, s²)
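Equations (12) and (13) can be sketched in Python on a made-up toy sample (illustrative data, not from the notes):

```python
# OLS point estimates b1 = Sxy/Sxx and b0 = y_bar - b1*x_bar (toy data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)

b1 = S_xy / S_xx          # slope estimate, eq. (12)
b0 = y_bar - b1 * x_bar   # intercept estimate, eq. (13)

print(b1, b0)  # ≈ 1.99, 0.05
```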

Error sum of squares or residual sum of squares, SSE:

SSE = Σ_{i=1}^{N} (yi − ŷi)² = Σ_{i=1}^{N} [yi − (b0 + b1 xi)]²        constant   (14)

Variance estimation of σ²:

S² = σ̂² = Σ_y (y − Ŷ )² / (n − 2)        RV, Variance Estimator of RV ε   (15)

where n − 2 is the degrees of freedom, because β̂0 , β̂1 must be calculated (in other words, β0 , β1 must be estimated) before the summation.

s² = Σ_{i=1}^{N} (yi − ŷi)² / (n − 2) = SSE / (n − 2)        constant, Variance Estimate of RV ε   (16)
• S² is the estimator of σ² for any sample set; s² is the estimate of σ² for a given sample set

• S² is an unbiased estimator of σ² (while S is not an unbiased estimator of σ).
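Equations (14)–(16) in code, again on a made-up toy sample (illustrative only):

```python
# SSE and the variance estimate s^2 = SSE/(n-2) for OLS (toy data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * x for x in xs]                      # fitted values, eq. (8)
SSE = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # eq. (14)
s2 = SSE / (n - 2)                                     # eq. (16)

print(SSE, s2)  # ≈ 0.107, 0.0357
```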

Total sum of squares, SST:

SST = Syy = Σ_{i=1}^{N} (yi − ȳ)²        constant   (17)

Coefficient of determination, r_d²: (written r_d² to differentiate it from Pearson's correlation coefficient r)

r_d² = 1 − SSE / SST        constant   (18)

0 ≤ r_d² ≤ 1   (19)

r² = r_d² , where r is the sample correlation coefficient   (20)
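The identity in (20) can be checked numerically; a minimal sketch on the same style of toy data (illustrative, not from the notes):

```python
# Coefficient of determination r_d^2 = 1 - SSE/SST, and the identity
# r^2 = r_d^2 for simple linear regression (toy data).
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
S_yy = sum((y - y_bar) ** 2 for y in ys)
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar

SSE = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
SST = S_yy
rd2 = 1 - SSE / SST                   # eq. (18)
r = S_xy / math.sqrt(S_xx * S_yy)     # sample correlation coefficient

print(rd2, r ** 2)  # both ≈ 0.9973, as eq. (20) says
```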

1.3.2 Estimates from Multiple SRFs


Here, we note that β̂1 is a random variable and ask: when we have multiple estimates, what are the point and interval estimates of the resulting distribution?

Estimand     β1      E(β̂1) = µ_β̂1      V (β̂1) = σ²_β̂1
Estimator    β̂1     µ̂_β̂1              σ̂²_β̂1
Estimate     b1                          s²_β̂1

Note that in the above table, for columns 2 and 3, the estimand is a parameter of the estimator β̂1 itself, not of β1 . That is, we are interested in the mean and variance of the estimator β̂1 .

Assumption: X is fixed for all sample sets, so only the corresponding Y is an RV.

β̂1 = Σ_{(x,y)} (x − X̄)(y − Ȳ) / Σ_x (x − X̄)² = Σ_y c y        Slope RV, Estimator of β1    (21)

c = (x − X̄) / Σ_x (x − X̄)²        constant   (22)

b1 = Σ_{i=1}^{N} (xi − x̄)(yi − ȳ) / Σ_{i=1}^{N} (xi − x̄)² = Σ_{i=1}^{N} ci yi        Slope constant, Estimate of β1    (23)

ci = (xi − x̄) / Σ_{i=1}^{N} (xi − x̄)²        constant   (24)

Because each Yi is normal (as the underlying ε is normal), β̂1 should also be normal.
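This sampling-distribution view can be checked by simulation: fix X, draw many samples of Y from a hypothetical population, and look at the empirical mean and variance of the resulting slopes. A minimal sketch, where β0, β1, σ and the design points are arbitrary assumptions:

```python
# Simulated sampling distribution of the slope estimator (multiple SRFs).
# beta0, beta1, sigma and xs are illustrative assumptions, not from the notes.
import random

random.seed(0)
beta0, beta1, sigma = 0.0, 2.0, 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # X fixed across samples, as assumed above
x_bar = sum(xs) / len(xs)
S_xx = sum((x - x_bar) ** 2 for x in xs)

slopes = []
for _ in range(20000):
    # One SRF: new Y values from the population model Y = b0 + b1*x + eps.
    ys = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in xs]
    y_bar = sum(ys) / len(ys)
    S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slopes.append(S_xy / S_xx)   # b1 for this sample

mean_b1 = sum(slopes) / len(slopes)
var_b1 = sum((b - mean_b1) ** 2 for b in slopes) / len(slopes)
# Expect mean_b1 ≈ beta1 = 2.0 and var_b1 ≈ sigma**2 / S_xx = 0.025,
# matching eqs. (25) and (26) below.
print(mean_b1, var_b1)
```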

Mean of β̂1 :

µ_β̂1 = E(β̂1) = β1        constant   (25)

Variance of β̂1 :

σ²_β̂1 = V (β̂1) = σ² / Σ_x (x − x̄)²        constant   (26)

S²_β̂1 = σ̂²_β̂1 = S² / Σ_x (x − x̄)²        RV, Variance Estimator of σ²_β̂1   (27)

s²_β̂1 = s² / Σ_{i=1}^{N} (xi − x̄)² = s² / Sxx        constant, Variance Estimate of σ²_β̂1   (28)

• S²_β̂1 is the estimator of σ²_β̂1 for any resulting sampling distribution of β̂1 (multiple SRFs)

• s²_β̂1 is the estimate of σ²_β̂1 for a given sampling distribution of β̂1 (multiple SRFs)

From here, confidence intervals and hypothesis-testing procedures for β1 can be built (the immediate next step would be to show that the standardized β̂1 has a t distribution with n − 2 degrees of freedom).
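A sketch of that next step: the standard error of the slope via eq. (28) and a 95% confidence interval for β1. The critical value 3.182 (two-sided 95%, df = n − 2 = 3) is hard-coded from a t table, and the toy data is illustrative, not from the notes:

```python
# Standard error of b1 and a 95% t confidence interval for beta1 (toy data).
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
S_xx = sum((x - x_bar) ** 2 for x in xs)
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar

SSE = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s2 = SSE / (n - 2)
se_b1 = math.sqrt(s2 / S_xx)   # square root of eq. (28)

t_crit = 3.182                 # t_{0.025} for df = 3, from a t table
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print(b1, se_b1, ci)           # slope ≈ 1.99, se ≈ 0.0597
```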

1.4 Correlation Coefficient


Given a sample set (X, Y ), the sample correlation coefficient is given by

r = Sxy / √(Sxx Syy)
