
Part 1 (Each question in this part is worth 5 points, 70 points total.)

You have collected data on two variables. The x-variable is the number of hours spent studying for an
exam, and the y-variable is the score on the exam. You have 24 observations.

You calculate the following values using Excel in order to calculate the OLS regression "by hand."

$\sum X_i = 180.5 \qquad \sum Y_i = 4485.3$

$\sum (X_i - \bar{X})^2 = 33.4 \qquad \sum (Y_i - \bar{Y})^2 = 15814.4$

$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = 353.1$

Also, in case this didn't make it onto your formula sheet, recall that it can be shown that the Regression
Sum of Squares (SSR) is: $SSR = \hat{\beta}_1^2 \sum (X_i - \bar{X})^2$. You'll want to use it.

1) What is $\bar{X}$?

$\bar{X} = \frac{\sum X_i}{n} = \frac{180.5}{24} = 7.52$

2) What is the sample variance of X?

$s_X^2 = \frac{\sum (X_i - \bar{X})^2}{n-1} = \frac{33.4}{23} = 1.45$

3) What is the sample variance of $\bar{X}$?

$s_{\bar{X}}^2 = \frac{s_X^2}{n} = \frac{1.45}{24} = .06$

4) What is the sample correlation coefficient of X and Y?

We need a few intermediate calculations to get to this one:

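$\mathrm{cov}(X,Y) = s_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n-1} = \frac{353.1}{23} = 15.35$

$s_X = \sqrt{s_X^2} = \sqrt{1.45} = 1.21$

$s_Y = \sqrt{s_Y^2} = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n-1}} = \sqrt{\frac{15814.4}{23}} = 26.22$

Thus the correlation coefficient is:

$r_{XY} = \frac{s_{XY}}{s_X s_Y} = \frac{15.35}{1.21 \times 26.22} = .49$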
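The calculations for questions 1 through 4 can be checked with a short script. This is a minimal sketch that assumes only the summary sums listed in the problem statement; the variable names are illustrative, and because it carries unrounded intermediate values, its results may differ from the key in the last digit.

```python
import math

# Summary statistics given in the problem statement.
n = 24
sum_x = 180.5    # sum of X_i
s_xx = 33.4      # sum of (X_i - X_bar)^2
s_yy = 15814.4   # sum of (Y_i - Y_bar)^2
s_xy = 353.1     # sum of (X_i - X_bar)(Y_i - Y_bar)

x_bar = sum_x / n                  # question 1: sample mean of X
var_x = s_xx / (n - 1)             # question 2: sample variance of X
var_x_bar = var_x / n              # question 3: variance of the sample mean
cov_xy = s_xy / (n - 1)            # sample covariance of X and Y
sd_x = math.sqrt(var_x)            # sample standard deviation of X
sd_y = math.sqrt(s_yy / (n - 1))   # sample standard deviation of Y
r_xy = cov_xy / (sd_x * sd_y)      # question 4: sample correlation

print(f"X_bar = {x_bar:.2f}, var(X) = {var_x:.2f}, var(X_bar) = {var_x_bar:.2f}")
print(f"cov(X, Y) = {cov_xy:.2f}, r = {r_xy:.2f}")
```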

5) What is the value of $\hat{\beta}_1$?

$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{353.1}{33.4} = 10.57$

6) Interpret, using words, the meaning of $\hat{\beta}_1$ in this particular regression.


As the number of hours of study increases by 1, the exam score is predicted to increase by 10.57
points.

7) What is the value of $\hat{\beta}_0$?

$\bar{Y} = \frac{4485.3}{24} = 186.89$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 186.89 - 10.57 \times 7.52 = 107.43$

8) Suppose one observation (call it observation 0) in the sample is $X_0 = 6$, $Y_0 = 174$. What is the value
of $\hat{\varepsilon}_0$?

The predicted value will be $\hat{Y}_0 = \hat{\beta}_0 + \hat{\beta}_1 X_0 = 107.43 + 10.57 \times 6 = 170.85$


The error is the difference between the actual and predicted values: 174-170.85=3.15
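
The slope, intercept, prediction, and residual from questions 5 through 8 can be reproduced as follows. This is a sketch under the assumption that the sums in the problem statement are used without intermediate rounding, so its results land close to, but not exactly on, the key's figures (107.43, 170.85, 3.15). The variable names are illustrative.

```python
# Summary statistics given in the problem statement.
n = 24
sum_x = 180.5   # sum of X_i
sum_y = 4485.3  # sum of Y_i
s_xx = 33.4     # sum of (X_i - X_bar)^2
s_xy = 353.1    # sum of (X_i - X_bar)(Y_i - Y_bar)

x_bar = sum_x / n
y_bar = sum_y / n

beta1_hat = s_xy / s_xx                  # OLS slope
beta0_hat = y_bar - beta1_hat * x_bar    # OLS intercept

# Observation 0 from question 8.
x0, y0 = 6, 174
y0_hat = beta0_hat + beta1_hat * x0      # predicted score
resid0 = y0 - y0_hat                     # actual minus predicted

print(f"slope = {beta1_hat:.2f}, intercept = {beta0_hat:.2f}")
print(f"prediction = {y0_hat:.2f}, residual = {resid0:.2f}")
```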

9) What is the value of $\bar{\hat{\varepsilon}}$? That is, the average of the sample regression errors.

Zero. The normal equation requires $\sum \hat{\varepsilon}_i = 0$. Divide this number by n to get the sample
average and it's still zero.

10) What is the value of $s_{\varepsilon}^2$? That is, the variance of the regression. (Hint: don't forget to calculate
SSE.)

SUBSTANTIAL ROUNDING ERRORS ARE GOING TO START TO CREEP IN NOW.


STUDENTS ARE NEVER TO BE PENALIZED FOR DOING THE RIGHT WORK BUT
GETTING NUMBERS A LITTLE DIFFERENT FROM WHAT I HAVE.

The clue to use $SSR = \hat{\beta}_1^2 \sum (X_i - \bar{X})^2$ is critical now. $SSR = 10.57^2 \times 33.42 = 3730.5$

SSE=SST-SSR=15814.37-3730.5=12083.87

Finally, $s_{\varepsilon}^2 = \frac{SSE}{n-2} = \frac{12083.87}{22} = 549.27$
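
The sum-of-squares decomposition in question 10 can be sketched as below. The script assumes the sums exactly as printed in the problem statement (33.4 and 15814.4) and keeps the slope unrounded, so its results differ slightly from the key's 3730.5, 12083.87, and 549.27, in line with the rounding caveat above. The variable names are illustrative.

```python
# Summary statistics given in the problem statement.
n = 24
s_xx = 33.4      # sum of (X_i - X_bar)^2
s_yy = 15814.4   # sum of (Y_i - Y_bar)^2, which is SST
s_xy = 353.1     # sum of (X_i - X_bar)(Y_i - Y_bar)

beta1_hat = s_xy / s_xx

ssr = beta1_hat ** 2 * s_xx   # regression sum of squares
sst = s_yy                    # total sum of squares
sse = sst - ssr               # error sum of squares
var_eps = sse / (n - 2)       # variance of the regression

print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, s_eps^2 = {var_eps:.2f}")
```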

11) What is the value of $s_{\hat{\beta}_1}^2$?

$s_{\hat{\beta}_1}^2 = \frac{s_{\varepsilon}^2}{\sum (X_i - \bar{X})^2} = \frac{549.27}{33.42} = 16.44$

12) What is the test statistic for a null hypothesis that $\beta_1 = 5$?

$t = \frac{\hat{\beta}_1 - \beta_1^0}{s_{\hat{\beta}_1}} = \frac{10.57 - 5}{\sqrt{16.44}} = 1.373$

13) Will you reject the null hypothesis at a level of $\alpha = .10$? Why or why not?

No. The critical value is larger than this test statistic. (1.717 is the critical value, to be precise.)

14) Calculate the 95% confidence interval for $\beta_1$.

$\beta_1 = \hat{\beta}_1 \pm t_{\alpha/2} \, s_{\hat{\beta}_1} = 10.57 \pm 2.074 \times \sqrt{16.44} = 10.57 \pm 8.41$
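
The test statistic, the rejection decision, and the confidence interval from questions 12 through 14 can be sketched together. This example assumes the key's rounded values for the slope and its variance, and it hard-codes the two critical values quoted in the key (1.717 and 2.074, for 22 degrees of freedom) rather than computing them. The variable names are illustrative.

```python
import math

# Rounded values taken from the answer key.
beta1_hat = 10.57      # estimated slope
var_beta1 = 16.44      # estimated variance of the slope
beta1_null = 5         # hypothesized slope under the null

# Critical values quoted in the key for 22 degrees of freedom.
t_crit_10 = 1.717      # two-sided test at alpha = .10
t_crit_05 = 2.074      # 95% confidence interval

se_beta1 = math.sqrt(var_beta1)
t_stat = (beta1_hat - beta1_null) / se_beta1
reject = abs(t_stat) > t_crit_10

margin = t_crit_05 * se_beta1
ci_lower = beta1_hat - margin
ci_upper = beta1_hat + margin

print(f"t = {t_stat:.3f}, reject at alpha = .10: {reject}")
print(f"95% CI: {beta1_hat} +/- {margin:.2f} = ({ci_lower:.2f}, {ci_upper:.2f})")
```

The interval contains the hypothesized value 5, which agrees with the decision not to reject.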
Part 2 (30 points total)

1) (10 points) Suppose a friend of yours is in another social science statistics class - far inferior to
ours, of course. He claims that in his class they don't care about this regression stuff but only
focus on the sample correlation coefficient to analyze the relationship between two variables.

What is a correlation-only type of analysis capable of doing? What will it not be able to do?
What advantages does the regression approach offer over a correlation-only approach? State
your answers in a clearly written paragraph (or two).

A correlation alone can tell you two things: The direction of the relationship between X and Y
(positive or negative), and the strength of that relationship. This latter point may not be obvious.
$R^2$ is a measure of how well the X variable "fits" the variable Y. $R^2$ is also the square of the sample
correlation coefficient. This implies that correlation alone can tell us how strong a
relationship there is between the two. [3 points total for getting either (or both) of the underlined
ideas. They don't need to use the exact same words. It's the idea that matters. If they come up
with other things that you believe are reasonable, you can give credit. These are what I can think
of.]

The correlation alone does not tell us the marginal effect of X on Y (how Y changes as you change
X). For that we need the regression slope term. Correlation also does not have any structure. You
don't have to go through the process of identifying a dependent or independent variable; it's that
process, remember, that gives a regression its interpretive value. Furthermore, with regression we
are able to offer predicted values of the dependent variable. Lastly, we can test the regression
estimators using appropriate t-tests for statistical significance. [4 points for getting one of the
underlined ideas, 3 points more for another. 7 points max. They don't need to use the exact same
words. It's the idea that matters. If they come up with other things that you believe are
reasonable, you can give credit. These are what I can think of.]

[It also is not necessary to divide the answers into two parts as I did. As long as the ideas come
through I don't care.]
2) (12 points total (3 points each part)) Compare the following two regressions:

i. $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

ii. $\frac{1}{4} Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

Equation i is the standard regression we've been working with thus far, so all the formulas we've
derived thus far apply. In equation ii the dependent variable has been multiplied by .25. How
does this change, if at all, the values of:

(You need to give us some indication of how you arrive at your answers. I recommend answering
these in the order given.)

Let a * denote the changed value of the terms we are finding for equation ii.

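a) $\hat{\beta}_1$

$\hat{\beta}_1 = \frac{s_{XY}}{s_X^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$

$\hat{\beta}_1^* = \frac{s_{XY}^*}{s_X^2} = \frac{\sum (X_i - \bar{X})(.25Y_i - .25\bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{.25 \sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$

Slope falls by a factor of 4. (Or is .25 of original.)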

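b) $\hat{\beta}_0$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

$\hat{\beta}_0^* = .25\bar{Y} - \hat{\beta}_1^* \bar{X} = .25(\bar{Y} - \hat{\beta}_1 \bar{X})$

Intercept falls by a factor of 4 (.25 of original).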


c) SSE

$SSE = \sum (\hat{Y}_i - Y_i)^2 = \sum (\hat{\beta}_0 + \hat{\beta}_1 X_i - Y_i)^2$

$SSE^* = \sum (\hat{Y}_i^* - Y_i^*)^2 = \sum (.25\hat{\beta}_0 + .25\hat{\beta}_1 X_i - .25Y_i)^2 = .25^2 \sum (\hat{\beta}_0 + \hat{\beta}_1 X_i - Y_i)^2 = \frac{1}{16} \sum (\hat{\beta}_0 + \hat{\beta}_1 X_i - Y_i)^2$

SSE decreases by a factor of 16. (.0625 of original)

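d) $R^2$

$R^2 = \frac{SSR}{SST} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

$R^{2*} = \frac{SSR^*}{SST^*} = \frac{\sum (.25\hat{Y}_i - .25\bar{Y})^2}{\sum (.25Y_i - .25\bar{Y})^2} = \frac{.25^2 \sum (\hat{Y}_i - \bar{Y})^2}{.25^2 \sum (Y_i - \bar{Y})^2} = R^2$

$R^2$ unchanged because all the Sum of Squares terms change by the same factor (1/16).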
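
The four scaling results can be checked numerically. This is a minimal sketch on a small hypothetical data set (the `x` and `y` values are made up for illustration and are not the exam data), with an illustrative helper `ols` that applies the same formulas used in this key.

```python
def ols(x, y):
    """Return (intercept, slope, SSE, R^2) for a simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, sse, ssr / sst

# Hypothetical data: hours studied and exam scores.
x = [2.0, 4.0, 5.0, 7.0, 9.0, 10.0]
y = [150.0, 170.0, 160.0, 190.0, 210.0, 200.0]
y_scaled = [0.25 * yi for yi in y]   # dependent variable multiplied by .25

b0, b1, sse, r2 = ols(x, y)
b0_s, b1_s, sse_s, r2_s = ols(x, y_scaled)

print(f"slope ratio     = {b1_s / b1:.4f}")
print(f"intercept ratio = {b0_s / b0:.4f}")
print(f"SSE ratio       = {sse_s / sse:.4f}")
print(f"R^2 difference  = {abs(r2_s - r2):.2e}")
```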

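3) (8 points) In the OLS regression of Y on X, what is the value of $\sum \hat{\varepsilon}_i \hat{Y}_i$? (Show your work.)

Zero.

$\sum \hat{\varepsilon}_i \hat{Y}_i = \sum \hat{\varepsilon}_i (\hat{\beta}_0 + \hat{\beta}_1 X_i) = \hat{\beta}_0 \sum \hat{\varepsilon}_i + \hat{\beta}_1 \sum \hat{\varepsilon}_i X_i = 0$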

Because the normal equations from the OLS derivation require that both
summation terms equal zero.

[4 points for getting the zero answer. But to get any more points the student MUST
expand the summation and use the normal equations to get the zeroes. Just saying
that the sum of errors is zero is not correct.]
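
The two normal equations and the resulting orthogonality of residuals and fitted values can be illustrated numerically. This sketch uses a small hypothetical data set (not the exam data), and the variable names are illustrative. Floating-point arithmetic leaves the sums at roughly machine precision rather than exactly zero.

```python
# Hypothetical data: hours studied and exam scores.
x = [2.0, 4.0, 5.0, 7.0, 9.0, 10.0]
y = [150.0, 170.0, 160.0, 190.0, 210.0, 200.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept from the usual formulas.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

sum_resid = sum(resid)                                       # first normal equation
sum_resid_x = sum(e * xi for e, xi in zip(resid, x))         # second normal equation
sum_resid_fitted = sum(e * f for e, f in zip(resid, fitted)) # b0 * first + b1 * second

print(f"sum of residuals             = {sum_resid:.2e}")
print(f"sum of residuals * X         = {sum_resid_x:.2e}")
print(f"sum of residuals * fitted Y  = {sum_resid_fitted:.2e}")
```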
