
Review for Final Exam

New Material ONLY

Tables Provided: Z table (Table III from the book),

New Material:

Chapter 12

1. Be able to determine which variable is the explanatory variable and which is the response
variable in a given situation.

2. Be able to interpret a scatterplot in terms of pattern, direction, strength, outliers, and
constant variance.

3. Be able to state the model for linear regression, defining all of the terms and the assumptions
for $\varepsilon$:
$Y = \beta_0 + \beta_1 X + \varepsilon$

4. Be able to state the four assumptions for linear regression.
a) Given the appropriate graphs, determine if the assumptions are met (you will need to
determine which graphs are appropriate for which assumptions)
b) The assumptions are SRS, linearity, normality of the residuals, constant variance of the
residuals.

5. Be able to calculate SXX, SXY, and SYY from the summations:
$S_{XX} = \sum x^2 - \frac{(\sum x)^2}{n}$, $S_{XY} = \sum xy - \frac{(\sum x)(\sum y)}{n}$, $S_{YY} = \sum y^2 - \frac{(\sum y)^2}{n}$

6. Be able to calculate and write down the least squares regression line:
$\hat{\beta}_1 = b_1 = \frac{S_{XY}}{S_{XX}}$, $\hat{\beta}_0 = b_0 = \bar{y} - b_1 \bar{x}$
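The calculations in objectives 5 and 6 can be sketched in a few lines of Python. The data values below are made up purely for illustration (they are not from the course), and the variable names are assumptions of this sketch.

```python
# Illustrative data (made up for this sketch, not from the course).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# The five summations that the formulas need.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Objective 5: SXX, SXY, SYY from the summations.
sxx = sum_x2 - sum_x ** 2 / n
sxy = sum_xy - sum_x * sum_y / n
syy = sum_y2 - sum_y ** 2 / n

# Objective 6: slope and intercept of the least squares line.
x_bar = sum_x / n
y_bar = sum_y / n
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

print(f"SXX = {sxx:.3f}, SXY = {sxy:.3f}, SYY = {syy:.3f}")
print(f"y-hat = {b0:.3f} + {b1:.3f} x")
```

Working from the summations rather than from the deviations mirrors how the quantities are usually given on an exam.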

7. Be able to write down the least squares regression line from computer output.

8. Be able to make a point prediction using the regression line.
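A point prediction simply plugs x* into the fitted line. The sketch below assumes a hypothetical fitted line (b0 = 2.2, b1 = 0.6) and a hypothetical observed x range of 1 to 5; the `predict` helper and its extrapolation flag are illustrative names, not part of any course software.

```python
def predict(x_star, b0, b1, x_min, x_max):
    """Return (y_hat, is_extrapolation) for the line y-hat = b0 + b1 * x.

    x_min and x_max are the smallest and largest observed x values;
    a prediction outside that range is flagged as extrapolation.
    """
    y_hat = b0 + b1 * x_star
    is_extrapolation = not (x_min <= x_star <= x_max)
    return y_hat, is_extrapolation


# Hypothetical fitted line and data range (illustrative values only).
y_hat, extrap = predict(4.0, b0=2.2, b1=0.6, x_min=1.0, x_max=5.0)
print(f"x* = 4.0: y-hat = {y_hat:.2f}, extrapolation = {extrap}")

y_far, extrap_far = predict(12.0, b0=2.2, b1=0.6, x_min=1.0, x_max=5.0)
print(f"x* = 12.0: y-hat = {y_far:.2f}, extrapolation = {extrap_far}")
```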

9. Be able to calculate the ANOVA table for linear regression by hand (fill in the boxes and
know the equations). The sums of squares can be calculated from the slope, SXY, and SYY.

Source       df       SS (Sum of Squares)                              MS (Mean Square)
Regression   1        $SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$   $MSR = SSR / df_r$
Error        n - 2    $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$       $MSE = SSE / df_e$
Total        n - 1    $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$

a) SST = SYY
b) SSR = b1 SXY
c) SST = SSR + SSE
d) dft = dfr + dfe
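As a rough sketch of how the shortcuts in a) through d) fill in the ANOVA table, the function below builds the table from n, SXX, SXY, and SYY. The function name, the dictionary layout, and the input values are assumptions made for illustration only.

```python
def anova_table(n, sxx, sxy, syy):
    """Fill in the simple linear regression ANOVA table from summary quantities."""
    b1 = sxy / sxx            # slope of the least squares line
    sst = syy                 # a) SST = SYY
    ssr = b1 * sxy            # b) SSR = b1 * SXY
    sse = sst - ssr           # c) SST = SSR + SSE, solved for SSE
    df_r, df_e = 1, n - 2
    df_t = df_r + df_e        # d) dft = dfr + dfe = n - 1
    return {
        "Regression": {"df": df_r, "SS": ssr, "MS": ssr / df_r},
        "Error": {"df": df_e, "SS": sse, "MS": sse / df_e},
        "Total": {"df": df_t, "SS": sst},
    }


# Hypothetical summary values (illustrative only).
table = anova_table(n=5, sxx=10.0, sxy=6.0, syy=6.0)
for source, row in table.items():
    print(source, row)
```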

10. Be able to calculate and interpret the standard deviation about the least squares line (point
estimate):
$\hat{\sigma} = s = \sqrt{MSE}$

11. Be able to calculate the coefficient of determination from the ANOVA table, write down the
value from the output, and know how the question is phrased:
$R^2 = \frac{SSR}{SST}$

12. Be able to interpret $R^2$, including what it doesn't tell you:
a) linearity
b) outliers
c) good prediction

13. Be able to determine if a model utility test can be used or if only a t-test is possible.

14. Be able to calculate the test statistic for the model utility test (F-test) from the following
formula and determine what the value is from the output:
$F_{ts} = \frac{MSR}{MSE}$

15. Be able to write down the degrees of freedom for the F-test statistic from the ANOVA table,
the computer output, or direct calculation.
df1 = dfr = 1, df2 = dfe = n – 2
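Objectives 10, 11, 14, and 15 all read their inputs off the ANOVA table. A minimal sketch, assuming hypothetical values of SSR, SSE, and n (illustrative only):

```python
import math

# Hypothetical ANOVA entries (illustrative only, not course data).
n = 5
ssr = 3.6   # regression sum of squares
sse = 2.4   # error sum of squares

sst = ssr + sse              # SST = SSR + SSE
df_r, df_e = 1, n - 2        # df1 = dfr = 1, df2 = dfe = n - 2
msr = ssr / df_r
mse = sse / df_e

s = math.sqrt(mse)           # standard deviation about the line
r_squared = ssr / sst        # coefficient of determination
f_ts = msr / mse             # model utility (F) test statistic

print(f"s = {s:.4f}, R^2 = {r_squared:.3f}")
print(f"F = {f_ts:.3f} with df1 = {df_r}, df2 = {df_e}")
```

The p-value itself still has to come from the regression output or from an F distribution function, which this sketch does not attempt to reproduce.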

16. Be able to determine the p-value from the linear regression output and the code that is used
to calculate it directly from the test statistic and degrees of freedom.

17. Be able to perform the hypothesis test for association (model utility test) using the complete
four steps from class.

18. Be able to calculate and interpret the confidence interval for $\beta_1$ given the appropriate
values/output. The output could consist of the values of the individual variables or the interval
itself.
$b_1 \pm t_{\alpha/2,\,df} \, SE_{b_1} = b_1 \pm t_{\alpha/2,\,df} \sqrt{\frac{MSE}{S_{XX}}}$
df = dfe = n - 2

19. Be able to state the complete four steps for the hypothesis test for $\beta_1$. The test statistic is
determined algebraically and/or via computer output.
$t_{ts} = \frac{b_1 - \beta_{10}}{SE_{b_1}}$, $SE_{b_1} = \sqrt{\frac{MSE}{S_{XX}}}$
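The slope formulas in objectives 18 and 19 share the same standard error, which the sketch below makes explicit. The input numbers are hypothetical, and the critical value 3.182 is the standard t-table entry for a 95% interval with df = 3; the function name and its parameters are assumptions of this sketch.

```python
import math


def slope_inference(b1, mse, sxx, t_crit, beta10=0.0):
    """Return (SE of b1, t test statistic, confidence interval) for the slope.

    t_crit is the critical value t_{alpha/2, df} looked up from a t table
    with df = n - 2; beta10 is the null-hypothesis value of the slope.
    """
    se_b1 = math.sqrt(mse / sxx)
    t_ts = (b1 - beta10) / se_b1
    margin = t_crit * se_b1
    return se_b1, t_ts, (b1 - margin, b1 + margin)


# Hypothetical values (illustrative only): n = 5, so df = 3.
se_b1, t_ts, ci = slope_inference(b1=0.6, mse=0.8, sxx=10.0, t_crit=3.182)
print(f"SE(b1) = {se_b1:.4f}, t = {t_ts:.4f}")
print(f"95% CI for beta1: ({ci[0]:.3f}, {ci[1]:.3f})")
```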

20. Be able to state the similarities and differences between objectives 17, 18, and 19.
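One similarity can be checked algebraically. Assuming the t-test in objective 19 uses the null value $\beta_{10} = 0$, the squared t statistic equals the F statistic of the model utility test. A short derivation, using only the formulas from objectives 6, 9, 14, and 19:

```latex
\begin{aligned}
t_{ts}^2 &= \frac{b_1^2}{SE_{b_1}^2} = \frac{b_1^2 \, S_{XX}}{MSE} \\
         &= \frac{b_1 \, S_{XY}}{MSE} \qquad \text{since } b_1 S_{XX} = S_{XY} \\
         &= \frac{SSR}{MSE} = \frac{MSR}{MSE} = F_{ts} \qquad \text{since } SSR = b_1 S_{XY} \text{ and } df_r = 1
\end{aligned}
```

Both tests use the error degrees of freedom n - 2, so in simple linear regression the two-sided t-test of $\beta_1 = 0$ and the F-test lead to the same p-value. The confidence interval in objective 18 is built from the same $SE_{b_1}$.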

21. Be able to calculate the sample correlation:
a) $r = \frac{S_{XY}}{\sqrt{S_{XX} S_{YY}}}$
b) $r = \pm\sqrt{r^2}$, taking the sign of $b_1$ (the slope)
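The two routes to r in a) and b) should agree, which a quick check with hypothetical summary values (illustrative only) can confirm:

```python
import math

# Hypothetical summary quantities (illustrative only).
sxx, sxy, syy = 10.0, 6.0, 6.0

# a) Direct formula for the sample correlation.
r_direct = sxy / math.sqrt(sxx * syy)

# b) Square root of the coefficient of determination, signed like the slope.
b1 = sxy / sxx
r_squared = (b1 * sxy) / syy          # SSR / SST with SSR = b1 * SXY, SST = SYY
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

print(f"r (direct) = {r_direct:.4f}")
print(f"r (from r^2 and the sign of b1) = {r_from_r2:.4f}")
```

The sign has to be restored from the slope because the square root alone is never negative.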

22. Be able to interpret r:
a) What happens when you switch X and Y
b) Sign of r
c) What is meant by uncorrelated (r = 0)
d) How r relates to the association of X and Y
e) Remember that correlation doesn't provide information on form.

23. Be able to determine and explain when you cannot use linear regression because of
extrapolation or other reasons.

24. Be able to determine in a particular situation if there is causality in linear regression. Stating
that association does not mean causality does not completely answer the question.

25. Be able to calculate and interpret the confidence interval for the mean value at x = x*, either
given the value of the standard error or via output.
a) The estimated value is $\hat{\mu}_{x^*}$.
b) $SE_{\hat{\mu}^*} = \sqrt{MSE \left[ \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}} \right]}$
Even though you will not be performing any calculations using the SE, you still need to
know the formula.
c) df = n - 2

26. Be able to calculate and interpret the prediction interval at x = x*, either given the value of
the standard error or via output.
a) The estimated value is $\hat{y}_{x^*}$.
b) $SE_{\hat{y}^*} = \sqrt{MSE \left[ 1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}} \right]}$
Even though you will not be performing any calculations using the SE, you still need to
know the formula.
c) df = n - 2

27. Be able to state the difference between the confidence interval for the mean response at
x = x* and the prediction interval for a particular value and when each interval would be
used.
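The two standard errors in objectives 25 and 26 differ only by the extra "1 +" term, so the prediction interval is always wider than the confidence interval for the mean response at the same x*. A minimal sketch with hypothetical inputs (illustrative only; the function names are assumptions):

```python
import math


def se_mean_response(mse, n, x_star, x_bar, sxx):
    """Standard error for the mean response at x = x*."""
    return math.sqrt(mse * (1 / n + (x_star - x_bar) ** 2 / sxx))


def se_prediction(mse, n, x_star, x_bar, sxx):
    """Standard error for predicting one new observation at x = x*."""
    return math.sqrt(mse * (1 + 1 / n + (x_star - x_bar) ** 2 / sxx))


# Hypothetical values (illustrative only).
mse, n, x_bar, sxx = 0.8, 5, 3.0, 10.0
x_star = 4.0

se_mean = se_mean_response(mse, n, x_star, x_bar, sxx)
se_pred = se_prediction(mse, n, x_star, x_bar, sxx)

print(f"SE for the mean response at x* = {x_star}: {se_mean:.4f}")
print(f"SE for a single prediction at x* = {x_star}: {se_pred:.4f}")
```

Both standard errors grow as x* moves away from the sample mean of x, which is one reason extrapolation is risky.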

28. Given a particular model, determine if it can be used for multiple linear regression and state
the reason why or why not.

29. Given the R output, be able to write down the equation for the line using multiple linear
regression.

30. Given the R output, be able to perform the significance test to determine if there is an
association with any of the explanatory variables using multiple linear regression.
a) F test or model utility test

31. Given the R output, be able to determine which of the explanatory variables are linearly
associated with the response variable and perform the appropriate t tests.
a) t test with df = dfe.

General Inference Questions

32. For a specific scenario, be able to identify the best inference method to use.

33. Be able to summarize the information in English sentences for all types of inference.

34. Be able to determine how far a specific situation can be generalized and under what
conditions.

35. Be able to determine the practical consequences of each inference.

Midterms 1 and 2:

See the objectives for Midterms 1 and 2 for a complete list of the objectives. We will be going
over specifics in class on Friday.
