F. Reporting the Results of a Statistical Analysis STAT 840: Linear Regression
B. Justify transformations
C. Justify higher order terms
D. Justify model assumptions (e.g., linearity, normality, independence, etc.)
with
Yi = β0 + β1 Xi + εi,  i = 1, 2, ..., 60
where εi ∼ iid N(0, σ²) and β0, β1, σ² are the unknown parameters of interest.
II. Preliminary Analyses.
A. A scatterplot indicates that a model with a negative linear relationship
between age and muscle mass may be useful (r = −0.87).
A. One observation (53) has a studentized and studentized deleted residual
larger than 2. We are unable to determine with certainty any reasons for
excluding the observation. However, a sensitivity analysis shows that the
model fit is robust with respect to its inclusion (Appendix).
B. The assumptions of linearity, constant error variance, independence of error
terms, and normality of error terms are assessed via residual plots and
hypothesis tests. All model assumptions are reasonably met (Appendix).
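The two residual types mentioned in item A can be computed from their standard definitions. The following is a minimal sketch in plain Python, not the R code behind this report; the function name `studentized_residuals`, the toy data, and the use of |t| > 2 as the flagging rule are assumptions made for illustration.

```python
import math

def studentized_residuals(x, y):
    """Return (internally studentized, studentized deleted) residuals
    for the simple linear regression y = b0 + b1*x."""
    n, p = len(x), 2  # p = number of regression parameters (b0 and b1)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in resid)
    mse = sse / (n - p)
    # Leverage h_ii (diagonal of the hat matrix) for one predictor
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Internally studentized: residual divided by its estimated standard deviation
    internal = [e / math.sqrt(mse * (1 - h)) for e, h in zip(resid, lev)]
    # Studentized deleted: equivalent to refitting without case i
    deleted = [e * math.sqrt((n - p - 1) / (sse * (1 - h) - e ** 2))
               for e, h in zip(resid, lev)]
    return internal, deleted

# Hypothetical toy data (NOT the muscle mass data)
ages = [41, 48, 55, 62, 70, 77, 45, 66]
mass = [108, 99, 92, 80, 74, 63, 118, 79]
internal, deleted = studentized_residuals(ages, mass)

# Assumed flagging rule: absolute studentized deleted residual above 2
flagged = [i + 1 for i, t in enumerate(deleted) if abs(t) > 2]
```

The deleted residuals are obtained from a closed-form expression, so no model has to be refit for each case.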
III. Statistical Analysis. The lm() function in R was used to estimate the least
squares fit of the simple linear regression model. The final fitted model is given
by
Ŷ = 156.35 − 1.19X
and is shown in Figure 2 (solid line). The model MSE is 66.8. Women’s age
explains 75% of the variation in muscle mass (R² = 0.75).
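The quantities reported in this paragraph (intercept, slope, MSE, and R²) follow directly from the least-squares definitions. The sketch below shows those computations in plain Python rather than R; it is not the code used for the analysis, and the function name `fit_slr` and the toy data are hypothetical.

```python
def fit_slr(x, y):
    """Least-squares fit of y = b0 + b1*x; returns b0, b1, MSE and R^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                # slope estimate
    b0 = y_bar - b1 * x_bar       # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssto = sum((yi - y_bar) ** 2 for yi in y)
    mse = sse / (n - 2)           # error mean square on n - 2 degrees of freedom
    r2 = 1 - sse / ssto           # proportion of variation explained
    return {"b0": b0, "b1": b1, "mse": mse, "r2": r2}

# Hypothetical toy data (NOT the muscle mass data)
ages = [41, 48, 55, 62, 70, 77]
mass = [108, 99, 92, 80, 74, 63]
fit = fit_slr(ages, mass)
```

In R, the same estimates would normally be read from the fitted `lm` object rather than computed by hand.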
1. A two-sided t-test is used to choose between H0 : β1 = 0 and H1 : β1 ≠ 0
with a type I error rate of α = 0.05. Since b1 = −1.19 is more than 13
standard errors below the value expected under H0 : β1 = 0
(t(58) = −13.19, p < 0.001), H0 is unlikely to have produced the observed
data and is rejected. Thus, there is sufficient evidence of a linear
association between age and muscle mass.
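The reported statistics can be cross-checked against one another with a little arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the standard error it prints is implied by the rounded values rather than quoted from the analysis output.

```python
# Values reported in the text
b1 = -1.19        # estimated slope
t_stat = -13.19   # t statistic for H0: beta1 = 0
n = 60            # number of observations
r2 = 0.75         # reported R^2

# t = b1 / SE(b1), so the implied standard error is b1 / t
se_b1 = b1 / t_stat

# In simple linear regression t^2 equals the F statistic,
# and F = R^2 * (n - 2) / (1 - R^2)
f_from_t = t_stat ** 2
f_from_r2 = r2 * (n - 2) / (1 - r2)

print(round(se_b1, 4))      # 0.0902
print(round(f_from_t, 2))   # 173.98
print(round(f_from_r2, 2))  # 174.0
```

The two F values agree to within rounding, so the reported slope, t statistic, and R² are mutually consistent.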
Figure 2: A scatterplot of muscle mass and age including the least squares line (solid)
and 95% confidence band (dashed)
2. A 95% confidence band on the regression line is used to determine the
precision with which the line is estimated (Figure 2).³ It is apparent from the
figure that the regression line has been estimated fairly precisely. The slope
of the regression line is clearly negative, and the levels of the regression
3 The level of confidence indicates the proportion of time that the estimating procedure will yield
adjacent space.
8 An examination of the scatterplot of the data with the fitted line overlaid would be helpful here.
Figure 6: A scatterplot of residuals versus age with the expected value of 0 indicated
by a dotted line