Introducing Regression
    Introducing the Regression Line
    The Uses of Regression
    Calculating the Regression Line
    The Accuracy of a Line
    Identifying the Regression Line
Refining Regression
    Quantifying the Predictive Power of Regression
    Residual Analysis
    Interpreting the Regression Coefficients
    Revisiting R² and p
Notes
Residual Analysis
A complete regression analysis should include a careful inspection of the residuals. Plot the residuals against the independent variable to reveal patterns in the distribution of the residuals.
Residual = y − ŷ (the observed value of the dependent variable minus the value predicted by the regression line)
Plot the residuals against the independent variable to reveal patterns.
If the underlying relationship is linear, the residuals follow a normal distribution with mean zero and constant variance.
Residual plots reveal nonlinear relationships and heteroskedasticity.
Heteroskedasticity: the variance of the residual distribution changes with the value of the independent variable.
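The residual check described above can be sketched in a few lines of Python. This is a minimal illustration, not a full diagnostic: the data are hypothetical (a line y = 2x with noise that deliberately grows with x), and the helper names `fit_line`, `residuals`, and `variance_by_half` are made up for this example. Comparing the residual variance in the lower and upper half of the x values is only a crude stand-in for inspecting a residual plot.

```python
from statistics import mean, pvariance

def fit_line(x, y):
    """Least-squares slope and intercept for paired data."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Residual = observed y minus the y predicted by the fitted line."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def variance_by_half(x, res):
    """Residual variance for the lower and upper half of the x values."""
    paired = sorted(zip(x, res))
    mid = len(paired) // 2
    low = [r for _, r in paired[:mid]]
    high = [r for _, r in paired[mid:]]
    return pvariance(low), pvariance(high)

# Hypothetical data: y = 2x plus noise whose size grows with x.
x = list(range(1, 11))
y = [2 * xi + 0.2 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]

res = residuals(x, y)
var_low, var_high = variance_by_half(x, res)
print("mean residual:", mean(res))
print("variance for low x:", var_low, "variance for high x:", var_high)
```

A clearly larger variance for high x than for low x, as in this constructed data set, is the pattern the notes call heteroskedasticity.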
Interpreting the Regression Coefficients
Regression line coefficients are estimates based on sample data. Confidence intervals specify a range of likely values for each coefficient. If the true slope is 0, there is no linear relationship between the variables.
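Hypothesis test:
o The reported p-value is the probability of obtaining a coefficient estimate at least as far from zero as the one observed if the true coefficient were zero.
o p-value < 0.05: evidence of a linear relationship at the 5% significance level.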
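As a rough illustration of the confidence interval and hypothesis test for the slope, the sketch below uses only the Python standard library. Two assumptions are worth flagging: the data are hypothetical (a line with slope 0.5 plus a deterministic wiggle standing in for noise), and the interval and p-value use a normal approximation, which is reasonable only for larger samples. Statistical packages typically use a t distribution with n − 2 degrees of freedom instead. The function name `slope_inference` is made up for this example.

```python
from math import sin, sqrt
from statistics import NormalDist

def slope_inference(x, y, level=0.95):
    """Slope estimate, approximate confidence interval, and two-sided p-value.

    Uses a normal approximation (an assumption of this sketch); the exact
    small-sample test uses a t distribution with n - 2 degrees of freedom.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Sum of squared residuals and standard error of the slope.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = sqrt(sse / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (slope - z * se, slope + z * se)
    p_value = 2 * (1 - NormalDist().cdf(abs(slope / se)))
    return slope, ci, p_value

# Hypothetical data: true slope 0.5, with a deterministic wiggle as "noise".
x = list(range(40))
y = [3 + 0.5 * xi + 2 * sin(1.7 * xi) for xi in x]

slope, ci, p_value = slope_inference(x, y)
print("slope:", slope)
print("95% interval:", ci)
print("p-value:", p_value)
```

If the interval excludes 0 and the p-value is below 0.05, the sample gives evidence of a linear relationship in the sense described above.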
Revisiting R² and p
The p-value and R² provide different information. A linear relationship can be significant but not explain a large percentage of the variation, so having a low p-value does not ensure a high R². Sample size is an important determinant of regression accuracy: as with all sampling, larger samples give more accurate estimates.
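A small simulation can make the difference between significance and explanatory power concrete. The sketch below is an illustration under stated assumptions: the data are simulated (a weak slope of 0.05 buried in noise with standard deviation 1, 5000 points, fixed seed), the p-value uses a normal approximation that is acceptable at this sample size, and `r2_and_p` is a hypothetical helper name.

```python
import random
from math import sqrt
from statistics import NormalDist

def r2_and_p(x, y):
    """R-squared and an approximate two-sided p-value for the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)
    # Test statistic for the slope; normal approximation assumed (large n).
    t = sqrt(r2 * (n - 2) / (1 - r2))
    p_value = 2 * (1 - NormalDist().cdf(t))
    return r2, p_value

# Simulated data: a real but weak linear effect, many observations.
rng = random.Random(0)
n = 5000
x = [rng.uniform(0, 10) for _ in range(n)]
y = [0.05 * xi + rng.gauss(0, 1) for xi in x]

r2, p_value = r2_and_p(x, y)
print("R-squared:", r2)
print("p-value:", p_value)
```

Here the relationship is highly significant because the sample is large, yet it explains only a small share of the variation in y.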
R² = percentage of the variation in the dependent variable explained by its relationship with the independent variable
p = probability of observing a relationship at least as strong as the one in the sample if there were no linear relationship between the dependent and independent variables
Larger sample size: more accurate estimates
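To see how sample size affects accuracy, one option is to compare the standard error of the slope for a small and a large sample drawn from the same process. The sketch below assumes a simulated relationship (intercept 1.0, slope 0.5, noise standard deviation 1) and a fixed seed; these numbers and the helper names `slope_standard_error` and `simulate` are illustrative only.

```python
import random
from math import sqrt

def slope_standard_error(x, y):
    """Standard error of the least-squares slope estimate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return sqrt(sse / (n - 2) / sxx)

def simulate(n, rng):
    """Draw n points from a hypothetical line y = 1.0 + 0.5x plus noise."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
    return x, y

rng = random.Random(42)
se_small = slope_standard_error(*simulate(20, rng))
se_large = slope_standard_error(*simulate(2000, rng))
print("standard error with n = 20:", se_small)
print("standard error with n = 2000:", se_large)
```

The standard error shrinks roughly in proportion to one over the square root of the sample size, so the larger sample pins down the slope much more tightly.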