
Stepwise Multiple Regression

In multiple regression, we are interested in predicting a criterion variable from a set of
predictors. The REGRESSION procedure provides five methods for selecting predictor
variables: forward selection, backward elimination, stepwise selection, forced entry,
and forced removal. To learn more, choose Help / Topics from the SPSS main menus,
click the Index tab, and type stepwise selection in the textbox. The index entry
"stepwise selection in Linear Regression" will be highlighted. Next, click the Display
button.

• Forward selection begins with no predictors in the regression equation. The
predictor variable that has the highest correlation with the criterion variable is
entered into the equation first. The remaining variables are entered into the
equation depending on the contribution of each predictor.

• Backward elimination begins with all predictor variables in the regression
equation and sequentially removes them. Two removal criteria are available.

• Stepwise selection is a combination of forward and backward procedures.

Step 1

The first predictor variable is selected in the same way as in forward selection. If
the probability associated with the test of significance is less than or equal to the
default .05, the predictor variable with the largest correlation with the criterion
variable enters the equation first.

Step 2

The second variable is selected based on the highest partial correlation. If it can
pass the entry requirement (PIN=.05), it also enters the equation.

Step 3

From this point, stepwise selection differs from forward selection: the variables
already in the equation are examined for removal according to the removal
criterion (POUT=.10) as in backward elimination.

Step 4

Variables not in the equation are examined for entry. Variable selection ends
when no more variables meet entry and removal criteria.
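The four steps above can be sketched as a loop. The following is a minimal pure-Python illustration under stated simplifications, not SPSS's algorithm: entry and removal are judged by R-square change against made-up thresholds rather than by the PIN/POUT significance tests, and the data and variable names are hypothetical.

```python
# Sketch of the stepwise loop: alternate entry and removal until neither
# criterion is met. Thresholds here are R-squared changes, standing in for
# SPSS's significance tests (PIN = .05, POUT = .10).

def r_squared(xcols, y):
    """R-squared of the least-squares fit of y on the given predictor
    columns (an intercept is always included)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in xcols] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, solved by Gaussian elimination.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    g = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda i: abs(A[i][p]))
        A[p], A[piv], g[p], g[piv] = A[piv], A[p], g[piv], g[p]
        for i in range(p + 1, k):
            f = A[i][p] / A[p][p]
            for j in range(p, k):
                A[i][j] -= f * A[p][j]
            g[i] -= f * g[p]
    coef = [0.0] * k
    for p in range(k - 1, -1, -1):
        coef[p] = (g[p] - sum(A[p][j] * coef[j] for j in range(p + 1, k))) / A[p][p]
    ybar = sum(y) / n
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    ss_res = sum((yi - sum(c * x for c, x in zip(coef, r))) ** 2
                 for r, yi in zip(rows, y))
    return 1.0 - ss_res / ss_tot

def stepwise(predictors, y, enter=0.05, leave=0.01):
    """Enter the variable with the largest R-squared gain, then re-check the
    entered variables for removal. `leave` is looser than `enter`, mirroring
    SPSS's rule that the entry criterion must be stricter than removal."""
    included = []
    while True:
        changed = False
        # Examine variables not in the equation for entry.
        base = r_squared([predictors[u] for u in included], y)
        gains = {v: r_squared([predictors[u] for u in included] + [predictors[v]], y) - base
                 for v in predictors if v not in included}
        if gains and max(gains.values()) >= enter:
            included.append(max(gains, key=gains.get))
            changed = True
        # Examine variables already in the equation for removal.
        for v in list(included):
            rest = [predictors[u] for u in included if u != v]
            full = r_squared([predictors[u] for u in included], y)
            if full - r_squared(rest, y) < leave:
                included.remove(v)
                changed = True
                break
        if not changed:
            return included

# Hypothetical toy data: y is essentially a linear function of x1 alone.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [5, 3, 8, 1, 9, 2, 7, 4, 10, 6]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9, 18.1, 19.9]
selected = stepwise({"x1": x1, "x2": x2}, y)
print(selected)   # only x1 survives; x2 adds too little once x1 is in
```

Because x1 explains nearly all of the variance in y, x2's gain after entry falls below the entry threshold and the loop stops with one predictor.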

• Multiple Regression, Hierarchical Multiple Regression, and Stepwise Multiple
Regression

In a (standard) multiple regression analysis, the researcher decides how many
predictors to enter, and all the predictors enter the regression model
simultaneously.

In a hierarchical multiple regression, the researcher decides not only how many
predictors to enter but also the order in which they enter. Usually, the order of entry is
based on logical or theoretical considerations.
In a stepwise multiple regression analysis, the number of predictors to be selected and the
order of entry are both decided by statistical criteria (e.g., entry or removal criterion).

A corporation is interested in predicting job satisfaction among its employees. Ten
employees were randomly chosen to fill out information on the criterion variable (job
satisfaction) and the predictor variables, such as salary, job security, ratings of work
environment, years of service, etc. Apply the stepwise method to select the best set of
predictor variables for the regression equation.

SPSS for Windows

A. Enter data: One criterion and three predictor variables.

B. From the menus choose: Analyze \ Regression \ Linear.

First, select Y as the Dependent variable. Second, move the three predictor variables to
the Independent(s) list. Third, click on the down arrow and select the Stepwise Method.
Last, click on Statistics. Choose R squared change, Descriptives, Part and partial
correlations. Click Continue. Click OK.

SPSS Printout

A. Examine the correlation matrix.

Ideally, if each of the predictors is significantly correlated with the criterion variable and
the predictors are not related to each other, a high multiple R will be obtained.

• Examine the cross correlations.

Which predictor variable has the largest correlation with the criterion variable Y?

Answer:

X1 has the largest correlation with the criterion variable Y, r = .858, p = .001.

• Examine the inter-correlations.

Large correlations between the predictor variables can substantially affect the
results of multiple regression analysis. Note that the correlation between X2 and
X3 equals .804.

B. Stepwise Selection Method

Model 1: Regression Equation with One Predictor Variable

X1 has the largest correlation with the criterion variable, p < .05. The predictor variable
X1 is the first predictor to be entered into the regression equation.

• Model Summary

Examine the R Square.

What is the proportion of the variation in the criterion variable Y explained by the
regression model with one predictor X1?

Answer:
About 74% of the variation in the criterion variable Y can be explained by the regression
model with one predictor X1.

• Coefficients.

Examine the B and Beta Weights.

Develop a regression equation which contains the predictor X1.

1. Regression equation in obtained scores: ________________________

Answer:

Regression equation in obtained scores: Y' = .92X1 + .975

2. Regression equation in standard scores: ________________________

Answer:

Regression equation in standard scores: Zy' = .858ZX1

• Test the regression coefficient of the entered variable (X1).

Is the regression coefficient associated with X1 significantly different from zero?
What is the t ratio and the observed significance level? Is it significant at the .05
level?

Answer:
The t value is 4.729. The observed significance level associated with X1 is .001.
The regression coefficient associated with X1 is significantly different from zero.

Examine the Excluded Variables

Examine the absolute values of the partial correlations for variables not in the equation.

Model 1: This is the first step of the stepwise regression in which only one predictor, X1,
is used to predict Y. Recall that X1 has the largest correlation with the criterion variable.

The two predictor variables, X2 and X3, are excluded from model 1.

Beta In: These are the standardized regression coefficients each predictor would receive
if it were added to the regression equation. The beta value associated with X2 is larger
(.535), indicating that X2 would make the greater contribution of the two excluded
predictors.

Partial Correlation:

The partial correlation between X2 and Y is .849 after the effect of X1 was removed from
both X2 and Y. The observed significance level associated with X2 is .004, which passes
the entry requirement (p < .05).

The partial correlation between X3 and Y is .452 after the effect of X1 was removed from
both X3 and Y. The observed significance level associated with X3 is .222, which does
not pass the entry requirement (p > .05).

Decision

The predictor variable X2 has the largest partial correlation. The observed significance
level associated with X2 is .004, which passes the entry requirement (p < .05). The
second predictor variable to be entered into the equation will be X2.
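The first-order partial correlation can be computed directly from three zero-order correlations. In the sketch below, .858 is the handout's r between Y and X1; the other two zero-order values are made up for illustration, so the result does not reproduce the printout's .849:

```python
# First-order partial correlation from three zero-order correlations.
from math import sqrt

def partial_r(r_yx, r_yz, r_xz):
    """Correlation between Y and X after removing Z from both."""
    return (r_yx - r_yz * r_xz) / sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))

# r(Y, X1) = .858 is from the handout; the other two values are hypothetical.
r_y1, r_y2, r_12 = 0.858, 0.720, 0.310
print(round(partial_r(r_y2, r_y1, r_12), 3))
```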
Model 2: Regression Equation with Two Predictor Variables

• Examine the R square.

What is the proportion of the variation in the criterion variable Y explained by the
regression model with two predictors X1 and X2?

About 93% of the variation in the criterion variable Y can be explained by the regression
model with two predictors, X1 and X2.

The adjusted R square, corrected for the number of predictors, equals .905. The
difference between the obtained and adjusted R square is small in our case.

R square may be overestimated when the data set has few cases (n) relative
to the number of predictors (k). Adjusted R square can be computed as

Adjusted R square = 1 - (1 - R square)(n - 1) / (n - k - 1),

where n = sample size and k = number of predictors.

Data sets with few cases relative to the number of predictors will show a
greater difference between the obtained and adjusted R square.
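The computation is a one-liner. Taking R square of about .926 for Model 2 (about 93%, consistent with the reported adjusted value), with n = 10 cases and k = 2 predictors:

```python
# Adjusted R square = 1 - (1 - R square)(n - 1)/(n - k - 1).
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Model 2 of the handout: R square of about .926, n = 10, k = 2.
print(round(adjusted_r2(0.926, 10, 2), 3))   # -> 0.905, matching the printout
```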

• Test of Significance

Is the regression model with two predictors (X1 and X2) significantly related to the
criterion variable Y?

The regression model with two predictors (X1 and X2) is significantly related to the
criterion variable Y, F(2, 7) = 44.073, p < .01.

X1 and X2 account for about 93% of the variance in the criterion variable Y, and this
finding is statistically significant.

• Examine the R Square Change.

(1) About 74% of the variation in the criterion variable Y can be explained by the
regression model with one predictor X1.

(2) About 93% of the variation in the criterion variable Y can be explained by the
regression model with two predictors, X1 and X2.

(3) An additional 19% of the variance in the criterion variable Y is contributed by X2.

• Coefficients.

Examine the B and Beta Weights.

Develop a regression equation that contains the above two predictors.

1. Regression equation in obtained scores: ________________________

Regression equation in obtained scores: Y' = .587X1 + .506X2 + .112

Note that the value of B weight associated with each predictor is influenced by all other
predictors in the regression equation.

2. Regression equation in standard scores: ________________________


Regression equation in standard scores: Zy' = .548ZX1 + .535ZX2

3. Part Correlation

The predictor X1 entered the regression equation first and the predictor X2 entered the
regression equation next. Recall that an additional 19% of the variance in the criterion
variable Y is contributed by X2. The signed square root of the R square change is called
the semi-partial correlation or the part correlation. The semi-partial or part correlation
between X2 and Y after removing the effect of X1 from X2 is .436.
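Using the (rounded) R square values from the printout, the part correlation for X2 falls out directly:

```python
# Part (semi-partial) correlation of X2 as the signed square root of the
# R square change (values rounded from the printout).
from math import sqrt

r2_model1 = 0.736   # X1 alone (about 74%; .858 squared)
r2_model2 = 0.926   # X1 and X2 (about 93%)
part_x2 = sqrt(r2_model2 - r2_model1)
print(round(part_x2, 3))   # -> 0.436, matching the Coefficients table
```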

• Test the regression coefficients of the entered variables. Both are significantly
different from zero.

Examine the Variables Already in the Equation for Removal

No variables meet the removal criterion.


Examine the Statistics for the Variable not in the Equation for Entry

This is the second step of the stepwise regression, in which two predictors, X1 and X2,
are used to predict Y and the predictor variable X3 is excluded from model 2. Note that
the observed significance level associated with X3 is .158, which is too large for entry
(p > .05).

Decision: X3 will not be included. The best regression equation will be the equation that
contains two predictor variables, X1 and X2.

Reason: Since predictor variables X2 and X3 are highly correlated (r = .804), X3 adds
relatively little in prediction when X2 is in the regression equation.
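This can be made concrete with the squared semi-partial correlation, which gives the variance a new predictor adds beyond one already entered. Here r = .804 between X2 and X3 comes from the printout, but the two criterion correlations are hypothetical round numbers, and X1 is ignored for simplicity:

```python
# Variance X3 would add beyond X2, via the squared semi-partial correlation.
from math import sqrt

def semipartial(r_y_new, r_y_in, r_in_new):
    """Semi-partial correlation of a new predictor, residualized on the
    predictor already in the equation (two-variable simplification)."""
    return (r_y_new - r_y_in * r_in_new) / sqrt(1 - r_in_new ** 2)

# r(X2, X3) = .804 is from the printout; the criterion correlations are
# hypothetical round numbers chosen for illustration.
r_y3, r_y2, r_23 = 0.70, 0.80, 0.804
gain = semipartial(r_y3, r_y2, r_23) ** 2
print(round(gain, 3))   # well under 1% of additional variance
```

Because X3 is so highly correlated with X2, most of what X3 could explain is already explained, and the increment is negligible.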

Reading

• Multiple Regression by Palgrave Macmillan

Discussion

• What are some of the problems with stepwise regression?
