# Stepwise regression

## Choose the best combination of variables to predict a continuous outcome

Stepwise regression is a regression technique that uses an algorithm to select the subset of predictor variables that accounts for the most variance in the outcome (R-squared). It is useful in an exploratory fashion or when testing for associations, and it is used to generate incremental validity evidence in psychometrics. The primary goal is to build the best model from the predictor variables you want to test.
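The selection idea can be sketched in a few lines of plain Python. This is a minimal illustration, not SPSS's own procedure: the function names (`solve`, `sse`, `stepwise_select`), the fixed F cutoffs of 3.84 to enter and 2.71 to remove, and the tiny data set are all assumptions made for the example. At each pass the sketch adds the candidate with the largest partial F above the entry cutoff, then drops any included predictor whose partial F has fallen below the removal cutoff.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def sse(columns, y):
    """Residual sum of squares for an OLS fit of y on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] * n] + [list(map(float, col)) for col in columns]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, design)) for i in range(n)]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def stepwise_select(predictors, y, f_enter=3.84, f_remove=2.71):
    """Return the predictor names kept by a simple F-based stepwise search.

    The cutoffs are illustrative assumptions; perfectly collinear predictors
    or a perfect fit are not handled in this sketch.
    """
    n = len(y)
    selected = []
    while True:
        changed = False
        # Entry step: add the candidate with the largest partial F above f_enter.
        base = sse([predictors[k] for k in selected], y)
        df = n - len(selected) - 2  # residual df once one more predictor is added
        best_name, best_f = None, f_enter
        for name in predictors:
            if name in selected or df <= 0:
                continue
            trial = sse([predictors[k] for k in selected + [name]], y)
            f_stat = (base - trial) / (trial / df)
            if f_stat > best_f:
                best_name, best_f = name, f_stat
        if best_name is not None:
            selected.append(best_name)
            changed = True
        # Removal step: drop the included predictor with the smallest partial F below f_remove.
        if selected:
            full = sse([predictors[k] for k in selected], y)
            df = n - len(selected) - 1
            worst_name, worst_f = None, f_remove
            for name in selected:
                rest = [predictors[k] for k in selected if k != name]
                f_stat = (sse(rest, y) - full) / (full / df)
                if f_stat < worst_f:
                    worst_name, worst_f = name, f_stat
            if worst_name is not None:
                selected.remove(worst_name)
                changed = True
        if not changed:
            return selected


# Hypothetical data: the outcome depends on x1 and x3 but not on x2.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, -1, 1, -1, -1, 1, -1, 1],
    "x3": [1, 1, -1, -1, -1, -1, 1, 1],
}
outcome = [6, 7, 6, 11, 12, 17, 24, 25]
chosen = stepwise_select(predictors, outcome)
print(chosen)
```

Because the search only ever compares models it happens to visit, the result depends on the cutoffs and on the order in which predictors enter, which is one reason stepwise results are best treated as exploratory.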

### The steps for conducting stepwise regression in SPSS

1. The data is entered in a mixed fashion.

2. Click **Analyze**.

3. Drag the cursor over the **Regression** drop-down menu.

4. Click **Linear**.

5. Click on the continuous outcome variable to highlight it.

6. Click on the **arrow** to move the variable into the **Dependent:** box.

7. Click on the first predictor variable to highlight it.

8. Click on the **arrow** to move the variable into the **Independent(s):** box.

9. Repeat Steps 7 and 8 until all of the predictor variables are in the **Independent(s):** box.

10. Click on the **Statistics** button.

11. Click on the **R squared change**, **Collinearity diagnostics**, **Durbin-Watson**, and **Casewise diagnostics** boxes to select them.

12. Click on the **Plots** button.

13. Click on the DEPENDNT variable to highlight it.

14. Click on the **arrow** to move the variable into the **X:** box.

15. Click on the *ZRESID variable to highlight it.

16. Click on the **arrow** to move the variable into the **Y:** box.

17. In the **Standardized Residual Plots** table, click on the **Histogram** and **Normal probability plot** boxes to select them.

18. Click **Continue**.

19. Click on the **Method:** drop-down menu.

20. Click on **Stepwise**.

21. Click **OK**.

### The steps for interpreting the SPSS output for stepwise regression

1. Look in the **Model Summary** table, under the **R Square** and the **Sig. F Change** columns. These are the values that are interpreted.

The **R Square** value is the amount of variance in the outcome that is accounted for by the predictor variables.

If the *p*-value is **LESS THAN .05**, the model has accounted for a statistically significant amount of variance in the outcome.

If the *p*-value is **MORE THAN .05**, the model has not accounted for a statistically significant amount of variance in the outcome.

2. Look in the **Coefficients** table, under the **B**, **Std. Error**, **Beta**, **Sig.**, and **Tolerance** columns.

The **B** column contains the unstandardized beta coefficients that depict the magnitude and direction of the effect on the outcome variable.

The **Std. Error** column contains the standard errors associated with the unstandardized beta coefficients.

The **Beta** column presents standardized beta coefficients for each predictor variable.

The **Sig.** column shows the *p*-value associated with each predictor variable.

If a *p*-value is **LESS THAN .05**, then that variable has a significant association with the outcome variable.

If a *p*-value is **MORE THAN .05**, then that variable does not have a significant association with the outcome variable.

The **Tolerance** column presents values related to assessing multicollinearity among the predictor variables.

If any of the Tolerance values are **BELOW .75**, consider creating a new variable or deleting one of the predictor variables.

### Residuals

At this point, researchers need to construct and interpret several plots of the raw and standardized residuals to fully assess model fit. Residuals can be thought of as **the error associated with predicting or estimating outcomes using predictor variables**. Residual analysis is **extremely important** for checking the linearity, normality, and homogeneity of variance assumptions of multiple regression.

Scroll down to the bottom of the SPSS output to the **Scatterplot**. If the plot is linear, then researchers can assume linearity.

### Outliers

Normality and equal variance assumptions also apply to multiple regression analyses.
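To make these checks concrete, here is a rough sketch of what a normal P-P plot compares, together with a simple screen for unusually large standardized residuals. Several things are assumptions for illustration: the residual values are made up, the plotting position `(i + 0.5) / n` may differ from the one SPSS uses, the cutoff of 3 standard deviations is a common convention rather than a rule from this tutorial, and the function names are hypothetical.

```python
from statistics import NormalDist


def standardized_residuals(residuals, n_predictors):
    """Divide each raw residual by the standard error of the estimate."""
    n = len(residuals)
    mse = sum(r * r for r in residuals) / (n - n_predictors - 1)
    std_error = mse ** 0.5
    return [r / std_error for r in residuals]


def pp_points(z_values):
    """Pair each observed cumulative proportion with the normal cumulative probability."""
    n = len(z_values)
    std_normal = NormalDist()
    ordered = sorted(z_values)
    return [((i + 0.5) / n, std_normal.cdf(z)) for i, z in enumerate(ordered)]


# Hypothetical raw residuals from a model with two predictors: 20 small, 1 large.
residuals = [-0.3] * 20 + [6.0]
z = standardized_residuals(residuals, n_predictors=2)

points = pp_points(z)
# Largest gap between observed and expected cumulative probability.
max_gap = max(abs(observed - expected) for observed, expected in points)

# Cases whose standardized residual exceeds 3 in absolute value (assumed cutoff).
flagged = [case for case, value in enumerate(z) if abs(value) > 3]
print(flagged, round(max_gap, 3))
```

Points that fall close to the diagonal (small gaps between the two probabilities) are what the normality assumption predicts; a flagged case is a prompt to inspect that observation, not a reason to delete it automatically.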

Look at the **P-P Plot of Regression Standardized Residual** graph. If the residuals do not deviate significantly from the line and the line is not curved, then normality and homogeneity of variance can be assumed.

### Incremental validity is established with stepwise regression

Incremental validity is a type of psychometric evidence generated by stepwise regression.
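Incremental validity is typically judged from the change in R-squared when a new predictor joins a model that already contains the established predictors, which SPSS reports in the **R Square Change** and **Sig. F Change** columns of the Model Summary table. The sketch below computes the F statistic for that change from two R Square values. The function name `f_change` and the input numbers are hypothetical, chosen only to illustrate the arithmetic; the *p*-value itself is left to SPSS.

```python
def f_change(r2_reduced, r2_full, n_cases, k_full, n_added):
    """F statistic for the R-squared change when n_added predictors join the model.

    r2_reduced: R Square of the model without the new predictors
    r2_full:    R Square of the model with them
    k_full:     number of predictors in the full model
    """
    df_residual = n_cases - k_full - 1
    delta_r2 = r2_full - r2_reduced
    f_stat = (delta_r2 / n_added) / ((1.0 - r2_full) / df_residual)
    return delta_r2, f_stat, (n_added, df_residual)


# Hypothetical values: R Square rises from .30 to .40 when a third predictor
# is added in a sample of 103 cases.
delta_r2, f_stat, dfs = f_change(0.30, 0.40, n_cases=103, k_full=3, n_added=1)
print(round(delta_r2, 2), round(f_stat, 2), dfs)
```

A large F for the change, evaluated on the two degrees of freedom returned, indicates that the added predictor explains variance beyond what the earlier predictors already accounted for.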
