Statistical Package for the Social Sciences (SPSS; IBM Corp., Armonk, NY) is a statistical software application that allows researchers to enter and manipulate data and conduct various statistical analyses. Step-by-step methods for conducting and interpreting over 60 statistical tests are available in Research Engineer. Videos will be coming soon. Click on a link below to access the methods for conducting and interpreting each statistical analysis in SPSS.
Parametric statistics are more powerful statistics
Nonparametric statistics are used with categorical and ordinal outcomes
As we continue our journey to break through the barriers associated with statistical lexicons, here is another dichotomy of popular statistical terms that are commonly spoken but not always well understood.
Parametric statistics are used to assess differences and effects for continuous outcomes. These statistical tests include one-sample t-tests, independent-samples t-tests, one-way ANOVA, repeated-measures ANOVA, ANCOVA, factorial ANOVA, multiple regression, MANOVA, and MANCOVA. Nonparametric statistics are used to assess differences and effects for:

1. Ordinal outcomes - One-sample median tests, Mann-Whitney U, Wilcoxon, Kruskal-Wallis, Friedman's ANOVA, proportional odds regression
2. Categorical outcomes - Chi-square, Chi-square goodness-of-fit, odds ratio, relative risk, McNemar's, Cochran's Q, Kaplan-Meier, log-rank test, Cochran-Mantel-Haenszel, Cox regression, logistic regression, multinomial logistic regression
3. Small sample sizes (n < 30) - Smaller samples make it harder to meet the statistical assumptions associated with parametric statistics. Nonparametric statistics can generate valid statistical inferences in these situations.
4. Violations of the statistical assumptions for parametric tests - Normality, homogeneity of variance, normality of difference scores

Logistic regression yields adjusted odds ratios
Adjusted odds ratios are more easily generalized to clinical situations
There is a strong need in clinical medicine for adjusted odds ratios with 95% confidence intervals. Medicine, as a science, often uses categorical outcomes to research causal effects. It is important to assess clinical outcomes (measured at the dichotomous categorical level) within the context of various predictor, clinical, prognostic, demographic, and confounding variables. Logistic regression is the statistical method used to understand the associations between the aforementioned variables and dichotomous categorical outcomes.
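As a baseline for the adjusted case, the unadjusted odds ratio and its 95% confidence interval can be computed directly from a 2x2 table. A minimal sketch in plain Python using the standard Woolf (log-based) interval; the cell counts are hypothetical:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio from a 2x2 table with a Woolf (log-based)
    95% confidence interval.

    Table layout:      outcome+   outcome-
        exposed            a          b
        unexposed          c          d
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR): sqrt of summed reciprocal cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, (lo, hi)

# Hypothetical counts: 20 exposed cases, 80 exposed controls,
# 10 unexposed cases, 90 unexposed controls
or_, (lo, hi) = odds_ratio_ci(20, 80, 10, 90)  # OR = 2.25
# A confidence interval that includes 1.0 means a nonsignificant association
significant = not (lo <= 1.0 <= hi)
```

With these counts the interval narrowly includes 1.0, so despite an odds ratio of 2.25 the unadjusted association would be judged nonsignificant.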
Logistic regression yields adjusted odds ratios with 95% confidence intervals, rather than the more prevalent unadjusted odds ratios derived from 2x2 tables. The odds ratios in logistic regression are "adjusted" because their associations with the dichotomous categorical outcome are "controlled for" or "adjusted" by the other variables in the model. The 95% confidence interval is used as the primary inference with adjusted odds ratios, just as with unadjusted odds ratios: if the 95% confidence interval crosses 1.0, the association with the outcome variable is nonsignificant. Adjusted odds ratios are important in medicine because very few physiological or medical phenomena are bivariate in nature. Most disease states or physiological disorders are understood and detected within the context of many different factors or variables. Therefore, to truly understand treatment effects and clinical phenomena, multivariate adjustment must occur to properly account for clinical, prognostic, demographic, and confounding variables.

Multivariate statistical tests show evidence of association between predictor variables and an outcome, when controlling for demographic, confounding, and other patient data
Multivariate statistics are more reflective of real-world medicine
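Mechanically, the adjusted odds ratios described above come from exponentiating each fitted logistic-regression coefficient, and the 95% confidence interval from exponentiating the coefficient plus or minus 1.96 standard errors. A sketch with hypothetical fitted values (the coefficient and standard error here are invented for illustration, not output from any real model):

```python
import math

def adjusted_or(beta, se, z=1.96):
    """Convert a logistic-regression coefficient (log-odds scale) and its
    standard error into an adjusted odds ratio with a 95% CI."""
    or_ = math.exp(beta)
    lo = math.exp(beta - z * se)
    hi = math.exp(beta + z * se)
    return or_, (lo, hi)

# Hypothetical model output: treatment coefficient 0.693 on the log-odds
# scale with standard error 0.25, adjusted for the other covariates
or_, (lo, hi) = adjusted_or(0.693, 0.25)
# Here the CI excludes 1.0, so the adjusted association is significant
significant = not (lo <= 1.0 <= hi)
```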
We covered between-subjects and within-subjects analyses in the first Statistical Designs post. Multivariate statistics will be the focus in Statistical Designs 2.
While 90% of statistics reported in the literature fall under the umbrella of between-subjects and within-subjects analyses, those designs do not properly account for all of the variance and confounding effects that exist in reality. Multivariate statistics play an important role in empirical reasoning because they allow us to control for various demographic, confounding, clinical, or prognostic variables that mitigate, mediate, and affect the association between a predictor and an outcome variable. They are also much more representative of the true effects that exist within human populations. Very few, if any, relationships or treatment effects in physiology, psychology, education, or life in general are bivariate in nature. Relationships and treatment effects in reality ARE multivariate, diverse, and confounded by any number of characteristics. Therefore, it makes sense that researchers should conduct multivariate statistics to truly understand human phenomena.

With this being said, it is important to use multivariate statistics ONLY when you are asking a multivariate research question. Throwing a bunch of variables into a model without some theoretical or conceptual reason for including them can yield false treatment effects and increase Type I error. These spurious variables can also create "statistical noise" that detracts from a model's ability to detect significant associations.

Choosing the correct multivariate statistic to answer your question is simple: you choose the multivariate analysis based on the outcome.

1. Categorical outcomes - Logistic regression (dichotomous), multinomial logistic regression (polychotomous), Kaplan-Meier, Cochran-Mantel-Haenszel, Cox regression (dichotomous/survival/time-to-event)
2. Ordinal outcomes - Proportional odds regression
3. Continuous outcomes - Factorial ANOVA with fixed effects, factorial ANOVA with random effects, factorial ANOVA with mixed effects, ANCOVA, multiple regression, MANOVA, MANCOVA
4. Count outcomes - Poisson regression (variance approximately equal to the mean) and negative binomial regression (variance larger than the mean, i.e., overdispersed counts)

Ordinal measures and normality
Ordinal-level measurement can become interval-level with assumed normality
Here is an interesting trick I picked up along the way when it comes to ordinal outcomes and some unvalidated measures. If you run skewness and kurtosis statistics on the ordinal variable and its distribution meets the assumption of normality (both the skewness and kurtosis statistics have an absolute value less than 2.0), then you can "upgrade" the variable to a continuous level of measurement and analyze it using more powerful parametric statistics.
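The check above can be sketched in plain Python using central moments; the |2.0| cutoff follows the rule of thumb in the text, and the score data are hypothetical Likert-style responses:

```python
def skewness_kurtosis(data):
    """Sample skewness and excess kurtosis computed from central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0  # excess kurtosis: 0 for a normal distribution
    return skew, kurt

# Hypothetical ordinal responses on a 1-5 scale
scores = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
skew, kurt = skewness_kurtosis(scores)
# Rule of thumb from the text: "upgrade" to continuous if both statistics
# have an absolute value below 2.0
treat_as_continuous = abs(skew) < 2.0 and abs(kurt) < 2.0
```

For this symmetric sample the skewness is exactly 0 and the excess kurtosis is mildly negative, so the variable would pass the screen.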
This type of thinking is the reason that the SAT, ACT, GRE, MCAT, LSAT, and validated psychological instruments are treated as continuous measures. The scores yielded by these instruments are, by definition, not continuous because a "true zero" does not exist. Scores from these tests are often norm- or criterion-referenced to the population so that they can be interpreted in the correct context. Therefore, with the subjectivity and measurement error associated with classical test theory and item response theory, the scores are actually ordinal. With that being said, if the survey instrument or ordinal outcome is used often in the empirical literature and it meets the assumption of normality as per skewness and kurtosis statistics, treat the ordinal variable as a continuous variable and run analyses using parametric statistics (t-tests, ANOVA, regression) rather than nonparametric statistics (Chi-square, Mann-Whitney U, Kruskal-Wallis, McNemar's, Wilcoxon, Friedman's ANOVA, logistic regression).
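When an ordinal variable fails that normality screen, a nonparametric test such as the Mann-Whitney U is the fallback for comparing two independent groups. A minimal pure-Python sketch of the U statistic itself (the groups are hypothetical; a real analysis would still compare U against its sampling distribution or a normal approximation for the p-value):

```python
def mann_whitney_u(group1, group2):
    """Mann-Whitney U statistic for group1: the number of (x, y) pairs in
    which x from group1 outranks y from group2, with ties counting 0.5."""
    u = 0.0
    for x in group1:
        for y in group2:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical ordinal scores from two groups
a = [3, 4, 2, 5]
b = [1, 2, 2, 3]
u1 = mann_whitney_u(a, b)
u2 = mann_whitney_u(b, a)
# Sanity check: the two U statistics always sum to len(a) * len(b)
```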
March 2016
Author: Eric Heidel, Ph.D. is Owner and Operator of Scalë, LLC.
