# Transformed outcomes

## Some continuous variables will be naturally skewed

A common example is **length of stay (LOS)** in the hospital. When thinking about the distribution of a variable such as LOS, you have to put it into a relative context. The vast majority of people will have an LOS of between 0 and 3 days, given the type of treatment or injury that brought them to the hospital. Very few individuals will stay at the hospital for one month, six months, or a year. Therefore, the distribution looks nothing like the normal curve and is extremely positively skewed.

As a researcher, you may want to predict a continuous variable that has a **natural and logical skewness** to its distribution in the population. Yet, the assumption of normality is a central tenet of running parametric statistical analyses. What is one to do in this situation?

The answer is to first run **skewness** and **kurtosis** statistics to assess the normality of your continuous outcome. If either statistic is above an absolute value of 2.0, then the distribution is non-normal. Check for **outliers** in the distribution that are more than 3.29 standard deviations away from the mean. Make sure that the outlying observations were entered correctly.
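The screening steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation. The `los_days` values are made-up data, and the function names are hypothetical. The skewness and kurtosis here are the simple moment-based versions; statistical packages often report bias-corrected estimates, so their numbers may differ slightly. The sketch also assumes the 2.0 rule refers to excess kurtosis (kurtosis minus 3).

```python
import statistics

# Hypothetical length-of-stay data in days (made up for illustration).
los_days = [0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3,
            1, 2, 0, 1, 3, 2, 4, 5, 7, 30, 180]


def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


def excess_kurtosis(values):
    """Moment-based excess kurtosis: mean of z-scores to the 4th power, minus 3."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 4 for x in values) / len(values) - 3.0


def find_outliers(values, cutoff=3.29):
    """Return observations more than `cutoff` sample SDs away from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [x for x in values if abs(x - mean) / sd > cutoff]


skew = skewness(los_days)
kurt = excess_kurtosis(los_days)

# Apply the rule of thumb from the text: |statistic| > 2.0 means non-normal.
print(f"skewness = {skew:.2f}, non-normal: {abs(skew) > 2.0}")
print(f"excess kurtosis = {kurt:.2f}, non-normal: {abs(kurt) > 2.0}")
print("outliers beyond 3.29 SD:", find_outliers(los_days))
```

Any observation flagged by `find_outliers` should be checked against the source records before deciding how to handle it.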

You now have a choice:

1. **You can delete the outlying observations in a listwise fashion**. This should be done only if the outlying observations make up less than 10% of the overall distribution. This is the least preferable choice.

2. **You can conduct a logarithmic transformation on the outcome variable**. Doing this pulls in the long right tail and brings the distribution closer to normal so that you can run the analysis using parametric statistics. The unstandardized beta coefficients, standard errors, and standardized beta coefficients are on the log scale and are not directly interpretable in the original units, but the significance of the associations between the predictor variables and the transformed outcome can yield some inferential evidence.

3. **You can recode the continuous outcome variable into a lower level scale of measurement such as ordinal or categorical and run non-parametric statistics to seek out any associations**. Of course, you are losing the precision and accuracy of continuous-level measurement and introducing measurement error into the outcome variable, but you will still be able to run inferential statistics.

4. **You can use non-parametric statistics without changing the skewed variable at all**. That is one of the primary benefits of non-parametric statistics: they are robust to violations of normality and homogeneity of variance. Instead of interpreting means and standard deviations, you will interpret medians and interquartile ranges with non-parametric statistics.
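As a rough illustration of options 2 and 4, the sketch below log-transforms a made-up LOS sample and then summarizes the untransformed values with a median and interquartile range. The data and the `skewness` helper are hypothetical. Because the log of zero is undefined and an LOS of 0 days is possible, the sketch assumes a `log(x + 1)` transformation, which is one common workaround rather than the only choice.

```python
import math
import statistics

# Hypothetical length-of-stay data in days (made up for illustration).
los_days = [0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3,
            1, 2, 0, 1, 3, 2, 4, 5, 7, 30, 180]


def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


# Option 2: logarithmic transformation.
# log1p(x) computes log(x + 1), so stays of 0 days remain defined.
log_los = [math.log1p(x) for x in los_days]

print(f"skewness before transformation: {skewness(los_days):.2f}")
print(f"skewness after transformation:  {skewness(log_los):.2f}")

# Option 4: leave the variable alone and describe it with the
# median and interquartile range instead of the mean and SD.
median = statistics.median(los_days)
q1, _, q3 = statistics.quantiles(los_days, n=4)  # quartile cut points
iqr = q3 - q1

print(f"median = {median}, IQR = {iqr} (Q1 = {q1}, Q3 = {q3})")
```

The transformation reduces the skew but does not guarantee a normal distribution, so it is worth re-running the skewness and kurtosis checks on the transformed values before choosing a parametric analysis.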
