My data is not Normally distributed


I'm working with a dataset and my goal is to analyze it with MANOVA.

My questions are:

  1. Does MANOVA require the data to be normally distributed?
  2. What should I do if my data is not normally distributed, other than deleting the outliers or applying a transformation?
  3. What if I apply a log transformation and my data is still not normal? Should I leave it as it is? And what about my MANOVA: does that mean my data can't be analyzed with MANOVA?
  4. Would it be fine to ignore the non-normality and go ahead with the MANOVA anyway?

Thanks in advance.


There is 1 best solution below

On BEST ANSWER

One of the assumptions of MANOVA is that the response variables are (multivariate) normally distributed within each group, so depending on how far from normal your data is, you might not be able to trust the conclusions of your analysis.
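Before deciding whether the departure from normality matters, it helps to look at it. The following is only a rough sketch, using the built-in `iris` data as a stand-in for your own dataset (the variable names and the grouping factor are assumptions, not your data). It fits the MANOVA, takes the residuals (the responses with the group means removed) and compares their squared Mahalanobis distances with chi-square quantiles; under multivariate normality the points should fall roughly on the reference line.

```r
# Stand-in data: four responses, one grouping factor (replace with your own)
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
              data = iris)

# Residuals are the responses centred on their group means (an n x p matrix)
res <- residuals(fit)
p   <- ncol(res)

# Squared Mahalanobis distance of each residual vector from the centre
d2 <- mahalanobis(res, center = colMeans(res), cov = cov(res))

# Chi-square Q-Q plot: roughly a straight line if the residuals are close to
# multivariate normal; a few points far above the line suggest outliers
qqplot(qchisq(ppoints(nrow(res)), df = p), d2,
       xlab = "Chi-square quantiles", ylab = "Squared Mahalanobis distance")
abline(0, 1)

# Pillai's trace is the default test statistic in summary.manova
summary(fit, test = "Pillai")
```

If the plot is only mildly curved, the MANOVA conclusions are usually less of a concern than when a handful of points sit far from the line. Pillai's trace is often described as the least sensitive of the standard MANOVA statistics to violations of the assumptions, which is why it is used here.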

One approach to normalising your data is to use the scaled power transformation family: $$ \phi(Y; \lambda) = \begin{cases} \dfrac{Y^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0,\\ \log Y & \text{if } \lambda = 0, \end{cases} $$ for $Y>0$. So how do we find the best $\lambda$ for our dataset? There are packages in R that do this for you. The following is an example where I generate toy data (skewed y) and use the `invResPlot` function to estimate the optimal $\lambda$. I also pass the function a set of candidate values of $\lambda$ to compare:

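# toy data: y grows like x^3, so its distribution is skewed
x <- seq(1, 3, length.out = 200)
y <- 2 * x^3 + 0.6 * rnorm(length(x), 0, 1)

# invResPlot() comes from alr3 here; the car package also provides a function of the same name
library(alr3)
par(mfrow = c(2, 2))
# inverse response plot: estimates lambda and draws curves for the candidate values supplied
IRP <- invResPlot(lm(y ~ x), lambda = c(-1, -1/2, -1/3, -1/4, 0, 1/4, 1/3, 1/2, 1))
hist(y, main = "No transform")
hist(log(y), main = "log transform")
# 0.37 is the lambda estimated in my run; it will vary slightly with the random draw
hist((y^0.37 - 1) / 0.37, main = "optimal scale family")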

[Figure: inverse response plot, followed by histograms of y titled "No transform", "log transform" and "optimal scale family"]
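The scaled power family defined above can also be written as a small helper, which avoids repeating the formula by hand for each $\lambda$. This is a minimal sketch; the name `scaled_power` is my own and is not a function from alr3 or any other package.

```r
# Scaled power transformation phi(Y; lambda), defined for Y > 0 only.
# `scaled_power` is an illustrative name, not a library function.
scaled_power <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) {
    log(y)                      # the lambda = 0 case
  } else {
    (y^lambda - 1) / lambda     # the lambda != 0 case
  }
}

# Example: the transform is continuous in lambda, so a tiny lambda
# gives almost the same values as the log transform
y_example <- c(0.5, 1, 2, 10)
max(abs(scaled_power(y_example, 1e-6) - scaled_power(y_example, 0)))
```

With this helper the last histogram in the example becomes `hist(scaled_power(y, 0.37))`.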

Another approach is the Box-Cox transformation, which uses the same power family but chooses $\lambda$ by maximum likelihood.
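As a rough illustration of the Box-Cox route, here is a sketch using `boxcox` from the MASS package on the same kind of toy data. The seed, the grid of $\lambda$ values and the variable names are my own choices for the example, so treat it as a starting point rather than a recipe.

```r
library(MASS)  # provides boxcox()

# Toy data of the same form as above; the seed is arbitrary
set.seed(1)
x <- seq(1, 3, length.out = 200)
y <- 2 * x^3 + 0.6 * rnorm(length(x))
stopifnot(all(y > 0))  # Box-Cox needs a strictly positive response

# Profile log-likelihood over a grid of lambda values (plot suppressed)
bc <- boxcox(lm(y ~ x), lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value with the highest log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Apply the transformation with the selected lambda
y_bc <- if (abs(lambda_hat) < 1e-8) log(y) else (y^lambda_hat - 1) / lambda_hat

lambda_hat
hist(y_bc, main = "Box-Cox transform")
```

Setting `plotit = TRUE` instead draws the log-likelihood curve with a confidence interval for $\lambda$, which is useful for deciding whether a convenient value such as 0 (log) or 1/2 (square root) is close enough to the optimum.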