Association of Categorical and Quantitative Variables


I am trying to look at the association between the cost of goods and the area where they were bought. Group A bought their product in one area and group B bought theirs in another. Group A is larger, while group B is small in comparison ($n_B \leq 20$). The box plots show that group A's cost is clustered around a larger value than group B's. Cost is distributed bimodally in group A and unimodally in group B, which I verified with density plots. Group B's distribution is also non-normal, with a different variance than group A's. I was wondering if there is another coefficient or test that would be applicable here to claim an association between group membership/location and cost. Normally I would try a non-parametric test, but most that I recall seem to place restrictions on variances, similarity of distributions, etc., which this data violates.


There are 1 best solutions below


To compare two distributions, we must check not only whether each distribution is normal, but also whether the two samples are independent of each other and representative of their populations.

You need to state whether each group is a simple random sample (SRS) and how it was drawn. I would aim for a sample size of at least 30 from each area if possible. With larger samples, the sample mean from each area is a more precise estimate of the expected cost there.

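With the conditions checked and normality not fully met, we proceed with caution. Group B being unimodal but not normal may or may not mean there is skewness and/or outliers. If there is strong skewness or there are outliers, then with a sample this small a two-sample t test is not a valid option. Because the data violate several of these conditions, I would use a non-parametric test, specifically the Wilcoxon rank-sum test (also called the Mann-Whitney U test). The Wilcoxon signed-rank test is for paired data, and these two groups are independent. Since the two distributions have different shapes, read a significant result as evidence that costs in one group tend to be larger than in the other, rather than as a difference in medians.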
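For two independent groups like these, a rank-based comparison can be run with the Mann-Whitney U (Wilcoxon rank-sum) test. The snippet below is only a minimal sketch in Python using `scipy.stats.mannwhitneyu`. The helper name `compare_costs`, the 0.05 threshold, and the synthetic data (a bimodal group A and a small, skewed group B) are assumptions made up for illustration; they are not the asker's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def compare_costs(cost_a, cost_b, alpha=0.05):
    """Two-sided Mann-Whitney U test for two independent samples.

    Hypothetical helper for illustration; alpha is an assumed threshold.
    """
    result = mannwhitneyu(cost_a, cost_b, alternative="two-sided")
    n_a, n_b = len(cost_a), len(cost_b)
    # U for the first sample divided by n_a * n_b estimates
    # P(A > B) + 0.5 * P(A == B), a simple effect-size summary.
    prob_a_greater = result.statistic / (n_a * n_b)
    return {
        "U": float(result.statistic),
        "p_value": float(result.pvalue),
        "prob_a_greater": float(prob_a_greater),
        "reject_null": bool(result.pvalue < alpha),
    }


# Synthetic placeholder data, NOT the real costs:
rng = np.random.default_rng(0)
# Group A: larger sample, bimodal (mixture of two normals).
cost_a = np.concatenate([rng.normal(40, 3, 60), rng.normal(60, 3, 60)])
# Group B: small sample (n = 18), unimodal and right-skewed.
cost_b = rng.gamma(shape=2.0, scale=8.0, size=18)

summary = compare_costs(cost_a, cost_b)
print(summary)
```

Here `prob_a_greater` is the estimated chance that a randomly chosen cost from group A exceeds one from group B, which is often easier to report than the raw U statistic when the two distributions differ in shape.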

I have attached how to do the test

Hope this helps