Consider the following experimental setting: I have two machines, $m_0$ and $m_1$, and I would like to know which one performs better. To find out, I have set up an experiment that measures the time a machine takes to perform a certain task. I have run the same experiment 100 times and logged the time measurements, e.g. $t_0 = [0.1, 0.3, 0.2, \dots]$ for machine $m_0$ and $t_1 = [0.4, 0.1, 0.2, \dots]$ for machine $m_1$.
In statistics I learned about the hypotheses $H_0$ and $H_1$ and about statistical significance, but the hypotheses there were always given (as something the researcher assumed about the real world a priori), and I am not really able to apply what I learned to my setting.
How can I check, for instance, that $m_1$ performs significantly better than $m_0$ in statistical terms? How should I choose my hypotheses? How would I perform the hypothesis test?
This is a good case for the nonparametric Wilcoxon rank-sum test. I'll outline what you can do:
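As a rough illustration of what the test computes, here is a minimal sketch in plain Python. It assumes that "better" means shorter task times, so the hypotheses are $H_0$: the times of $m_0$ and $m_1$ come from the same distribution, versus $H_1$: the times of $m_1$ tend to be smaller than those of $m_0$. The function name `rank_sum_test` and the timing data are made up for illustration (the data are synthetic, not the asker's measurements), and the p-value uses the large-sample normal approximation without a continuity correction.

```python
import math
import random


def rank_sum_test(x, y):
    """One-sided Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation.

    H0: x and y come from the same distribution.
    H1: values in x tend to be smaller than values in y.
    Returns (U statistic of x, one-sided p-value).
    """
    n_x, n_y = len(x), len(y)
    n = n_x + n_y
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign ranks 1..n; tied values share the average of their ranks.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t  # feeds the tie correction of the variance
        i = j

    # U counts the (x, y) pairs in which the x value is the larger one.
    rank_sum_x = sum(r for r, (_, label) in zip(ranks, pooled) if label == 0)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2.0

    mean_u = n_x * n_y / 2.0
    var_u = n_x * n_y / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value is identical: no evidence either way
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    # Lower-tail probability P(Z <= z) of the standard normal distribution.
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))
    return u_x, p_value


# Synthetic stand-ins for the logged times (hypothetical values).
random.seed(0)
t0 = [random.gauss(0.30, 0.05) for _ in range(100)]  # machine m_0
t1 = [random.gauss(0.25, 0.05) for _ in range(100)]  # machine m_1

# Put m_1 first, because H1 says the first sample tends to be smaller.
u, p = rank_sum_test(t1, t0)
print(f"U = {u:.1f}, one-sided p-value = {p:.3g}")
```

If the p-value falls below the significance level chosen in advance (0.05 is a common choice), $H_0$ is rejected in favour of $m_1$ being faster. In practice a library routine is preferable to hand-rolled code; `scipy.stats.mannwhitneyu(t1, t0, alternative='less')` performs the same test.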