Help on selecting 2 tests - unsure if they are paired or non-paired statistical test

I have data that measures the temperature of flowers and leaves on different plant species. These are the comparisons I am trying to make:

  1. Comparing the flower and leaf of the same plant at the same time of day
  2. Comparing the flower or leaf of a plant to the same flower or leaf at different times of day - on the same plant

My current interpretation is that comparison 1 calls for an independent (unpaired) test because, while the flower and leaf are both on the same individual plant, they are separate "individuals" themselves. For comparison 2 I believe a paired test is right, as it measures the same flower or leaf at different times of day.

What is confusing me is that I have been taught that if it is the same "individual" then it should be paired, therefore 1 should be a paired test. However my interpretation is that "individual" means something different in the mathematical reality of the tests, where they are seen as different pools in this instance.

Am I correct in my thought? Or completely wrong?

Thank you

Edit: Maybe some clarification to my thought: The difference between my 2 tests is that one uses the same repeated leaf or flower (referring to case 2) at different times of day (therefore it is paired as it is a measurement of the same thing twice at different times). This contrasts with case 1 as this is comparing 2 different objects at the same time of day. The confusion for me comes in when case 2 is still on the same individual - they are related in a way, but I'm unsure if this is relevant to the test.

Best answer

The experimental unit (presumably sampled at random from a relevant population) is the plant.

Looking at different parts of a plant gives two (paired) values per plant. Looking at different times of day gives two (paired) values per plant.

Both are paired tests.

While paired tests and two-sample tests both look at averages of the data, they are fundamentally different.

Example:

Here are 20 simulated temperature values for 20 plants:

X1:
43 48 33 35 36 38 37 47 35 44 31 37 36 44 46 37 36 40 37 46
X2:
45 48 33 38 37 39 43 46 37 49 37 39 41 46 48 42 39 45 39 48

A paired test looks at differences in the values for each plant:

D:
2  0  0  3  1  1  6 -1  2  5  6  2  5  2  2  5  3  5  2  2

Notice that the difference $D_i = X_{2i} - X_{1i},$ for $i = 1, 2, \dots, 20,$ is positive for 17 of the 20 plants, zero for two, and negative for only one. So it is not surprising that a paired test finds a highly significant difference between the two types of measurement (locations or times) for the plants. Results from R statistical software give a P-value of about 0.000015:

t.test(X2, X1, paired=TRUE)

        Paired t-test

data:  X2 and X1
t = 5.7558, df = 19, p-value = 1.513e-05
alternative hypothesis: 
   true difference in means is not equal to 0
95 percent confidence interval:
 1.686359 3.613641
sample estimates:
mean of the differences 
                   2.65 
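
One way to see what the paired test does is to note that it is the same as a one-sample t test on the differences. Below is a minimal R sketch using the data listed above; the names `D`, `paired_fit`, `one_sample_fit`, and `t_manual` are my own, introduced only for illustration.

```r
# Simulated temperatures from the example: two measurements per plant.
X1 <- c(43, 48, 33, 35, 36, 38, 37, 47, 35, 44,
        31, 37, 36, 44, 46, 37, 36, 40, 37, 46)
X2 <- c(45, 48, 33, 38, 37, 39, 43, 46, 37, 49,
        37, 39, 41, 46, 48, 42, 39, 45, 39, 48)

# Within-plant differences: one value per experimental unit (plant).
D <- X2 - X1

# Paired t test on the two vectors.
paired_fit <- t.test(X2, X1, paired = TRUE)

# One-sample t test of H0: mean difference = 0, applied to D.
one_sample_fit <- t.test(D, mu = 0)

# The same statistic computed by hand: mean(D) / (sd(D) / sqrt(n)).
t_manual <- mean(D) / (sd(D) / sqrt(length(D)))

# All three t statistics should agree.
print(c(paired     = unname(paired_fit$statistic),
        one_sample = unname(one_sample_fit$statistic),
        manual     = t_manual))
```

Because only the 20 differences enter the calculation, plant-to-plant variation (some plants are simply warmer than others) cancels out of the test.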

By contrast, if we incorrectly run a 2-sample t test on these data, the information inherent in the pairing is lost, and no significant difference is found at the 5% level. The P-value is about 9%.

[As is customary, I have used a Welch 2-sample t test, which does not require equal population variances, but the pooled 2-sample t test gives nearly the same result for these particular data.]

t.test(X2, X1)

    Welch Two Sample t-test

data:  X2 and X1
t = 1.7208, df = 37.79, p-value = 0.09346
alternative hypothesis: 
   true difference in means is not equal to 0
95 percent confidence interval:
 -0.4680369  5.7680369
sample estimates:
mean of x mean of y 
    41.95     39.30 
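
Both tests estimate the same mean difference (2.65); what changes is the standard error it is compared against. The sketch below, again with illustrative variable names of my own, compares the two standard errors and checks the identity $\mathrm{Var}(D) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2)$ on the sample.

```r
# Same simulated data as in the example.
X1 <- c(43, 48, 33, 35, 36, 38, 37, 47, 35, 44,
        31, 37, 36, 44, 46, 37, 36, 40, 37, 46)
X2 <- c(45, 48, 33, 38, 37, 39, 43, 46, 37, 49,
        37, 39, 41, 46, 48, 42, 39, 45, 39, 48)
n <- length(X1)
D <- X2 - X1

# Variance of the differences, directly and via the covariance identity.
var_D      <- var(D)
var_decomp <- var(X1) + var(X2) - 2 * cov(X1, X2)

# Standard error used by the paired test.
se_paired <- sqrt(var_D / n)

# Standard error used by the Welch 2-sample test (ignores the covariance).
se_welch <- sqrt(var(X1) / n + var(X2) / n)

print(c(var_D = var_D, var_decomp = var_decomp))
print(c(se_paired = se_paired, se_welch = se_welch))
```

When the two measurements on a plant are strongly positively correlated, the covariance term is large, so the paired standard error is much smaller than the two-sample one. That is the information the 2-sample test throws away.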

Ordinarily, the two vectors of observations are positively correlated for paired data. That is the case here:

cor(X1, X2)
[1] 0.9131605

The data are positively correlated, and on a scatterplot of X2 against X1 all but one point lie on or above the 45-degree line (two points lie exactly on it).

[Figure: scatterplot of X2 against X1 with the 45-degree line.]
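
A scatterplot of this kind can be redrawn with base R graphics. This is only a sketch of one possible version, not the original figure; the axis labels and the names `above`, `on_line`, and `below` are my own choices.

```r
# Same simulated data as in the example.
X1 <- c(43, 48, 33, 35, 36, 38, 37, 47, 35, 44,
        31, 37, 36, 44, 46, 37, 36, 40, 37, 46)
X2 <- c(45, 48, 33, 38, 37, 39, 43, 46, 37, 49,
        37, 39, 41, 46, 48, 42, 39, 45, 39, 48)

# Count plants relative to the 45-degree line X2 = X1.
above   <- sum(X2 > X1)   # second measurement higher
on_line <- sum(X2 == X1)  # no change
below   <- sum(X2 < X1)   # second measurement lower
print(c(above = above, on_line = on_line, below = below))

# Scatterplot with the 45-degree reference line.
plot(X1, X2, pch = 19,
     xlab = "First measurement (X1)", ylab = "Second measurement (X2)")
abline(a = 0, b = 1, col = "blue")
```

Points above the line are plants whose second measurement is higher than the first, so a cloud sitting mostly above the line is the graphical counterpart of the mostly positive differences in D.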