I have data that measures the temperature of flowers and leaves on different plant species. These are the comparisons I am trying to make:
1. Comparing the flower and leaf of the same plant at the same time of day
2. Comparing a flower or leaf with the same flower or leaf, on the same plant, at different times of day
My current interpretation is that comparison 1 calls for an independent-samples test because, although the flower and leaf are on the same individual plant, they are separate "individuals" themselves. For comparison 2, I believe a paired test is appropriate, as it measures the same flower or leaf at different times of day.
What is confusing me is that I have been taught that if it is the same "individual" then the test should be paired, so comparison 1 should be a paired test. However, my interpretation is that "individual" means something different in the mathematical reality of the tests, where the flower and leaf are seen as different pools in this instance.
Am I correct in my thought? Or completely wrong?
Thank you
Edit: Maybe some clarification to my thought: The difference between my 2 tests is that one uses the same repeated leaf or flower (referring to case 2) at different times of day (therefore it is paired as it is a measurement of the same thing twice at different times). This contrasts with case 1 as this is comparing 2 different objects at the same time of day. The confusion for me comes in when case 2 is still on the same individual - they are related in a way, but I'm unsure if this is relevant to the test.
The experimental unit (presumably sampled at random from a relevant population) is the plant.
Looking at different parts of a plant gives two (paired) values per plant. Looking at different times of day also gives two (paired) values per plant.
Both are paired tests.
While paired tests and two-sample tests both look at averages of the data, they are fundamentally different.
Example:
Here are simulated temperature values, two per plant, for 20 plants:
A paired test looks at differences in the values for each plant:
Notice that the difference $D_i = X_{1i} - X_{2i},$ for $i = 1, 2, \dots, 20,$ is positive for all but one plant. So it is not surprising that a paired test finds a highly significant difference between the two types of measurement (locations or times) for the plants. Results from R statistical software: the P-value is almost 0.
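For reference, the paired test is a one-sample t test on the differences. In standard notation (the symbols $\bar D,$ $S_D,$ and $n$ are introduced here for illustration):

$$\bar D = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad S_D^2 = \frac{1}{n-1}\sum_{i=1}^{n} (D_i - \bar D)^2, \qquad T = \frac{\bar D}{S_D/\sqrt{n}}.$$

If the differences are approximately normal and the true mean difference is 0, then $T$ has Student's t distribution with $n - 1$ degrees of freedom, which is 19 for $n = 20$ plants. Consistent differences of the same sign make $S_D$ small, and so make $|T|$ large.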
By contrast, if we incorrectly run a 2-sample t test on these data, the information inherent in the pairing is lost, and no significant difference is found. The P-value is about 9%.
[As is customary, I have used a Welch 2-sample t test, which does not require equal population variances, but the pooled 2-sample t test gives nearly the same result for these particular data.]
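The contrast between the two tests can be illustrated with a minimal Python sketch. The eight pairs of temperatures below are hypothetical values made up for this illustration, not the simulated data discussed in this answer, and the helper names `paired_t` and `welch_t` are likewise assumptions. Only the t statistics are computed, not P-values.

```python
import math
import statistics

# Hypothetical temperatures (deg C) for 8 plants, two values per plant.
# These are made-up illustration values, NOT the data from the answer.
x1 = [21.3, 25.1, 18.7, 29.4, 23.0, 26.8, 20.2, 27.5]
x2 = [20.6, 24.5, 18.9, 28.3, 22.1, 26.0, 19.5, 26.4]


def paired_t(a, b):
    """Paired t statistic: a one-sample t statistic on the differences."""
    d = [ai - bi for ai, bi in zip(a, b)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def welch_t(a, b):
    """Welch two-sample t statistic, which ignores the pairing."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


t_paired = paired_t(x1, x2)
t_welch = welch_t(x1, x2)
print(f"paired t = {t_paired:.2f}, Welch t = {t_welch:.2f}")
```

Both statistics have the same numerator, the difference in sample means. With these made-up values the paired statistic is roughly 4.9 and the Welch statistic roughly 0.4, because the plant-to-plant variation that inflates the two-sample standard error cancels out of the differences.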
Ordinarily, the two vectors of observations are positively correlated for paired data, and that is the case here.
On a scatterplot of the data, all but one of the points lie above the 45-degree line.
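Positive correlation matters because $\mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2),$ so a large positive covariance shrinks the variance of the differences. A minimal Python sketch of this check follows, again using hypothetical made-up values rather than the data from this answer, and assuming the first measurement is plotted on the vertical axis.

```python
import math
import statistics

# Hypothetical temperatures (deg C) for 8 plants; made-up illustration
# values, NOT the data from the answer.
x1 = [21.3, 25.1, 18.7, 29.4, 23.0, 26.8, 20.2, 27.5]
x2 = [20.6, 24.5, 18.9, 28.3, 22.1, 26.0, 19.5, 26.4]

n = len(x1)
m1, m2 = statistics.mean(x1), statistics.mean(x2)

# Sample covariance and Pearson correlation, computed by hand.
cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
v1, v2 = statistics.variance(x1), statistics.variance(x2)
r = cov / math.sqrt(v1 * v2)

# Variance of the differences, computed directly from the differences.
diffs = [a - b for a, b in zip(x1, x2)]
var_d = statistics.variance(diffs)

# Points above the 45-degree line (x1 on the vertical axis) have x1 > x2.
n_above = sum(a > b for a, b in zip(x1, x2))

print(f"r = {r:.3f}, points above the line: {n_above} of {n}")
print(f"Var(D) = {var_d:.4f}, v1 + v2 - 2*cov = {v1 + v2 - 2 * cov:.4f}")
```

The two quantities on the last line agree, which is the identity above applied to sample variances. Here the correlation is close to 1, so the variance of the differences is far smaller than either individual variance.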