Collective wisdom is desperately needed! I need to understand if some kind of significance testing is applicable here, and if that is the case - which test.
The data collection was devised as follows: There are 3 stations where people perform a number of station-specific manual tasks.
On each of three test days, different people were assigned to the stations (9 people in total). The people differed in height, age and weight.
Each person (at each station on each day) was subjected to 5000 measurements of his kinetic activity, later classified as "good" or "bad". The percentages of "good" and "bad" for each person (and day and station) are thus known.
3 people from day one are treated as a "control group", or baseline.
I would like to know if a significance test is possible for percentages @ station[x] and day one (and baseline person working at that station on day one) and percentages @ same station for a different day/person.
Say station 1 was a "drill and rotate". On day one, John worked there alone and had 80% good drill-and-rotate movements and 20% bad ones (out of 5000 measured). On day two, Jill worked there, had ongoing ergonomic instructions, and had 85% and 15% respectively.
The null hypothesis is (I am guessing) "the difference can be attributed to natural variation in humans only". I want to test the significance of "had ergonomic instructions".
Can it be done with such a setup and which formula? Or is it a faulty experiment design and such testing is impossible?
Please help. I am no statistician and I am at my wits' end.

Not nearly enough information for a definitive answer:
Days. Are there designed differences among the three Days (that carry across stations)? [You hint that John (Day 1) was somehow treated differently than Jill (Day 2). What about Day 3?]
Stations. Are there designed differences among the three Stations (that carry across days)? [You hint that this may be so. Station 1 is 'drill and rotate'. What about Stations 2 and 3?]
Tentative model. From what (little) I can gather from your description, I guess that this may be a two-factor ANOVA (3 Stations $\times$ 3 Days) with one observation per 'cell'. The model might be $$Y_{ij} = \mu + \alpha_i + \beta_j + e_{ij},$$ for $i = 1,2,3$ days and $j = 1,2,3$ stations, where independently $e_{ij} \sim \mathsf{Norm}(0, \sigma).$
The ANOVA table would have three rows: Days, Stations, Error. Because there is only one observation per cell, there would be no interaction term.
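To make the table concrete, here is a minimal sketch of the additive two-way ANOVA in plain Python. The nine fractions are invented for illustration and are not your data; the function names are my own. Because both factors have 3 levels, the numerator degrees of freedom are 2, and in that special case the F tail probability has the closed form $(1 + 2F/d_2)^{-d_2/2}$, which is what the sketch uses (in general one would call something like `scipy.stats.f.sf` instead).

```python
# HYPOTHETICAL fractions "good" out of 5000 (made up for illustration).
# Rows are days 1-3, columns are stations 1-3, one person per cell.
y = [
    [0.80, 0.72, 0.90],
    [0.85, 0.75, 0.93],
    [0.84, 0.78, 0.92],
]


def f_tail_num_df2(f, df_err):
    """P(F > f) for an F distribution with numerator df = 2 (closed form)."""
    return (1.0 + 2.0 * f / df_err) ** (-df_err / 2.0)


def additive_anova(y):
    """Two-way ANOVA without interaction for a 3 x 3 table, one value per cell."""
    a, b = len(y), len(y[0])
    assert a == 3 and b == 3, "the closed-form p-value assumes 3 levels per factor"
    grand = sum(sum(row) for row in y) / (a * b)
    day_means = [sum(row) / b for row in y]
    station_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

    ss_day = b * sum((m - grand) ** 2 for m in day_means)
    ss_station = a * sum((m - grand) ** 2 for m in station_means)
    # Residual: what is left after removing the day and station effects.
    ss_err = sum(
        (y[i][j] - day_means[i] - station_means[j] + grand) ** 2
        for i in range(a)
        for j in range(b)
    )

    df_day, df_station = a - 1, b - 1
    df_err = df_day * df_station  # (3 - 1) * (3 - 1) = 4
    ms_err = ss_err / df_err

    table = {}
    for name, ss, df in [("Days", ss_day, df_day), ("Stations", ss_station, df_station)]:
        f = (ss / df) / ms_err
        table[name] = {"SS": ss, "df": df, "F": f, "p": f_tail_num_df2(f, df_err)}
    table["Error"] = {"SS": ss_err, "df": df_err}
    return table


table = additive_anova(y)
for name, row in table.items():
    print(name, row)
```

Note that the Error row has only 4 degrees of freedom, which is one way to see how little information nine cells provide.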
Data. Data would be fractions Good out of 5000. Although these are technically binomial proportions, I guess my first try at analysis would be to treat the data as normal (as presaged by the tentative model) because the number of trials per subject is so large. I would certainly want to check whether the nine residuals seem consistent with normality.
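For a sense of scale, here is the binomial standard error of one observed fraction, using John's 80% from your example. This assumes (ideally, and probably too optimistically) that one person's 5000 measurements are independent with a constant probability of "good":

$$\mathrm{SE}(\hat p) = \sqrt{\frac{\hat p\,(1-\hat p)}{n}} = \sqrt{\frac{0.80 \times 0.20}{5000}} \approx 0.0057.$$

Under that assumption each fraction is pinned down to roughly half a percentage point. Consecutive movements by the same person are likely correlated, so the real uncertainty is probably somewhat larger.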
Contrast. It seems that Day 1 ('treated as a control') might be treated differently from Days 2 and 3. If that is so, and the Day-effect is significant, you might want to test whether the designed contrast (comparing Day 1 against the other two), based on the coefficient vector $c=(1, -.5, -.5)$ is significant.
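Under the tentative additive model with one observation per cell, the contrast and its test statistic would take the following standard form for a balanced design. Here $\bar Y_{i\cdot}$ denotes the mean over the three stations on day $i$ and $\mathrm{MSE}$ is the Error mean square from the ANOVA table (notation I am introducing for this sketch):

$$\hat L = \sum_{i} c_i \bar Y_{i\cdot} = \bar Y_{1\cdot} - \tfrac12\left(\bar Y_{2\cdot} + \bar Y_{3\cdot}\right), \qquad t = \frac{\hat L}{\sqrt{\mathrm{MSE}\,\sum_i c_i^2/3}} = \frac{\hat L}{\sqrt{\mathrm{MSE}/2}},$$

since $\sum_i c_i^2 = 1 + 0.25 + 0.25 = 1.5$. The statistic $t$ would be referred to Student's t distribution with $(3-1)(3-1) = 4$ degrees of freedom.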
Design flaws. Without seeing the data or understanding the nature of the effects under study, I'm guessing that it would have been better to have five times as many subjects, each doing 1000 trials. Unless effects are profoundly large, or variability is much smaller than is usual with human subjects, I suppose that one subject in each of the $3 \times 3$ cells will not provide enough power to find significant differences. (Even twice as many subjects, with 2500 trials each and 2 observations per cell, would have been a lot better.)
Are any apparent differences among subjects due to personal differences, or are they due to different performances of the same subject across time? With this design we'll never know. Did anyone stop to think that three randomly chosen subjects are hardly enough for a 'baseline' of any kind?
The time to think about analyzing data from a study is before the study is done. It is regrettable if so much effort has been expended with no clear model or strategy for analysis in mind.