I think I may have uncovered a corrupt box draw practice in greyhound racing. Can someone please help me find the truth using math?
It is rumored that one trainer is being given preferential box draw treatment for greyhound racing. Do the following numbers fall outside the accepted deviation of what would, could, or should happen in a "blind" box draw?
There are $8$ runners in every race. Box $-1-$ has a massive advantage in greyhound racing, and box $-5-$ a massive disadvantage.
One trainer has drawn box $-1-$ $232$ times from $1621$ runners and box $-5-$ $163$ times from $1621$. Is he being given preferential treatment? Do the statistics prove that this is not a random draw?
Another trainer, for example, has drawn box $-1-$ $272$ times from $2183$ starters and box $-5-$ $309$ times from $2183$ starters. Is he being put at a disadvantage?
I would appreciate any help from someone who knows about this type of randomness, probability, statistics, or whatever expert field this is. Is there a pattern or proof of something not quite right?
Or is all this just the way the cookie crumbles?
If boxes 1 and 5 are, in theory, equally likely, a simple test is whether the total number of draws of those two boxes is split evenly between 1's and 5's. In your first example there are nearly 400 draws of 1-or-5, and one standard deviation of the total number of "1" draws in this pool (equivalent to tossing a fair coin 400 times and counting how many Heads appear) is $\sqrt{400}/2 = 10$, so very close to 10 here. The split of 232 vs 163 is roughly a 3-standard-deviation event, about 1 in 700 probability.
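For anyone who wants to check the coin-toss comparison without the rounding, here is a minimal Python sketch. It assumes, as above, that each 1-or-5 draw is an independent fair coin flip; the function name `split_stats` is just an illustrative choice, not something from the question.

```python
from math import comb, sqrt

def split_stats(ones: int, fives: int) -> tuple[float, float]:
    """Return (z, tail) for a box-1 vs box-5 split under a fair-coin model.

    z    : how many standard deviations the box-1 count lies from n/2
    tail : exact one-sided P(X >= ones) for X ~ Binomial(n, 1/2)
    """
    n = ones + fives
    mean = n / 2
    sd = sqrt(n) / 2  # standard deviation of Binomial(n, 1/2)
    z = (ones - mean) / sd
    # Exact upper tail: count the outcomes with at least `ones` heads.
    tail = sum(comb(n, k) for k in range(ones, n + 1)) / 2 ** n
    return z, tail

if __name__ == "__main__":
    for label, ones, fives in [("first trainer", 232, 163),
                               ("second trainer", 272, 309)]:
        z, tail = split_stats(ones, fives)
        print(f"{label}: z = {z:.2f}, P(X >= {ones}) = {tail:.5f}")
```

For the first trainer the z-score comes out at about 3.47 rather than a round 3, so the "1 in 700" figure is, if anything, on the generous side. For the second trainer the z-score is about -1.5, which is well within ordinary variation.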
However, you became suspicious of this split by looking at the luckiest trainer. So in fact we have a process where results are generated for $T \geq 8$ trainers, and the luckiest one is singled out for analysis. The probability of seeing any particular rare outcome that way is about $T$ times higher. If there were $8$ trainers, the probability would be above 1 percent and maybe somewhat higher, like 1.5 or 2 percent, if computations were done carefully without the approximations that I am using here. This is at the border of plausible randomness and suspicion; the chances are appreciable for the observed level of imbalance between boxes 1 and 5 to have occurred for at least one of the trainers, but probabilities on the order of 1 percent could signal something unusual, especially where there is an incentive to cheat.
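The "about $T$ times higher" adjustment can be written out in a few lines of Python. This is only a sketch: it assumes the trainers' results are independent, which is not exactly true for a real box draw, and the helper name `prob_any_trainer` is made up for illustration.

```python
def prob_any_trainer(p_single: float, trainers: int) -> float:
    """P(at least one of `trainers` independent trainers shows the rare split).

    Assumes independence between trainers, which is an approximation.
    """
    return 1.0 - (1.0 - p_single) ** trainers

if __name__ == "__main__":
    p = 1 / 700  # the rounded single-trainer figure used above
    for t in (8, 16, 32):
        exact = prob_any_trainer(p, t)
        rough = t * p  # the simple "T times higher" approximation
        print(f"T = {t:2d}: 1-(1-p)^T = {exact:.4f}, T*p = {rough:.4f}")
```

For small $p$ the two columns nearly coincide, which is why multiplying by $T$ is a reasonable shortcut here.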
If things are simulated accurately, the results for the different trainers are not independent of each other, which will affect the numerical probabilities somewhat but not qualitatively change things. The main issue seems to be how many trainers there are, and how improbable an outcome you would consider as sufficient to trigger suspicion.
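A rough simulation along these lines might look like the following Python sketch. It rests on a deliberately simplified, hypothetical setup that is not in the question: exactly 8 trainers, each with one runner in every race, and boxes assigned by a uniformly random shuffle. That setup builds in the dependence between trainers, since only one of them can draw box 1 in any given race. The function name and parameters are illustrative.

```python
import random
from math import sqrt

BOXES = 8  # assumed toy model: 8 trainers, one runner each per race

def simulate_luckiest(races: int = 1621, trials: int = 200,
                      z_threshold: float = 3.0, seed: int = 0) -> float:
    """Estimate P(some trainer's box-1 vs box-5 split has z >= z_threshold)."""
    rng = random.Random(seed)
    boxes = list(range(1, BOXES + 1))
    hits = 0
    for _ in range(trials):
        ones = [0] * BOXES
        fives = [0] * BOXES
        for _ in range(races):
            rng.shuffle(boxes)           # boxes[t] is the box drawn by trainer t
            ones[boxes.index(1)] += 1    # the trainer who drew box 1
            fives[boxes.index(5)] += 1   # the trainer who drew box 5
        # z-score of the luckiest trainer: (ones - fives) / sqrt(ones + fives)
        best = max((a - b) / sqrt(a + b)
                   for a, b in zip(ones, fives) if a + b > 0)
        if best >= z_threshold:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(simulate_luckiest())
```

With only a couple of hundred trials the estimate for a 3-standard-deviation threshold is very noisy, so `trials` would need to be raised considerably before reading much into the result.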