I have a very large dataset containing horse racing results. What intrigues me is how much the time taken varies from one race track to another, which has led me to look into causal factors.
The part I'm completely stumped on is how to interpret an individual result. For example, let's say that we have this result for Horse A:
Distance: 1100 yards
Track: Ascot
Track conditions: Soft
Class of race: 5
Time Taken: 71.02 seconds
Now, we might assume we can look up how the 71.02 seconds compares with previous races run under the same conditions (same track, distance, track conditions and class), so that we are comparing like-for-like. That is fair enough, I think.
However, let's assume that Horse A is next running at a new track with the same conditions, but the topography is wholly different. Is there a way I can build a catalog of averages/standards, etc. that would allow me to make a meaningful inference from the 71.02 in Horse A's previous run?
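
To make the idea concrete, here is a minimal sketch of the kind of catalog I have in mind. It assumes a "standard" is simply the median time for each combination of track, distance, track conditions and class, and that a run can be carried across tracks as a ratio to that standard. Both are my assumptions, not an established method. The function names are hypothetical, "Track B" is a placeholder, and every time except the 71.02 is made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Rows: (track, distance_yards, conditions, race_class, time_seconds).
# Only the Ascot 71.02 row comes from the example above; the rest are invented.
RESULTS = [
    ("Ascot", 1100, "Soft", 5, 71.02),
    ("Ascot", 1100, "Soft", 5, 70.40),
    ("Ascot", 1100, "Soft", 5, 72.10),
    ("Track B", 1100, "Soft", 5, 66.90),
    ("Track B", 1100, "Soft", 5, 67.50),
    ("Track B", 1100, "Soft", 5, 68.10),
]


def build_standards(results):
    """Map each (track, distance, conditions, class) key to its median time."""
    groups = defaultdict(list)
    for track, distance, conditions, race_class, time_s in results:
        groups[(track, distance, conditions, race_class)].append(time_s)
    return {key: median(times) for key, times in groups.items()}


def relative_time(time_s, key, standards):
    """Express a run as a ratio to the standard for its own conditions.

    A value below 1.0 means faster than standard, above 1.0 means slower.
    """
    return time_s / standards[key]


def projected_time(time_s, from_key, to_key, standards):
    """Carry a run's ratio over to another track's standard.

    Assumes (a strong assumption) that a horse keeps the same ratio to
    standard when the track changes.
    """
    return relative_time(time_s, from_key, standards) * standards[to_key]


if __name__ == "__main__":
    standards = build_standards(RESULTS)
    ascot = ("Ascot", 1100, "Soft", 5)
    track_b = ("Track B", 1100, "Soft", 5)
    print(relative_time(71.02, ascot, standards))
    print(projected_time(71.02, ascot, track_b, standards))
```

The median is used rather than the mean so that a few unusually slow or fast races do not drag the standard around. Whether a single ratio is enough to absorb differences in topography is exactly the part I am unsure about.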