Optimizing a set of rules to better predict the outcome of events


I'm trying to better predict the top three finishers of the next 1000 800m men's freestyle swimming races.

I've got a set of rules to rate the swimmers:

1) Add 5 points if the swimmer won his last race

2) Subtract 3 points if the swimmer is less than 6' tall

3) Add 10 points if the swimmer won his last two races

4) Add 3 points if the swimmer wears a full body swimming suit

5) Subtract 10 points if the swimmer's last race time was over 8 minutes
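To make the rules concrete, here is a minimal sketch of the rating as code. The `Swimmer` fields and the `rule_flags` and `rate` functions are hypothetical names used only for illustration, and the sketch assumes the rules stack (a swimmer who won his last two races gets both the rule 1 and the rule 3 points).

```python
from dataclasses import dataclass


@dataclass
class Swimmer:
    # Hypothetical fields, one per input the five rules need.
    won_last_race: bool
    won_last_two_races: bool
    height_inches: float
    full_body_suit: bool
    last_race_seconds: float


# Points per rule; these are the numbers I would like to tune.
WEIGHTS = {"rule1": 5, "rule2": -3, "rule3": 10, "rule4": 3, "rule5": -10}


def rule_flags(s):
    """Return 1 where a rule fires for this swimmer, else 0."""
    return {
        "rule1": int(s.won_last_race),
        "rule2": int(s.height_inches < 72),          # less than 6' tall
        "rule3": int(s.won_last_two_races),
        "rule4": int(s.full_body_suit),
        "rule5": int(s.last_race_seconds > 8 * 60),  # over 8 minutes
    }


def rate(s, weights=WEIGHTS):
    """Sum the points of every rule that fires (assumes rules stack)."""
    flags = rule_flags(s)
    return sum(weights[name] * flags[name] for name in weights)
```

Written this way, each swimmer's rating is a weighted sum of 0/1 rule flags, so optimizing the rules amounts to choosing the values in `WEIGHTS`.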

I want to optimize my set of rules so that, more often than now, the swimmer I rate highest finishes first, the swimmer with the second-highest rating finishes second, and the swimmer with the third-highest rating finishes third.

I can back-test over the last 5000 swimming races to optimize the rules.

For example, for a race on Jan 1, 2013, I have the following results:

swimmerID | rule1 | rule2 | rule3 | rule4 | rule5 | finalRating | finishPosition
Swimmer1  |   5   |  -3   |   0   |   3   |   0   |      5      |       1
Swimmer2  |   5   |   0   |  10   |   0   |   0   |     15      |       4
Swimmer3  |   5   |   0   |   0   |   3   |   3   |     11      |       3
Swimmer4  |   5   |   0   |   0   |   0   |   3   |      8      |       2
Swimmer5  |   5   |  -3   |  10   |   3   |   3   |     18      |       6
Swimmer6  |   5   |  -3   |   0   |   0   |   0   |      2      |       5
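One possible way to score the current rules on this race is to rank the swimmers by `finalRating` and count how many of the top three landed in exactly the predicted position. This is only a sketch: the function name `top_three_hits` and the exact-position scoring are my own assumptions, not an established metric, and ties in rating are not handled specially.

```python
# (swimmerID, finalRating, finishPosition) copied from the results above.
race = [
    ("Swimmer1", 5, 1),
    ("Swimmer2", 15, 4),
    ("Swimmer3", 11, 3),
    ("Swimmer4", 8, 2),
    ("Swimmer5", 18, 6),
    ("Swimmer6", 2, 5),
]


def top_three_hits(race):
    """Count how many of the three highest-rated swimmers finished
    in exactly the position their rating rank predicts."""
    ranked = sorted(race, key=lambda row: row[1], reverse=True)
    return sum(
        1
        for rank, (_, _, finish) in enumerate(ranked[:3], start=1)
        if finish == rank
    )


print(top_three_hits(race))  # 1: only Swimmer3 (third by rating) finished third
```

Summing a score like this over all 5000 historical races would give a single number that changes as the rule points change, which is the quantity I would want to maximize.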

I can't figure out how to model this problem. I originally thought it was a simple linear programming (LP) problem, but I couldn't get that to work.