I'm a software engineer, currently trying to solve a problem that can be modelled by the allocation of a known number of balls $n$ to a known number of bins $r$ ($0\leq n \leq 1000000$, $1 \leq r \leq 100$). The difficulty I'm having is that:
- Each bin has an associated cost per ball, and the total cost must be kept below some known value
- I need to maximise the 'spread' of the balls across bins as much as possible.
My original idea was to formulate the problem as an integer programming problem with
- constraints encoding the maximum overall cost and the number of balls
- an objective function of:
$$ \sum^r_{i=0}a\ln(x_i+1) $$
where $x_i$ is the number of balls in the bin $i$ and $a$ is some constant chosen to keep numbers reasonable. The intention here is that a log function, with its decreasing gradient, 'rewards' each new ball in a bin less, and therefore encourages the algorithm to add balls more to emptier bins. This function isn't set in stone, though - anything with similar properties will do!
The difficulty I'm having is that I can't figure out how to solve this problem. Obviously, there is theoretically a polynomial-time solution (a trivial example being "enumerate all possible points") but this is intractable in the real world as I'm working in tens of sizable dimensions.
My most recent thinking has been that a log function is easily differentiable, so something like a hill-climbing approach might work, but I'm unsure how to apply that to an integer problem space, and I'm running out of ideas on what to Google. I've also already come across this question, which seems to be asking about a similar problem, but I'm not sure it's that applicable to my situation.
TIA!
So I've written a solution based on @Karl's first comment where I generate a uniform solution and evenly reallocate balls from above-cost to below-cost bins, and it takes ~$40$ms for a problem size of ~$74000$ balls and $13$ bins, which is perfectly acceptable for my needs!
The solution this generates is essentially:
I'm satisfied with this solution, so I'm going to mark this message as the answer for my particular question (unless any even smarter ideas appear).