I found the video "YAHOO Interview Puzzle || Camel and Bananas || Logic + Optimization" on the LOGICALLY YOURS YouTube channel.
Here's the essence of it:
- you must transport bananas from an initial location to a market 1000 km away
- you can only use a camel for transporting them
- the camel requires (= consumes) 1 banana for each km it travels, and can carry at most 1000 bananas at any one time
- if you have 3000 bananas to start with, at the initial location, what is the max number of bananas you can eventually get to the market?
In the video, they address it using a lot of heuristics / not much explanation, which is the part that I am trying to figure out / reduce to a more systematic or formal approach.
Here is their approach:
You need to transport the bananas using 2 intermediate stops between the initial location and the market (proof?). Starting a forward journey with fewer than 1000 bananas is wasteful (proof?). Therefore, 5 trips are necessary between the initial storage and the first stop (proof?), and 2000 bananas must be at the first stop when the initial storage is empty, which implies the first stop must be at 200 km (this they actually do prove). Therefore, 3 trips are necessary between the first stop and the second stop (proof?), and 1000 bananas must be at the second stop when the first stop is empty, which implies the second stop must be at 533 km, and the final answer is $1000 - (1000 - 533) = 533$ bananas at the market (this is actually wrong under the assumption that no 'fractional' bananas can be consumed, as it would leave 1 banana at the second stop; the correct answer is 534 km, leaving 532 bananas at the market).
I've been trying to frame this problem in several ways, e.g. as a linear program (see below), but I keep circling back to having to use the assumptions or heuristics they used in the video.
Is it possible that one cannot solve this without making all these 'guesses'? Is there a more 'formal' way of tackling this? And especially is this problem a 'known' one in mathematics, with an 'official' solution or approach that guarantees its optimality?
PS : describing here my (failed) attempts to solve this without guessing.
One interesting aspect of this problem, I find, is that without any intermediate stops one would have to travel 5000 km to empty the initial storage (and it couldn't even be done, as one would be stuck without fuel after the first forward trip of 1000 km with 1000 bananas).
The use of intermediate stops not only allows to carry at least some bananas to the destination, but also drastically reduces the total distance travelled.
This made me realize that in fact this is also a minimization of the total distance travelled, because the number of bananas left at the end is 3000 minus that.
So my first naive attempt (under the assumption that 5 trips were required to carry all the bananas to the first stop) was to look for a value of $x$ (distance of the first stop from the start) that maximized the product of the forward distance reached by the number of bananas left:
$x \cdot (3000 - 5 \cdot x)$
This turns out to be $x = 300$, but leaves only 1500 bananas at the first stop.
Repeating the same approach for $y$ (distance of the second stop from the first stop), i.e. maximizing $y \cdot (1500 - 3 \cdot y)$, results in $y = 250$ and 750 bananas left.
450 km still need to be covered in the last forward trip, leaving 300 bananas overall.
Much fewer than the 532 they got by their heuristic method.
My second idea was to find $x$ such that the number of bananas left at the first stop would be maximal, when all the initial stock had been moved.
Obviously this didn't work, as $x = 1$ would be the best solution, leaving 2995 bananas, but then regardless of the value of $y$, the final number of bananas would be much lower than 532.
This made me think that the values of $x$ and $y$ cannot be determined independently (i.e. one cannot solve this problem by finding a 'best' value for $x$, ignoring all the rest, and then doing the same for $y$).
But I am not sure of this conclusion, and I cannot find an expression to link $x$ and $y$.
Unless I am mistaken, the number of trips one needs to make to remove a total of $N_0$ bananas, by taking $n$ bananas at a time, is:
$2 \cdot \lceil \frac {N_0} n \rceil - 1$
As bananas are consumed at each trip, it would seem advantageous to keep $n$ as large as possible.
This seems to prove the video's assumption that the maximal possible number of bananas should be taken on any forward trip, i.e. $n = 1000$, and to imply that $5$ trips (3 forward and 2 backward) must be used to empty the initial storage.
At that point, however, $x$ is not known yet, and it is not clear to me what implies that one should choose $y$ so that only 3 trips are left to make.
If I assume that up to 3 trips must be made (2 forward, 1 backward), transporting $n_1, n_2$ bananas, respectively, in the forward trips from $x$ to $x+y$, and that a single final forward trip needs to be made from $y$ to the market, then for a given choice of $y$:
So this could potentially be seen as a linear program in $x, y, n_1, n_2$:
$argmax \ n1 + n2 - 3 \cdot y - (1000 - x - y)$
with several constraints (I am probably putting some redundant ones in) :
$x \ge 1$
$y \ge 1$
$n_1 \le 1000$
$n_2 \le 1000$
$3000 - 5 \cdot x - n_1 - n_2 = 0$
$n_1 - 2 \cdot y \ge 1$
$n_1 + n_2 - 3 \cdot y \ge 1$
$n_1 + n_2 - 3 \cdot y \le 1000$
This works (solved using R's lpSolve).
It yields $x = 200, y = 334, n_1 = n_2 = 1000$.
But is it the correct approach?
And how do I know that it is best to have only stop $x+y$ between $x$ and the market, rather than 2 or more stops?
If I had a different initial number of bananas, and/or a different maximal carrying capacity, and/or a different total distance, I would be left guessing, as I have developed no 'method' to tackle this.
Any ideas/suggestions?
EDIT based on Paul Sinclair's suggestions
Let:
$m \in \mathbb Z^+$ = maximal # bananas that can be carried at any one time by the camel
$x_0 = 0$ (position of the initial storage)
$x_i \in \mathbb Z^+, \forall i \ge 1$ = distance between stop $i-1$ and stop $i$
$b_0 \in \mathbb Z^+$ = # bananas at the initial storage at the start
$b_i, \forall i \ge 1$ = # bananas left at stop $i$ when stop $i-1$ is empty
$D\in \mathbb Z^+$ = distance of the market from the initial storage
$\sum_{i=1}^k = D$ (meaning that the market is exactly at stop $k$)
Assuming that the initial storage and all intermediate stops before the final destination (market) must be empty at the end, what are the required number of stops $k$, and the corresponding $k$ values of $x_i$, which result in the maximal $x_k$ ?
As per Paul's suggestion (except for the indices):
$b_i = b_{i-1} - (2 \cdot \lceil \frac {b_{i-1}} m \rceil - 1) \cdot x_i$
[editing halted - apparently the hypotheses I am using are incorrect / do not accurately represent the linked Youtube problem]



As you have noted, trying to carry bananas all the way leaves you with $0$ bananas at the market and a camel that won't go back for the others as it has no bananas to eat. So you have to move forward in stages. (how many stages comes out from working the problem. They have mentioned the answer at the start, but that number is not a necessary consideration.)
So make your first stop at some distance $x$.
The fewer bananas the camel carries on a trip, the more trips it has to make to get all $3000$ bananas moved (or eaten). So maximizing the number of bananas carried per trip gives you the fewest eaten bananas.
To move the 3000 bananas will require three trips out, and therefore two trips back to pick up more bananas. Thus five trips total, eating $5x$ bananas. The highest $x$ can be is $600$. Otherwise the camel would run out of bananas
The idea here is that in the stage you want the camel to start off with full loads too. So the amount of bananas at the end of the first stage should be a multiple of $1000$, and since you cannot make it all the way in a second stage either (which you can check by the same logic as shows that going all the way in one trip will not work), a third stage should also be a multiple of $1000$. This can only be if the camel consumes multiples of $1000$ bananas a stage (except the final one), and with only $3000$ bananas total, that will have to be $1000$ bananas consumed for the first two stages.
However, while this sounds reasonable, I do not find it fully credible without closer examination. I would rather leave the first day's journey as some variable distance $x$, then optimize once I have a full expression for the cost.
Accepting that there are $2000$ bananas left at the first stop, the reasoning is the same as before: the camel will need two trips out to carry the $2000$ bananas, and one trip back to pick up the second load.
Not quite. If the camel eats $1000$ bananas for its second stage, then it can move $333\frac 13$ km for the second stage. Putting that stop at $200 + 333\frac 13 = 533\frac13$ km from the initial location, not $533$ km. I see nothing that restricts the stages to an integral number of kilometers. So the final day the camel travels $1000 - 533\frac 13 = 466\frac23$ km and eats that many bananas. Feed it the remaining $\frac 13$ of a banana as a reward, and you are left with $533$ bananas, and a camel to sell, since taking it back home would turn your profit into a loss.
As I stated earlier, the assumption that letting the camel eat exactly $1000$ bananas a stage would result in an optimal solution is a bit too much for me to just take unexamined. Instead I would assume $k$ stages of length $x_1, x_2, \dots, x_k$ with $\sum_{i=1}^k x_k = 1000$. At the start of each stage there are $b_n$ remaining bananas. So $b_1 = 3000$ and $$b_i = b_{i-1} - \left(2\left\lceil \frac {b_{i-1}}{1000}\right\rceil - 1\right)x_{i-1}$$ (where $\lceil t \rceil$ denotes the least integer $\ge t$, and there are $\left\lceil \frac {b_i}{1000}\right\rceil$ outward trips in stage $i$ with one fewer trips back). So $$\frac{\partial b_i}{\partial x_{i-1}} = 2\left\lceil \frac {b_{i-1}}{1000}\right\rceil.$$ Also for $j \ge i, \frac{\partial b_i}{\partial x_j} = 0$ since $b_i$ is determined by the distances travelled before the $i$-th stage, not after. For $j < i -1, \frac{\partial b_i}{\partial x_j} = \frac{\partial b_{i-1}}{\partial x_j}$, except when $b_{i-1}$ is an exact multiple of $1000$, where the partial derivative is undefined. This is because $\left\lceil \frac {b_{i-1}}{1000}\right\rceil$ is constant away from such points.
The number of bananas available to sell at market is $b_{k+1}$. From the relations above, it follows that for all $j \le k,$ $$\frac{\partial b_{k+1}}{x_j} = 2\left\lceil \frac {b_j}{1000}\right\rceil$$
except where the $b_j$ are multiples of $1000$, where it is undefined. At the maximum of $b_{k+1}$, if any such derivative is defined, it has to be $0$. But the only way for it to be $o$ is for $b_j = 0$, which means the camel has failed to arrive with bananas. So the maximum must occur where the derivatives are undefined. I.e., where all the $b_j$ (except the arrival value $b_{k+1}$) are multiples of $1000$.
So for the maximal strategy, either the first stage ends with $1000$ bananas or with $2000$ bananas. We've already seen that $2000$ bananas leads to $533$ bananas to sell. If we allow the camel to eat $2000$ bananas that first stage, we can go $\frac{2000}5 = 400$ km the first stage, and will have $600$ km to go on future stages, leaving only $400$ bananas to sell. So the optimal strategy leaves $2000$ bananas after the first stage and $1000$ after the second stage.