I came across this problem yesterday and I was struggling understanding the solution. I'll give my version for the proof of correctness of the algorithm (leetcode lists such a problem as a greedy algorithm). I've seen few proofs around but none of them really satisfies me so I gave it a go between last night and this morning and I am sharing what the outcome is (hopefully correct).
Statement of the problem:
You are given an integer array height of length n. There are n vertical lines drawn such that the two endpoints of the ith line are (i, 0) and (i, height[i]). Find two lines that together with the x-axis form a container, such that the container contains the most water. Return the maximum amount of water a container can store. Notice that you may not slant the container.
The question is whether or not what I derived is correct.
The solution been given, in code, is the following:
int maxArea(vector<int>& h) {
int i=0,j=h.size()-1;
int area=0;
while(i<j){
area=max(area,min(h[i],h[j])*(j-i));
if(h[i]<h[j])
i++;
else
j--;
}
return area;
}
Which can be rewritten in a recursive fashion as follows
int maxArea(vector<int>& h, int i, int j) {
if(i >= j)
{
return 0;
}
else if(h[i] < h[j])
{
return max(h[i]*(j-i),maxArea(h,i+1,j));
}
else
{
return max(h[j]*(j-i),maxArea(h,i,j-1));
}
}
int maxArea(vector<int>& h) {
int n = h.size();
return maxArea(h,0,n-1);
}
If we analyse the recursion defined by
$$ V(i,j) = \begin{cases} 0 & i \geq j \\ \max \left\{ V(i+1,j), h_i(j-i) \right\} & h_i < h_j \\ \max \left\{ V(i,j-1), h_j(j-i) \right\} & \text{otherwise} \end{cases} $$
And we show that for $0 \leq i < j \leq n - 1$ and $a_{ij} = \min \left\{ h_i, h_j\right\}(j - i)$ we have
$$ V(i,j) = \max_{i \leq k < l \leq j} a_{kl} $$
we're done. I did this by induction on $j - i = n$, if $n = 1$ we have $i = 0, j = 1$ and $$ V(0,1) = \begin{cases} \max \left\{ V(1,1), h_0 \right\} & h_0 < h_1 \\ \max \left\{ V(0,0), h_1 \right\} & \text{otherwise} \end{cases} = \begin{cases} \max \left\{ 0, h_0 \right\} & h_0 < h_1 \\ \max \left\{ 0, h_1 \right\} & \text{otherwise} \end{cases} = \\ \begin{cases} h_0 & h_0 < h_1 \\ h_1 & \text{otherwise} \end{cases} = \min \left\{ h_0, h_1 \right\} = a_{01} $$
Suppose now that the statement holds if $j - i = n - 1$, we want to show it holds for $j - i = n$. I need to consider two cases $h_i < h_j$ and $h_i \geq h_j$ however since the proof is very similar I'll just prove the former.
The recursion in this case is simplified as
$$ V(i,j) = \max \left\{ V(i+1,j), h_i(j-i) \right\} $$
Observe now since $j - i = n$ by hypothesis we have $j - (i + 1) = n - 1$ so the inductive hypothesis holds and we can rewrite
$$ V(i,j) = \max \left\{ \max_{i+1 \leq k < l \leq j} a_{kl}, h_i(j-i) \right\} $$
Observe now that since for $i < l \leq j$ we have (just by definition of $\min$) $$ \min \left\{h_i, h_l\right\} \leq h_i \Rightarrow \min \left\{h_i, h_l\right\}(l - i) \leq h_i(l - i) \Rightarrow \min \left\{h_i, h_l\right\}(l - i) \leq h_i(j - i) $$
However since $h_i < h_j$ we specifically have $$\min \left\{h_i,h_j\right\}(j - i) = h_i(j-i)$$.
$$ \max_{i < l \leq j} \min \left\{h_i,h_l \right\}(l - i) = \min \left\{h_i,h_j \right\} (j - i) \Rightarrow \max_{i < l \leq j} a_{il} = a_{ij} $$.
I can therefore re-write $V(i,j)$ as
$$ V(i,j) = \max \left\{ \max_{i+1 \leq k < l \leq j} a_{kl}, h_i(j-i) \right\} = \max \left\{ \max_{i+1 \leq k < l \leq j} a_{kl}, \max_{i < l \leq j} a_{il} \right\} = \max_{i \leq k < l \leq j} a_{kl} $$
A final observation is the the branch in the if statement of my recursion, which means I don't have to solve all the subproblems. The solution of the final problem is given by calculating $V(0,n-1)$, which is a $O(n)$ time solution now, more specifically $\Theta(n)$.
Consider 2 sequences:
$b$ is defined symmetrically:
Note that $k \mapsto h[a_{k}]$ is strictly increasing, so is $k \mapsto h[b_{k}]$.
Any bucket whose left edge isn't in $a$ can be made larger by moving the left edge left.
Any bucket whose right edge isn't in $b$ can be made larger by moving the right edge right.
So only buckets whose left edge is in $a$ and whose right edge is in $b$ are worth consideration.
If the $h[a_s]$-edge is shorter than the $h[b_t]$ edge, then the area is just $h[a_s] \times (b_t - a_s)$. So you want to compare it with the furthest right $b$-edge s.t. $h[a_s] \le h[b_t]$, which is easy because both sequences are strictly increasing.
When $h[b_t]$ is the shorter edge the reasoning is symmetric.
The fact that the algorithm also compares non $a$-$b$ edges is just vacuous, they'll never proc the
area=max(area,...)line.