What is the most efficient way to find the most separated pair in a finite sequence such that the first is greater than the second?

119 Views Asked by At

Given some finite sequence $a_n$ for $n \in \{1..N\}$, is there a more efficient method to solve (P1) than the O($N^2$) brute force method?

($P1$) maximise $n_2-n_1$ such that $a_{n_1}>a_{n_2}$

3

There are 3 best solutions below

3
On BEST ANSWER

This is $O(N)$.

If $a_n$ is greater than all the elements before it, call it a peak; and if $a_n$ is less than all elements after it, call it a trough.

Then if $(a_i,a_j)$ is a maximally separated pair satisfying $a_i>a_j$, we know that $a_i$ must be a peak (otherwise there would be an earlier element $a_k$ such that $a_k>a_j$); and similarly, $a_j$ must be a trough.

So here is the algorithm:

  1. Scan forwards through the sequence, marking all peaks ($N$ steps).
  2. Scan backwards through the sequence, marking all troughs ($N$ steps).
  3. Set $p = 1$ (this is the peak index). Note that $a_1$ is a peak by definition.
  4. Set $t$ (the trough index) to be the index of the last trough such that $a_t < a_p$.
  5. Now $t-p$ is a candidate for the maximal separation; if it is the best yet, remember it.
  6. If we have scanned all peaks, we are done. Otherwise, set $p$ to be the index of the next peak and go to Step 4.

Note that we can start the scan in Step 4 from the value of $t$ in the previous iteration, so the total number of operations used by that step in the whole algorithm is $O(N)$.

Edited to add: In fact we can omit Step 1, and just scan for the next peak in Step 6. (We can't omit Step 2, because scanning for troughs has to be done backwards.)

0
On

Alright, I took a crack at it myself and came up with the following approach which seems to be $O\big( (N/M)^2 + N \log(N) \big)$, where $M$ is the optimal value of (P1).

Loop the following over $k$ from $1,2,\dots$:

  • Partition the sequence into $2^k$ contiguous sequences of size at most $\lceil N 2^{-k} \rceil $.
  • Find each group's local minima and maxima, saving the indices as well.
  • Perform this brute-force search among the local maxima / minima to find the best $n_1$ and $n_2$ belonging to different groups
  • if $n_2-n_1 > N 2^{-k}$, break out of this loop and return this pair, since no better pair may be found from within the same group from further subdivision

Each $k$th iteration requires: $O\big( N+ 4^k\big)$ operations, so summing over all necessary $k$ gives the total complexity $O\big((N/M)^2 + N \log(N) \big)$, where the former term dominates for small $M$ and the latter terms dominates for $M \gtrsim \sqrt{N}$.

2
On

I believe you can do this in $O(N \log N)$, but there may be a faster way. Here is a high level description of the algorithm:

  1. Store the elements as tuples of $(value, index)$ where $value$ is the value of the element and index is its position in the original array. So for ex. if your input was [3, 1, 4, 2], you would store it as [(3,0), (1,1), (4,2), (2,3)]. This will be $O(N)$
  2. Sort this new array by the first element (value field) in the tuple in decreasing order. Thus previous example would now look like: [(4, 2), (3, 0), (2, 3), (1, 1)]. This can be done in $O(N\log N)$
  3. Find the maximum difference between second elements (index fields) of the tuples such that the larger one comes after the smaller one in the new array. This can be done in $O(N)$.

Total complexity = $O(N \log N)$.

I think steps $1$ and $2$ are pretty self explanatory. If you are curious about how step $3$ can be done in $O(N)$, here is some code:

def max_diff_indexes(arr):
    """
    @params:

        arr - This is just an array of the SECOND parts of the tuple in 
        the algorithm described above. That is, arr contains only ints 
        representing the second values of the tuple (index field) after 
        the sorting in step 2. 

    @return:

        max_pair - Returns a pair of indexes n_1 and n_2 that maximise 
        n_2 - n_1 such that a_{n_1} > a_{n_2}. The first property is 
        ensured by this algorithm while the second property in ensured
        by the fact that the array was initally sorted in decresing order
        of value

    """
    if not arr: return None
    min_val = arr[0]
    max_diff = 0
    for elem in arr:
        max_diff = max(elem - min_val, max_diff)
        min_val = min(elem, min_val)

    max_val = max_diff - min_val
    return min_val, max_val