Problem with my simple algorithm to count repetitions

91 Views Asked by At

We have two arrays $A,B$ with sizes $n,m$ respectively. We know that $m \geq n$. We also know that no array contains the same number twice.

Propose an algorithm that prints how many numbers appear in only one of the arrays.

Example:

if $A=[7,-1,8,2]$ and $B=[13,0,2,9,6,10,7]$ then the algorithm will print $7$ because the numbers $-1,0,6,8,9,10,13$ appear in only one of the arrays, and not both.

The runtime of the algorithm must be $\Theta(m\log n)$

What I did:

My algorithm works as follow:

Step 1: Sort $A$. That will take $n\log n$ steps.

Step 2: For each element in $B$, check if that element is in $A$ using binary search. It will take $\log n$ steps for each element, since there are $m$ elements in $B$ it will take overall $m \log n$ steps.

Overall that took us $m\log n+n\log n \leq (m+1)\log n =\Theta(m\log n)$ steps. as required.

The problem is, this only counts the elements of $B$ that are not in $A$. However there could also be elements of $A$ that are not in $B$.

If we do the same thing except we sort $B$ now, it will take $m\log m+n\log m=\Theta(m\log m)$ which is not the same as $\Theta(m\log n)$.

Is there any way to fix this to also count elements of $A$?

1

There are 1 best solutions below

1
On BEST ANSWER

If $A$ has $n$ elements, $B$ has $m$ elements, $p$ are just in $B$, then $m-p$ are in both; and $n-(m-p)$ are just in $A$.