Matching two lists

151 Views Asked by At

I have a fairly straightforward question to which I do not have the answer, even if it somehow seem simple in my head.

Take two list of unique records, say it's two lists of National insurance numbers.

List A holds 1000 records

List B holds 20000 records

I need to know if the IDs in list A exist in list B.

Is it more efficient to

  1. go through each item of list A and look through list B to find a match or
  2. go through each 20 000 of list B and try to find a match

By efficient, I mean that takes the less amount of back and forth between both lists.

Now in real life, I would logically say 1, although I am not good enough in maths to prove it.

But it get a bit more complicated because, I am actually building a computer program that will have to do this operation. The computer will always look through the whole list every time it tries to match a record. So basically, it will try to match 1 000 items 20 000 times or 20 000 items 1 000 times - I have the feeling that it's actually a bit more complicated than that :) - Can someone calculate if there is a more efficient ways between solutions 1 and 2?

Please note that this question is not related on how to build the program. As the difference between the two solution is in milliseconds and is not important. I am just very curious of the Maths behind the logic.

1

There are 1 best solutions below

1
On

In both cases, you will have to make an equal number of comparisons.

To make it simpler, let's make list A of size 5 and list B of size 10.

A { !, @, #, $, % }

B { !, $, &, #, *, (, @, %, ~, ? }

Let's apply the first technique of iterating through list A.

For element 1 - 1 back and forth (comparing lists from left to right)

For element 2 - 6 back and forths (don't need to look at ! because the list is unique)

For element 3 - 3 back and forths

For element 4 - 1 back and forth

For element 5 - 4 back and forths

Total: 15 back and forths

B {!, $, &, #, *, (, @, %, ~, ?}

A {!, @, #, $, %}

Let's apply the second technique of iterating through list B.

For element 1 - 1 back and forth

For element 2 - 3 back and forths

For element 3 - 3 back and forths - not found

For element 4 - 2 back and forths

For element 5 - 2 back and forths - not found

For element 6 - 2 back and forths - not found

For element 7 - 1 back and forth

For element 8 - 1 back and forth

For element 9 - 0 back and forth (it cannot be in the unique list)

For element 10 - 0 back and forth (it cannot be in the unique list)

Total: 15 back and forths

The order through which you progress through both lists is also identical:

{!, dollar sign, &, #, *, (, @, dollar sign, &, #, $, &, *, (, %} 15 elements