I have a fairly straightforward question to which I do not have the answer, even if it somehow seem simple in my head.
Take two list of unique records, say it's two lists of National insurance numbers.
List A holds 1000 records
List B holds 20000 records
I need to know if the IDs in list A exist in list B.
Is it more efficient to
- go through each item of list A and look through list B to find a match or
- go through each 20 000 of list B and try to find a match
By efficient, I mean that takes the less amount of back and forth between both lists.
Now in real life, I would logically say 1, although I am not good enough in maths to prove it.
But it get a bit more complicated because, I am actually building a computer program that will have to do this operation. The computer will always look through the whole list every time it tries to match a record. So basically, it will try to match 1 000 items 20 000 times or 20 000 items 1 000 times - I have the feeling that it's actually a bit more complicated than that :) - Can someone calculate if there is a more efficient ways between solutions 1 and 2?
Please note that this question is not related on how to build the program. As the difference between the two solution is in milliseconds and is not important. I am just very curious of the Maths behind the logic.
In both cases, you will have to make an equal number of comparisons.
To make it simpler, let's make list A of size 5 and list B of size 10.
A { !, @, #, $, % }
B { !, $, &, #, *, (, @, %, ~, ? }
Let's apply the first technique of iterating through list A.
For element 1 - 1 back and forth (comparing lists from left to right)
For element 2 - 6 back and forths (don't need to look at ! because the list is unique)
For element 3 - 3 back and forths
For element 4 - 1 back and forth
For element 5 - 4 back and forths
Total: 15 back and forths
B {!, $, &, #, *, (, @, %, ~, ?}
A {!, @, #, $, %}
Let's apply the second technique of iterating through list B.
For element 1 - 1 back and forth
For element 2 - 3 back and forths
For element 3 - 3 back and forths - not found
For element 4 - 2 back and forths
For element 5 - 2 back and forths - not found
For element 6 - 2 back and forths - not found
For element 7 - 1 back and forth
For element 8 - 1 back and forth
For element 9 - 0 back and forth (it cannot be in the unique list)
For element 10 - 0 back and forth (it cannot be in the unique list)
Total: 15 back and forths
The order through which you progress through both lists is also identical:
{!, dollar sign, &, #, *, (, @, dollar sign, &, #, $, &, *, (, %} 15 elements