It has been years since I last studied Probability, please feel free to correct any mistakes in my approach.
What is the average number of distinct numbers in N random non-negative numbers less than a thousand?
Given integers from 0 to 999 (inclusive). Choose N integers such that all integers are equally likely to be selected. How many distinct integers do you have in N on average?
[Edit:
I changed the following question (from a textbook about algorithms) into the one I asked above.
What is the average number of distinct keys that FrequencyCounter will find among N random nonnegative integers less than 1,000, for N=10, 10^2 , 10^3 , 10^4, 10^5, and 10^6 ?
My understanding of the question is from 1...999 you select N numbers randomly, that is in no specific way. I think we can assume selecting each number is equally likely. Repetitions are allowed. Now do this for infinite times. Take a note of all results for each experiment somehow and assume all those results are equally likely. Now pick one of the results. How many distinct numbers are that in that result on average?
]
Here is my attempt:
I can choose 1000^N different lists of numbers, that is any number may repeat with any frequency in a list of size N.
If I were to choose sets (distinct numbers in the list) of size N, then the number of different sets should be P(1000, N) (permutations). However in the context of the question order of the elements should not be significant, so the correct number should be C(1000, N) (combinations).
The probability of me choosing all distinct numbers should be Px = C(1000, N)/(1000^N).
Now, how do I get to calculate an average number? Is it 1000 * Px? Somehow this doesn't sound right.
By the way, is my understanding of average number correct? I mean, I do this experiment infinite times and the number of distinct numbers in that set converges to a number which we call average? Though I am guessing that number should depend on the distribution of distinct numbers and calculated accordingly.
Not a complete answer, but a rephrasing of the question and some remarks related to your questions.
I think your question can be reduced to the following.
Consider the set $U$ of all sequences $$ u_1, u_2, \ldots, u_n, $$ where each $u_1$ is an integer between $1$ and $1000$. $U$ can be made into a probability space by assigning equal probability, $\frac{1}{1000^n}$ to each sequence in $U$.
For a typical sequence $u$, the number of distinct items in the sequence $u$ is some number between $1$ and $n$; let's call that $f(u)$. Then $f$ is a function from $U$ to the set $\{1, \ldots n\}$. It has an "average value", which is $$ Av(f) = \sum_{u \in U} f(u) p(u) $$ where $p(u) = \frac{1}{1000^n}$ is the probability associated to $u$. Because this is a constant, it can be pulled out of the sum to get $$ Av(f) = \frac{1}{1000^n} \sum_{u \in U} f(u), $$ which reflects your idea of "count up distinct items in each list and average them".
The thing you computed in your description is rather different. There's a different function $g : U \to \{0, 1\}$ which assigns $1$ to $u$ if all items in the sequence are distinct, and $0$ otherwise. And you've figured out that the average value for $g$ is $C(1000, n) / 1000^n$. Unfortunately, that average value doesn't tell you much of anything about the average value of $f$.
Actually, it does tell you something. For the sum $$ \sum_{u \in U} f(u) $$ over all sequences can be estimated very crudely. Let's let $S \subset U$ consist of exactly those sequences whose entries are all different. Then $$ \sum_{u \in U} f(u) \ge \sum_{u \in S} f(u) $$ And for each $u \in S$, we know what $f(u)$ is: it's just $n$. So we can say \begin{align} \sum_{u \in U} f(u) &\ge \sum_{u \in S} f(u) \\ &= \sum_{u \in S} n \\ &= n \sum_{u \in S} 1 \end{align} which is just $n$ times the number of elements in $S$, which you've already determined is $C(1000, n)$. So we know that $$ Av(f) \ge \frac{1}{1000^n} \left(n \cdot C(1000,n) \right). $$ But that is, of course, just a crude estimate. So while your approach was partly correct, and told you something it doesn't really get you where you need to go.
To actually get the answer you need, you somehow need to sum up $f(u)$ for all $u \in U$. And to do that, @lulu's idea of linearity of expectation is probably the right road to go down, but I leave that to someone else to write out. I just wanted to (1) try to phrase the question unambiguously, and (2) try to address the question of how to get from the work you'd done to the answer you needed (short answer: you really can't, but you can get a weak estimate).