Estimate of a sample size of a search algorithm

38 Views Asked by At

I am trying to estimate the order of the sample size of a probabilistic search algorithm. The problem is the following: I have a counting problem where an oracle can identify the right solution, i.e. returns $1$ if the sample is a solution to the problem and $0$ otherwise. We assume there are $N$ items (search space) to check and $M$ are correct (the solution to our search problem).
During the procedure the algorithm can draw only $k$ samples from the search space which will lead to a series of oracle outputs $X_1, \ldots, X_k$ (hence a sequence of $1$s and $0$s). The algorithm then returns the estimate $S = N \cdot \sum_j X_j /k $
I now want to find the standard deviation of this algorithm in order to find the the number of samples for which the probability of estimatimating $M$ within an error of $\sqrt{M}$ is at least $3/4$.

Can anyone help me how to calculate it? I started with $(\Delta S)^2 = \sum_i \left( \left<S \right> - S_i \right)$ where $i$ is the number of all possible samples (i.e. $N$ over $k$ i suppose). But when plugging in $S$ i didn't know how to proceed. I know the right solution of the std has to be $\Delta S = \sqrt{M(N-M)/k}$ but i can't see neither how to get there nor how to proceed.

1

There are 1 best solutions below

3
On BEST ANSWER

The variance, which is the square of the standard deviation, is $\sigma^2=\overline{S^2}-(\overline{S})^2$.
The expected value of $S$ is $\overline S=\sum_i\frac NkE(X_i)=\sum_i\frac Nk\frac MN$.
The expected value of $S^2$ is $$\overline{S^2}=\frac{N^2}{k^2}\left(kE(X_i^2)+k(k-1)E_{i\neq j}(X_iX_j)\right)$$

That is because there are $k^2$ terms in $S^2$, $k$ of these terms involve the square of the same $X_i$, and the other $k(k-1)$ involve a pair of different $X_i$.

What is the average value of $X_i^2$? What is the average value of $X_iX_j$ when we know they are different samples? Hint: Count the possibilities.