Calculation of $\mathbb{E}(1(X_i = X_j))$ and its estimation


My question is the following. Suppose $X_1, X_2, \ldots$ are iid draws from an exponential distribution with rate $\lambda$. Then $$ \mathbb{E}(1(X_i = X_j)) = \int_0^\infty \int_0^\infty 1(u = q) \lambda e ^{-\lambda u} \lambda e^{-\lambda q} \,du\, dq = \int_0^\infty \lambda^2 e^{-2 \lambda u}\, du = \frac{\lambda}{2} $$ However, my intuition says that the probability of drawing two identical numbers from a continuous distribution should be zero.

To check this I am trying to estimate this expectation. If I am not mistaken, an unbiased estimator would be the following U-statistic $$ U_n = \binom{n}{2}^{-1} \sum_c 1(x_i = x_j) $$ where the sum runs over the set $c$ of all combinations $\{i,j\}$ with $i \neq j$ from $\{1,\ldots,n\}$. However, when I compute this estimate, the result is always zero regardless of sample size.

Is there a mistake in the computation of the expectation? Is this expectation $0$ or $\frac{\lambda}{2}$? If it is the latter, how do I estimate it? In case it helps, here is the code I use to compute the estimate.

N = 1000;             % Sample size
U = rand(N,1);        % Generate sample from uniform distribution
lambda = 1;           % Rate of the exponential distribution
X = -log(U)/lambda;   % Generate sample from exponential distribution

% Compute the estimate: count pairs i < j with X(i) == X(j)
y = 0;
for i = 1:N-1
    for j = i+1:N
        y = y + (X(i) == X(j));
    end
end
U_n = (2/(N*(N-1)))*y;

Best answer

I think the mistake is in this computation:

$$ \mathbb{E}(1(X_i = X_j)) = \int_0^\infty \int_0^\infty 1(u = q) \lambda e ^{-\lambda u} \lambda e^{-\lambda q} \,du\, dq = \int_0^\infty \lambda^2 e^{-2 \lambda u}\, du = \frac{\lambda}{2} $$

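When $u=q$, replacing $q$ by $u$ in the integrand $\lambda e ^{-\lambda u} \lambda e^{-\lambda q}$ is fine. I think the mistake happens when the condition $u=q$ is applied to the region of integration

$$\int_{u\in (0,\infty)}\int_{q\in (0,\infty)}$$

which should become

$$\int_{u\in (0,\infty)}\int_{q\in (u,u)}$$

because the set of points satisfying

$$u\in (0,\infty) \hspace{.8cm} q\in (0,\infty) \hspace{.8cm} u=q$$

is a line, and a line has measure zero in the plane. The indicator $1(u=q)$ restricts the integral to this set; it does not collapse the double integral into a single one.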

We know that

$\int_{q\in (u,u)}f \,dq=0$, since

$\int_{A}f \,dq=0$ whenever $m(A)=0$ (the integral over a set of measure zero is zero), so

$$ \mathbb{E}(1(X_i = X_j)) = \int_0^\infty \int_0^\infty 1(u = q) \lambda e ^{-\lambda u} \lambda e^{-\lambda q} \,du\, dq $$

$$=\int_{u\in (0,\infty)}\int_{q\in (u,u)} \lambda e ^{-\lambda u} \lambda e^{-\lambda q} \,dq\, du=0$$
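
The same conclusion can be sketched by conditioning on one of the draws. This is only an outline, and it assumes nothing beyond $X_i$ and $X_j$ being independent with density $\lambda e^{-\lambda u}$ on $(0,\infty)$:

$$ \mathbb{E}(1(X_i = X_j)) = \mathbb{P}(X_i = X_j) = \int_0^\infty \mathbb{P}(X_j = u)\, \lambda e^{-\lambda u}\, du = \int_0^\infty 0 \cdot \lambda e^{-\lambda u}\, du = 0 $$

since $\mathbb{P}(X_j = u) = 0$ for every fixed $u$ when $X_j$ has a density. This agrees with $U_n$ coming out as zero in the simulation. The value $\int_0^\infty \lambda^2 e^{-2\lambda u}\, du = \frac{\lambda}{2}$ is what results if $1(u=q)$ is treated as a Dirac delta $\delta(u-q)$. That integral is the density of $X_i - X_j$ at zero, which is not a probability.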