Compute variance of estimator of logarithm of random variable

I want to find the variance of an estimator, and I would like to be as formal as possible. Could you please check my attempt for mistakes and help me formalize the proof?

Suppose we have a symmetric positive definite matrix $X \in \mathbb{R}^{n \times n}$ with singular value decomposition $U\Sigma V^T$ (since $X$ is symmetric positive definite, its singular values coincide with its eigenvalues).

The diagonal matrix $\Sigma$ contains the $n$ singular values $\sigma_i \in [\delta,1]$, for some small $\delta \in (0,1/2)$. For $\epsilon>0$, we want an estimate $\overline{\mu}$ of

$$\mu = \sum_{i=1}^n \log \sigma_i$$

such that

$$ |\mu - \overline{\mu} | \leq \epsilon$$

The estimator we use is the following. We sample uniformly (with replacement) $m$ of the diagonal entries $\sigma_i$ of $\Sigma$. The number of samples $m$ is chosen according to a Hoeffding bound: since $\log \sigma_i \in [\log\delta, 0]$, we require $$ \Pr\left[\left| \frac{1}{m}\sum_{i=1}^m \log \sigma_i - \mathbb{E}[\log \sigma_i]\right| > \epsilon \right] \leq e^{-\frac{2m\epsilon^2}{(\log\delta)^2}} $$

So, treating $\delta$ as a constant, we pick $m=O(\frac{1}{\epsilon^2})$.
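For concreteness, here is a minimal sketch (the function name and parameter values are my own, not from the question) of how one might turn such a Hoeffding bound into a sample size, assuming $\log\sigma_i$ ranges over $[\log\delta, 0]$ so that the squared range is $(\log\delta)^2$:

```python
import math

def hoeffding_samples(eps, delta, failure_prob):
    """Smallest m with exp(-2*m*eps**2 / log(delta)**2) <= failure_prob.

    log(delta)**2 is the squared range of log(sigma_i), which lies in
    [log(delta), 0] when sigma_i is in [delta, 1].
    """
    return math.ceil(math.log(delta) ** 2 * math.log(1.0 / failure_prob)
                     / (2.0 * eps ** 2))

m = hoeffding_samples(eps=0.1, delta=0.1, failure_prob=0.01)
```

Note that $m$ grows like $\log^2(1/\delta)/\epsilon^2$; the $O(1/\epsilon^2)$ statement treats $\delta$ as a constant.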

Our estimate is then: $$\overline{\mu}=\frac{n}{m}\sum_{i=1}^m \log \sigma_i$$
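As a sanity check, the estimator itself is easy to simulate. A minimal Python sketch (the function name and the toy instance are my own, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_det_estimate(sigmas, m, rng):
    """n * (1/m) * sum of log(sigma_i) over m uniformly sampled entries."""
    n = len(sigmas)
    sample = rng.choice(sigmas, size=m, replace=True)  # uniform over the diagonal
    return n * np.mean(np.log(sample))

# toy instance: n = 1000 singular values drawn from [delta, 1]
delta = 0.1
sigmas = rng.uniform(delta, 1.0, size=1000)
mu = np.log(sigmas).sum()                          # exact value of mu
mu_bar = log_det_estimate(sigmas, m=500, rng=rng)  # the estimate
```

With $m = 500$ samples out of $n = 1000$ values, `mu_bar` lands within a few tens of units of `mu`, consistent with the concentration bound above.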

What is the variance of our estimator $\overline\mu$? This is my attempt:

We want to study the variance of the estimator, that is, $Var(\overline{\mu}) = \mathbb{E}[(\overline{\mu}-\mathbb{E}[\overline{\mu}])^2]$. We can use the Bienaymé formula, which tells us that for independent random variables $X_i$ we have $Var(\sum_i X_i) = \sum_i Var(X_i)$.

So we have reduced the problem of studying the variance of the estimator to studying the variance of a single sample $\log \sigma_i$.

Let's first observe the interval in which the estimator $\overline{\mu}$ lies with high probability. Multiplying the deviation bound $\epsilon$ by $n$, we get:

$$[n \mathbb{E}[\log \sigma_i]-\epsilon n,\; n \mathbb{E}[\log \sigma_i]+\epsilon n]$$

Applying the Bienaymé formula we have that

$$Var(\overline \mu) = Var\left(\frac{n}{m}\sum_{i=1}^m \log \sigma_i \right) = \frac{n^2}{m^2}\sum_{i=1}^m Var(\log \sigma_i) = \frac{n^2\,Var(\log \sigma_i)}{m}$$

Since we sample the indices $i$ uniformly, let us further assume that $\log \sigma_i$ is uniformly distributed on $[\log \delta, 0]$ (note this is an extra assumption: uniform sampling of indices does not by itself make $\log \sigma_i$ uniform on that interval). The variance of the uniform distribution on an interval $[a,b]$ is $\frac{1}{12}(b-a)^2$, and thus the variance of $\log \sigma_i$ is $\frac{1}{12}(\log \delta)^2$.

Putting this together, we get:

$$Var(\overline{\mu}) = \frac{(n\log \delta)^2}{12m}$$
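Under the strong assumption that $\log\sigma_i$ is uniform on $[\log\delta, 0]$, this closed form is easy to verify numerically. A quick Monte Carlo sketch (all parameter values are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, delta = 500, 50, 0.1
trials = 100_000

# Draw log(sigma_i) ~ Uniform[log(delta), 0], the assumption behind the formula
log_sigmas = rng.uniform(np.log(delta), 0.0, size=(trials, m))
mu_bar = n * log_sigmas.mean(axis=1)  # one estimate of mu per trial

empirical = mu_bar.var()
predicted = (n * np.log(delta)) ** 2 / (12 * m)
# empirical and predicted agree to within a few percent at 100k trials
```

With $10^5$ trials the sample variance of `mu_bar` matches $(n\log\delta)^2/(12m)$ to well under $5\%$ relative error.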

Is this correct?