Minimizing a function (error rate of a Bloom filter)

67 Views Asked by At

In a set of lectures I am watching, the following quantity emerges after some analysis of a particular system:

$$ \epsilon=(1-e^{-x/a})^x $$

We wish this quantity to be minimized. That is, we wish to find the relationship between $x$ and $a$ that minimizes $\epsilon$. The lectures claim that this happens when $x\approx a\ln(2)$.

I can verify this numerically, but I cannot see how it is derived anlytically.

In fact, my derivation seems to suggest that a minimum should not exist. I am differentiating $\epsilon$ wrt $a$ and setting to $0$, which yields:

$$ \frac{d\epsilon}{da}= \frac{e^{-x/a}(1-e^{-x/a})^{x-1} x^2}{a^2}=0 $$

I cannot see how this can have a solution other than $a\rightarrow 0$ or $x \rightarrow \infty$.

2

There are 2 best solutions below

0
On BEST ANSWER

I will assume that $a > 0$. We substitute $u = 1 - e^{-x/a}$ and make the following observations:

  1. If $x$ ranges over $(0, \infty)$, then $u$ ranges over $(0, 1)$.

  2. $x = -a \log(1-u)$, and so, we have

    $$ (1 - e^{-x/a})^x = \exp\{ -a \log(u)\log(1-u)\}. $$

  3. The function $u \mapsto -\log(u)\log(1-u)$ is strictly convex on $(0,1)$.1) Since it is symmetric around $u = 1/2$, it achieves the global minimul at $u = 1/2$ with the value $-\log^2 2$.

Therefore

$$ (1 - e^{-x/a})^x \geq e^{-a \log^2 2} $$

and the equality holds exactly when $1 - e^{-x/a} = \frac{1}{2}$, or equivalently, $x = a \log 2$.


1) One of the quickest way to prove the convexity of $u \mapsto -\log(u)\log(1-u)$ is probably to utilize the identity

$$ -\log(u)\log(1-u) = \int_{0}^{1} \frac{\log(1-x) - \log(1-ux) - \log(1-(1-u)x)}{x} \, \mathrm{d}x. $$

The integrand is already a convex function in $u$ for each fixed $x$, and so, its integral with respect to $x$ is also convex in $u$.

0
On

Consider$$y=\epsilon ^{\frac{1}{a}}=\left(1-e^{-\frac{x}{a}}\right)^{\frac{x}{a}}$$ Let $x= at$ to make $$y(t)=\left(1-e^{-t}\right)^t\implies \frac{dy(t)}{dt}=\left(1-e^{-t}\right)^t \left(\frac{t \,e^{-t} }{1-e^{-t}}+\log \left(1-e^{-t}\right)\right)$$ So, we look for the zero of function $$f(t)=\frac{t \,e^{-t} }{1-e^{-t}}+\log\left(1-e^{-t}\right)$$ the solution of which being exactly $t=\log(2)$ that is to say $x=a\log(2)$ (not $\approx$).

For this value $y=2^{-\log (2)}$ and $\epsilon=2^{-a \log (2)}$.

At this point, the second derivative test $$\frac{d^2y(t)}{dt^2}=2^{1-\log (2)} (1-\log (2)) >0$$ confirms that this is a minimum.