F-measure in Clustering


We can define the F-measure as follows:

$$F_\alpha=\frac{1}{\alpha \frac{1}{P}+(1-\alpha)\frac{1}{R}} $$

Now we might be interested in choosing a good $\alpha$. In the article "The truth of the F-measure" the author states that one can impose the conditions:

$$\beta=\frac R P, \text{ where } \frac{\partial F_{\alpha}}{\partial P} = \frac{\partial F_\alpha}{\partial R}$$

and then we obtain $\alpha=1/(\beta^2+1)$ and

$$F_\beta=\frac{(1+\beta^2)PR}{\beta^2 P+R} $$
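As a quick sanity check (a small sympy sketch, not part of the original article), one can verify symbolically that substituting $\alpha = 1/(\beta^2+1)$ into $F_\alpha$ yields exactly the closed form $F_\beta$:

```python
import sympy as sp

P, R, beta = sp.symbols('P R beta', positive=True)

# Substitute alpha = 1/(beta^2 + 1) into the weighted-harmonic-mean form
alpha = 1 / (beta**2 + 1)
F_alpha = 1 / (alpha / P + (1 - alpha) / R)

# Closed form F_beta stated above
F_beta = (1 + beta**2) * P * R / (beta**2 * P + R)

# The difference simplifies to zero, so the two forms agree
assert sp.simplify(F_alpha - F_beta) == 0
```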

It is said that "The motivation behind this condition is that at the point where the gradients of $E$ w.r.t. $P$ and $R$ are equal, the ratio of $R$ against $P$ should be a desired ratio $\beta$." I understand that the condition will guarantee that the user is willing to trade an increment in precision for an equal loss in recall. But I do not get why the equality of the two partial derivatives corresponds to this hypothesis. I would rather have expected the condition that one partial derivative equals the other multiplied by minus one. Could anyone explain why the desired condition (the condition in words) corresponds to this equality (the condition in math terms)?

EDIT:

Well, we could do the following:

$$\partial F=\frac{\partial F_{\alpha}}{\partial P}\partial P+\frac{\partial F_\alpha}{\partial R}\partial R.$$

And since we want $\partial F=0$ whenever $\partial P=-\partial R$, we easily obtain the condition.

But I have one problem with this: since $\left. \frac{\partial F_\alpha}{\partial P} \right/ \frac{\partial F_\alpha}{\partial R}=1$, and since the gradient is perpendicular to each level curve (https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/2.-partial-derivatives/part-b-chain-rule-gradient-and-directional-derivatives/session-36-proof/MIT18_02SC_pb_32_comb.pdf), the level curve should have slope $m=-1$. Nonetheless, when I compute the level curve for some constant $c$ I get

$$R(P)=\frac{c(1-\alpha ) P}{P-c\alpha},$$

which is clearly not a linear function with slope $m=-1$. What am I missing?
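The level-curve computation itself can be double-checked symbolically (a small sympy sketch, using only the definition of $F_\alpha$ above) by solving $F_\alpha = c$ for $R$:

```python
import sympy as sp

P, R, c, alpha = sp.symbols('P R c alpha')

# F_alpha as defined in the question
F = 1 / (alpha / P + (1 - alpha) / R)

# Solve F_alpha(P, R) = c for R to obtain the level curve R(P)
sol = sp.solve(sp.Eq(F, c), R)

# It matches the expression in the question: R(P) = c(1-alpha)P / (P - c*alpha)
expected = c * (1 - alpha) * P / (P - c * alpha)
assert any(sp.simplify(s - expected) == 0 for s in sol)
```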



I will stick to this understanding of yours, because it seems very reasonable:

I understand that the condition will guarantee that the user is willing to trade an increment in precision for an equal loss in recall.

Assuming that, the problem is to find a parameter $\alpha$ such that

$$\frac{\partial F_\alpha}{\partial P} = \frac{\partial F_\alpha}{\partial R}$$

OK, let's find out what $\alpha$ should be.

We compute the partial derivatives and get $$\frac{\partial F_\alpha}{\partial P} = \frac{\alpha}{(\alpha \frac{1}{P}+(1-\alpha)\frac{1}{R})^2P^2}, \quad \frac{\partial F_\alpha}{\partial R} = \frac{1 - \alpha}{(\alpha \frac{1}{P}+(1-\alpha)\frac{1}{R})^2R^2}$$

Hence we obtain that $$\frac{\alpha}{P^2}=\frac{1 - \alpha}{R^2}.$$ Thus, after some computation, we get that $$\alpha=\frac{1}{\beta^2+1},$$ where we denote $\beta = \frac{R}{P}$.
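These steps can be checked mechanically (a sympy sketch, assuming only the definition of $F_\alpha$ above):

```python
import sympy as sp

P, R, alpha = sp.symbols('P R alpha', positive=True)
F = 1 / (alpha / P + (1 - alpha) / R)
D = alpha / P + (1 - alpha) / R  # the denominator of F_alpha

# The partial derivatives match the stated formulas
assert sp.simplify(sp.diff(F, P) - alpha / (D**2 * P**2)) == 0
assert sp.simplify(sp.diff(F, R) - (1 - alpha) / (D**2 * R**2)) == 0

# At alpha = 1/(beta^2 + 1) = P^2/(P^2 + R^2) the two derivatives coincide
alpha_star = P**2 / (P**2 + R**2)
assert sp.simplify((sp.diff(F, P) - sp.diff(F, R)).subs(alpha, alpha_star)) == 0
```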


But I do not get why the equality of both partial derivatives correspond to these hypothesis.

In order to "get" this you just need a feeling for what the partial derivatives indicate. In your case, if we change $P$ to $P+\Delta P$ then $F_\alpha$ will increase (or decrease, if the partial derivative is negative) by approximately $\Delta P \cdot \frac{\partial F_\alpha}{\partial P}$. Similarly with respect to $R$. So equality of the partial derivatives means precisely that the same increment in precision and in recall will result in the same increment of $F_\alpha$.
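A tiny numerical illustration (using hypothetical values $P=0.6$, $R=0.8$, $\alpha=0.5$, chosen only for this sketch) shows how the partial derivative predicts the change in $F_\alpha$:

```python
def f_measure(p, r, a):
    # F_alpha as defined above: weighted harmonic mean of precision and recall
    return 1 / (a / p + (1 - a) / r)

P0, R0, a = 0.6, 0.8, 0.5   # hypothetical values for illustration
h = 1e-6                    # a small increment in precision

# Analytic partial derivative dF/dP = a / (D^2 * P^2), with D = a/P + (1-a)/R
D = a / P0 + (1 - a) / R0
dFdP = a / (D**2 * P0**2)

# The actual change in F for a small change in P is approximately dFdP * h
delta_F = f_measure(P0 + h, R0, a) - f_measure(P0, R0, a)
assert abs(delta_F - dFdP * h) < 1e-10
```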

If my explanation is not clear, there are plenty of articles about partial derivatives which will help you develop the intuition.


Edit. OK, I see it now. There is a logical mistake that I fell into as well. Again, we start with the family of functions $F_\alpha$ parametrised by $\alpha$ and we wish to find an $\alpha$ for which $$\frac{\partial F_\alpha}{\partial P} = \frac{\partial F_\alpha}{\partial R}$$

We assumed that $\frac{\partial F_\alpha}{\partial P} = \frac{\partial F_\alpha}{\partial R}$ and concluded that $\alpha$ must equal $\frac{1}{1 + \beta^2}.$

Thus we only proved an implication:

$$\frac{\partial F_\alpha}{\partial P} = \frac{\partial F_\alpha}{\partial R} \implies \alpha = \frac{1}{1 + \beta^2}$$

It doesn't mean that the converse holds as well. And it doesn't! In fact, if we compute $F_\alpha$ with this special $\alpha$ (note that substituting $\beta = R/P$ makes $\alpha$ depend on the point $(P,R)$) we get that

$$F_\alpha(P,R) = \frac{P^2 + R^2}{P + R}$$

and its partial derivatives differ: $$\frac{\partial F_\alpha}{\partial P} = \frac{P^2 + 2PR - R^2}{(P + R)^2}\quad \frac{\partial F_\alpha}{\partial R} = \frac{- P^2 + 2PR + R^2}{(P + R)^2}$$
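This can again be confirmed symbolically (a sympy sketch):

```python
import sympy as sp

P, R = sp.symbols('P R', positive=True)
F = (P**2 + R**2) / (P + R)

dP = sp.diff(F, P)
dR = sp.diff(F, R)

# The derivatives match the formulas above...
assert sp.simplify(dP - (P**2 + 2*P*R - R**2) / (P + R)**2) == 0
assert sp.simplify(dR - (-P**2 + 2*P*R + R**2) / (P + R)**2) == 0

# ...and they differ whenever P != R: their difference is 2(P - R)/(P + R)
assert sp.simplify(dP - dR - 2*(P - R)/(P + R)) == 0
```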

Conclusion: In the family $\{F_\alpha\}_\alpha$ there is no $\alpha$ for which, at every point $(P,R)$, $$\frac{\partial F_\alpha}{\partial P} = \frac{\partial F_\alpha}{\partial R}$$

Appendix. You can see the level curves in Wolfram Alpha; they are circles: http://www.wolframalpha.com/input/?i=(x%5E2%2By%5E2)%2F(x%2By)