Can you always apply the log transform over a positive inequality constraint?


I'm having trouble understanding when you can take the log of a constraint in general (is this something you can always do for positive inequalities $\geq 1$, or are there limitations?). It is something that was done in this article: https://ai.stanford.edu/~ang/papers/nips02-metric.pdf. I will state the problem nonetheless:

Suppose we define the Mahalanobis distance, with $A$ a positive definite matrix: $$ \newcommand{\norm}[1]{\left\lVert#1\right\rVert} \begin{align} d(x, y) = d_A(x, y) = \norm{x-y}_A = \sqrt{(x-y)^TA(x-y)} \end{align} $$
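To make the definition concrete, here is a minimal numpy sketch of this distance (the function name `mahalanobis` and the example points are my own, not from the article):

```python
import numpy as np

def mahalanobis(x, y, A):
    """||x - y||_A = sqrt((x - y)^T A (x - y)) for positive definite A."""
    d = x - y
    return float(np.sqrt(d @ A @ d))

x = np.array([1.0, 2.0])
y = np.array([0.0, 0.0])
A = np.diag([4.0, 1.0])  # a positive definite (diagonal) matrix

# (1-0)^2 * 4 + (2-0)^2 * 1 = 8, so the distance is sqrt(8)
print(mahalanobis(x, y, A))
```

Note that $A = I$ recovers the ordinary Euclidean distance.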

And then sets of pairs of points we deem "similar" and "dissimilar": $$ \begin{align} & (x_i, x_j) \in \mathcal{S} \quad \text{if $x_i$ and $x_j$ are similar} \\ & (x_i, x_j) \in \mathcal{D} \quad \text{if $x_i$ and $x_j$ are dissimilar} \end{align} $$

We want to learn an $A$ such that items deemed "similar" have a small distance. To avoid the trivial solution $A=0$ we include a constraint that the sum of distances between "dissimilar" pairs must be at least 1. So an optimization problem can be defined:

$$ \begin{align} \min_{A} & \sum_{(x_i, x_j) \in \mathcal{S}} \norm{x_i-x_j}^2_A \\ \text{s.t.} & \sum_{(x_i, x_j) \in \mathcal{D}} \norm{x_i-x_j}_A \geq 1 \\ & A \succeq 0 \end{align} $$

In the article, for the case of a diagonal $A$, they define a function whose minimization (subject to $A \succeq 0$) is equivalent to the optimization problem above: $$ \begin{align} g(A)=g(A_{11},...,A_{nn}) = \sum_{(x_i, x_j)\in \mathcal{S}} \norm{x_i-x_j}^2_A - \log \left(\sum_{(x_i, x_j)\in \mathcal{D}} \norm{x_i-x_j}_A \right) \end{align} $$
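To see why the log term rules out $A = 0$, here is a toy sketch of $g$ for diagonal $A$ (the helper names and the example pairs are my own): as the entries of $A$ shrink toward zero, the dissimilarity sum inside the log also shrinks, so $-\log(\cdot) \to +\infty$ and the objective blows up.

```python
import numpy as np

def g(diag_A, S_pairs, D_pairs):
    """g(A) for diagonal A: sum of similar-pair squared distances
    minus log of the sum of dissimilar-pair distances."""
    A = np.diag(diag_A)
    sq = lambda x, y: (x - y) @ A @ (x - y)
    sim = sum(sq(x, y) for x, y in S_pairs)
    dis = sum(np.sqrt(sq(x, y)) for x, y in D_pairs)
    return sim - np.log(dis)

S = [(np.array([0.0, 0.0]), np.array([1.0, 0.0]))]  # one "similar" pair
D = [(np.array([0.0, 0.0]), np.array([0.0, 3.0]))]  # one "dissimilar" pair

print(g(np.array([1.0, 1.0]), S, D))    # finite: 1 - log(3)
print(g(np.array([1e-8, 1e-8]), S, D))  # much larger: the log barrier kicks in
```

So $g$ behaves like the original objective plus a (one-sided) log barrier replacing the explicit $\geq 1$ constraint.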

My question: why $\log$? Using Lagrange multipliers, I would have written $\sum_{(x_i, x_j)\in \mathcal{S}} \norm{x_i-x_j}^2_A-\lambda \left(\sum_{(x_i, x_j) \in \mathcal{D}} \norm{x_i-x_j}_A - 1 - s^2\right)$, where $s$ is a slack variable. Can you take the $\log$ any time the left-hand side of the inequality constraint is positive? Also, could you please recommend some books where this is treated in a general way?