convex conjugate of logistic regression


For a convex function $f(x)$, its conjugate is defined as $$ f^*(z) = \sup_x \; x^Tz - f(x). $$ For the function $f(x) = \log(1+e^{-x})$, there exists a closed-form solution for $f^*$. But, for a realistic logistic regression case, $$ f(x) = \sum_{i=1}^m \log(1+e^{-a_i^Tx}). $$ Is there an easy way to compute the conjugate $f^*(z)$ given some $z$? Or do I need to solve an optimization problem numerically to get a solution?
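When no closed form is at hand, $f^*(z)$ can be evaluated numerically: $x \mapsto z^Tx - f(x)$ is concave (since $f$ is convex), so any local solver finds the global supremum whenever it is finite. A minimal sketch of this (the matrix $A$ and point $x_\star$ below are arbitrary made-up data; choosing $z = \nabla f(x_\star)$ guarantees the supremum is finite and attained at $x_\star$):

```python
import numpy as np
from scipy.optimize import minimize

def f(x, A):
    """f(x) = sum_i log(1 + exp(-a_i^T x)), computed stably via logaddexp."""
    return np.sum(np.logaddexp(0.0, -A @ x))

def conjugate(z, A, x0):
    """Numerically evaluate f*(z) = sup_x z^T x - f(x).

    The objective is concave in x, so a local minimizer of its negative
    finds the global sup whenever it is finite (i.e. z is in the range
    of the gradient of f)."""
    res = minimize(lambda x: f(x, A) - z @ x, x0)
    return -res.fun

# Small hypothetical instance.
A = np.array([[1.0, 2.0], [-1.0, 1.0], [0.5, -1.0]])
x_star = np.array([0.3, -0.2])

# grad f(x) = -A^T sigma(-A x); picking z = grad f(x_star) makes the
# sup finite and attained at x_star, so f*(z) = z^T x_star - f(x_star).
sigma = lambda t: 1.0 / (1.0 + np.exp(-t))
z = -A.T @ sigma(-A @ x_star)

print(conjugate(z, A, np.zeros(2)))   # numerical f*(z)
print(z @ x_star - f(x_star, A))      # analytic value at the maximizer
```

The two printed values should agree to solver precision, which also gives a way to spot infeasible $z$ (the solver diverges when the sup is $+\infty$).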

Edit: I now have a partial answer to a related problem, which was the reason I was looking for this in the first place. It's not an exact answer to the posed question though.

Fenchel duality tells us the following two problems are duals: $$ p^{*}=\inf _{x\in X}\{f(Ax) + g(x)\},\qquad d^{*}=\sup _{y\in Y}\{-f^{*}(-y)-g^{*}(A^Ty)\}. $$ So, take $f(\theta) = \sum_{i=1}^m \underbrace{\log(1+\exp(-\theta_i))}_{f_i(\theta_i)}$. Then the conjugate of this guy is

$$ f^*(\nu) = \sum_{i=1}^m f_i^*(\nu_i) = \sum_{i=1}^m \big[(-\nu_i)\log(-\nu_i) + (1+\nu_i)\log(1+\nu_i)\big] $$ with an implicit constraint that $-1<\nu_i<0$ for each $i$.

So, while this doesn't tell you exactly how to compute the conjugate of $f$, it does tell you how to deal with finding dual problems involving logistic regression and some kind of regularization. However, I do find the implicit constraint unsatisfying... it seems to be asking a lot for dual feasibility.
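As a concrete instance of that recipe (a sketch, using the per-coordinate closed form $f_i^*(\nu) = (-\nu)\log(-\nu)+(1+\nu)\log(1+\nu)$ on $-1<\nu<0$): take Tikhonov regularization $g(x)=\frac{\lambda}{2}\|x\|_2^2$, whose conjugate is $g^*(u)=\frac{1}{2\lambda}\|u\|_2^2$. The dual evaluates $f^*$ at $-y$, so dual feasibility $-1<-y_i<0$ becomes $0<y_i<1$, and the dual problem reads

$$ d^{*}=\sup_{y\in(0,1)^m}\; -\sum_{i=1}^m\big[y_i\log y_i+(1-y_i)\log(1-y_i)\big]\;-\;\frac{1}{2\lambda}\|A^Ty\|_2^2, $$

i.e. maximize the binary entropy of the dual variables minus a quadratic. The entropy's gradient blows up at the boundary of the box, so the open-interval constraint acts like a built-in barrier and an interior maximizer exists.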

1 Answer
Derive the convex conjugate of the function \begin{equation} f_i(x) = \log(1+e^{-y_ix}) \end{equation} where $y_i \in \{+1,-1\}$. The conjugate of a function is defined as \begin{equation} f^*(y) := \sup_{x} \; x^Ty-f(x). \end{equation}

Case (1): $y_i = -1$. Plugging equation (1) in for the first case: \begin{align} f^*(y) := \max_{x} \; x^Ty-\log(1+e^{x}). \end{align} To find the maximum, differentiate with respect to $x$ and set the result to zero: \begin{align} \frac{\partial (x^Ty-\log(1+e^{x})) }{\partial x} &= y - \frac{e^x}{1+e^x} = 0\\ y&= \frac{e^x}{1+e^x} \end{align} Taking the log of both sides: \begin{align} \log(y) &= x - \log(1+e^x) = x - f(x)\\ x^* &= \log(y) +f(x) \end{align} Note also that \begin{align} 1 - y &= 1- \frac{e^x}{1+e^x} = \frac{1}{1+e^x}. \end{align}

Taking the log of both sides: \begin{align} \log(1-y) = -\log(1+e^x) = -f(x). \end{align} Plugging the value of $x^{*}$ into the original expression: \begin{align*} f^*(y) =& (\log(y) +f(x))^Ty-f(x)\\ f^*(y) =& \log(y)^Ty +f(x)^T(y-1)\\ \text{Substituting $f(x) = -\log(1-y)$:}&\\ f^*(y) =& \log(y)^Ty +\log(1-y)^T(1-y), \end{align*} valid for $y \in (0,1)$.
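The Case (1) result $f^*(y) = y\log y + (1-y)\log(1-y)$ can be sanity-checked against a brute-force scalar maximization (a quick sketch; the grid bounds are arbitrary):

```python
import numpy as np

def f_case1(x):
    # Case (1): f(x) = log(1 + exp(x)), i.e. y_i = -1
    return np.logaddexp(0.0, x)

def conj_case1(y):
    # Claimed closed form: y*log(y) + (1-y)*log(1-y), valid for y in (0,1)
    return y * np.log(y) + (1.0 - y) * np.log1p(-y)

# Brute-force sup_x  y*x - f(x)  over a wide, fine grid.
x = np.linspace(-30.0, 30.0, 200001)
for y in (0.1, 0.5, 0.9):
    brute = np.max(y * x - f_case1(x))
    print(y, brute, conj_case1(y))   # the two values should agree closely
```

At $y=\tfrac12$ both sides equal $-\log 2$, the minimum of the negative binary entropy.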

Case (2): $y_i = 1$. \begin{align} f^*(y) := \max_{x} \; x^Ty-\log(1+e^{-x}) \end{align} To find the maximum, differentiate with respect to $x$ and set the result to zero: \begin{align} \frac{\partial (x^Ty-\log(1+e^{-x})) }{\partial x} &= y + \frac{e^{-x}}{1+e^{-x}} = 0\\ -y&= \frac{e^{-x}}{1+e^{-x}}\\ \text{Taking the log of both sides:}&\\ \log(-y) &= -x - \log(1+e^{-x}) = -x - f(x)\\ x^* &= -\log(-y) - f(x) \end{align} Note also that \begin{align} 1+ y &= 1- \frac{e^{-x}}{1+e^{-x}} = \frac{1}{1+e^{-x}}\\ \text{Taking the log of both sides:}&\\ \log(1+y) &= - \log(1+e^{-x}) =-f(x) \end{align} Finally, plug back the value of $x^*$: \begin{align*} f^*(y) =& (-\log(-y) - f(x))^Ty-f(x) \\ f^*(y) =& -\log(-y)^Ty -f(x)^T(y+1) \\ \end{align*} Substituting $f(x) = -\log(1+y)$:

\begin{align*} f^*(y) =& -\log(-y)^Ty +\log(1+y)^T(1+y), \end{align*} valid for $y \in (-1,0)$.
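Note that Case (2) is Case (1) with $x \mapsto -x$, so its conjugate is the Case (1) conjugate evaluated at $-y$: both results are the negative binary entropy, just parameterized on opposite sides of zero. The same brute-force check applies (grid bounds arbitrary):

```python
import numpy as np

def f_case2(x):
    # Case (2): f(x) = log(1 + exp(-x)), i.e. y_i = +1
    return np.logaddexp(0.0, -x)

def conj_case2(y):
    # Claimed closed form: -y*log(-y) + (1+y)*log(1+y), valid for y in (-1,0)
    return -y * np.log(-y) + (1.0 + y) * np.log1p(y)

# Brute-force sup_x  y*x - f(x)  over a wide, fine grid.
x = np.linspace(-30.0, 30.0, 200001)
for y in (-0.9, -0.5, -0.1):
    brute = np.max(y * x - f_case2(x))
    print(y, brute, conj_case2(y))   # the two values should agree closely
```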