Consider the optimization problem with $X \in \mathbb R^{m \times n}$ and $y \in \mathbb R^m$ \begin{equation} \min_{r \in \mathbb R^m,\, w \in \mathbb R^n} - \sum_{i=1}^m \log(1 + r_i) \end{equation} subject to $r = Xw - y$ and $r_i \geq 0$. Derive the dual problem.
So the Lagrangian is $$ L(r, w, \lambda, \mu) = - \sum_{i=1}^m \log(1 + r_i) - \langle \lambda, r \rangle + \langle \mu, Xw - y - r \rangle $$ with $\lambda \in \mathbb R_{\geq 0}^m$ and $\mu \in \mathbb R^m$.
The dual function is defined as $$ q(\lambda, \mu) = \inf_{r, w} L(r, w, \lambda, \mu). $$ How does one find this infimum?
The Lagrangian is convex and differentiable in the primal variables $(r,w)$ on its domain, so the infimum can be found by setting the gradient to zero. The $j$th partial derivative with respect to $r$ is \begin{equation*} \frac{\partial L}{\partial r_j} = -\frac{1}{1+r_j} - \lambda_j - \mu_j, \end{equation*} and the gradient with respect to $w$ is \begin{equation*} \nabla_w L = X^\top \mu. \end{equation*} Setting these to zero, we obtain the system of equations \begin{align*} 1 + (\lambda_j+\mu_j)(1+r_j) &= 0, \\ X^\top \mu&=0. \end{align*} The second condition is a constraint on $\mu$: it is needed because the Lagrangian is affine in $w$, and therefore unbounded below unless $X^\top \mu = 0$. Solving the first equation for $r_j$, we obtain \begin{equation*} r_j = -1-\frac{1}{\lambda_j+\mu_j}. \end{equation*} Substituting this back into the Lagrangian, the dual function becomes \begin{equation*} q(\lambda,\mu) = \begin{cases} -\sum_{j=1}^m\log\bigl(-\tfrac{1}{\lambda_j+\mu_j}\bigr) + \sum_{j=1}^m(\lambda_j+\mu_j+1)-\mu^\top y & \text{if $X^\top \mu = 0$}, \\ -\infty & \text{otherwise}. \end{cases} \end{equation*} Note that the dual function also imposes an implicit constraint on the pair $(\lambda,\mu)$: the logarithm requires $\lambda_j+\mu_j<0$ for all $j$ (and indeed, whenever some $\lambda_j+\mu_j\ge 0$, the Lagrangian is unbounded below in $r_j$, so $q=-\infty$ there as well). Using $-\log\bigl(-\tfrac{1}{s}\bigr)=\log(-s)$, the dual problem is therefore \begin{align*} \max_{\lambda,\mu}\quad & \sum_{j=1}^m\log\bigl(-(\lambda_j+\mu_j)\bigr) + \sum_{j=1}^m(\lambda_j+\mu_j+1) - \mu^\top y \\ \text{subject to}\quad & X^\top\mu = 0,\quad \lambda \ge 0,\quad \lambda+\mu < 0. \end{align*}
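As a sanity check, weak duality says that any dual-feasible value of $q$ must lower-bound any primal-feasible objective value. Here is a minimal NumPy sketch of that check; the zero-column-sum construction of $X$ and the particular dual point $\mu = -c\mathbf{1}$, $\lambda = 0$ are assumptions chosen only so that a dual-feasible point ($X^\top\mu = 0$, $\lambda + \mu < 0$) is easy to write down, not part of the derivation itself.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3

# Build X with zero column sums, so mu = -c * ones satisfies X^T mu = 0.
A = rng.standard_normal((m, n))
X = A - A.mean(axis=0)

# Primal-feasible point: pick w and r >= 0, then define y := Xw - r,
# so the constraint r = Xw - y holds by construction.
w = rng.standard_normal(n)
r = rng.uniform(0.0, 2.0, size=m)
y = X @ w - r
primal_value = -np.sum(np.log(1.0 + r))

def q(lam, mu):
    """Dual function on its implicit domain (lam + mu < 0, X^T mu = 0)."""
    s = lam + mu
    assert np.all(s < 0) and np.allclose(X.T @ mu, 0.0)
    return -np.sum(np.log(-1.0 / s)) + np.sum(s + 1.0) - mu @ y

# Weak duality: every dual value lower-bounds every primal value.
for c in [0.1, 0.5, 1.0, 2.0, 10.0]:
    lam = np.zeros(m)
    mu = -c * np.ones(m)
    assert q(lam, mu) <= primal_value + 1e-9
print("weak duality holds at all tested dual points")
```

If the closed form of $q$ had a sign error anywhere, some choice of $c$ would typically violate the bound, which makes this a cheap way to stress-test the algebra.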