The Boyd Convex Optimization book defines the Lagrange dual function as $$ g(\lambda, \nu) = \inf_{x \in D} L(x, \lambda, \nu) = \inf_{x \in D} (f_0(x) + \lambda^T f(x) + \nu^T h(x) $$ where $f$, $h$ are the inequality and equality constraint functions, and the problem domain is defined as $D=\operatorname{dom} f_0 \cap \operatorname{dom} f \cap \operatorname{dom}g$. Also define the feasible set as $X = \{x \in D | f(x) \leq 0, h(x) =0 \}$, and primal solution value $p^*=\inf_{x\in X}f_0(x)$. I have the following questions:
If I define $g(\lambda, \nu) = \inf_{x \in S} L(x, \lambda, \nu)$, where $S$ is any convex set such that $X \subset S$, it seems that weak/strong duality theorems would still hold. In particular, we still have for any feasible $\tilde x \in X$ and any $\lambda \geq 0, \nu$, $$g(\lambda, \nu) = \inf_{x \in S} L(x, \lambda, \nu) \leq L(\tilde x, \lambda, \nu) \leq f_0(x)$$ hence $g(\lambda, \nu) \leq p^*$ (by ranging over all $\tilde x \in X$). Provided the set $S$ is easy to optimize over, wouldn't this be a useful way to get a potentially tighter dual lower bound on $p^*$ than is typically obtained by minimizing $L$ over $D$, since $\inf_{x \in D} L(x, \lambda, \nu) \leq \inf_{x \in S} L(x, \lambda, \nu)$?
In most of the textbook examples, the domain $D$ is simple (e.g., $x \geq 0$), and $g$ can be found analytically. This may not always be the case in practice. What if $D$ is a more complicated set? Would it make sense to also use Lagrange multipliers and KKT conditions in finding $g$? (Also see this question, basically asking about general approaches for handling (potentially analytically intractable) dual functions)
Yes. Let me give an example. Suppose you'd like to derive the dual of $\min_{x \in S} \{ x^2 : x \geq 1\}$ where $S=\mathbb{R}$. You could consider $D=\mathbb{R}_+$ and have the same problem. The duals are different: $\sup_y \{ y - 0.25 y^2 \}$ vs $\sup_y \{y - 0.25\max\{0,y\}^2\}$. For negative $y$, the second dual provides a tighter dual bound. The practical value is debatable, because I have yet to see someone plug in a random value into the dual to obtain a useful bound.
If you can find $g$ with the use of KKT conditions, I would say $D$ is still simple. In the most extreme case, $D=X$, it makes sense to use KKT conditions.