I am preparing for my oral exam. Here is a question that I hope you can help me with.
"Define a constrained minimization problem for a smooth (convex) function with smooth constraints. Provide a characterization for the gradient of the function at the minimum and discuss how the condition for un-constrained problems follows as a special case."
I do not understand what my teacher wants. In the first part, does he want me to give a concrete example of a constrained minimization problem for a smooth (convex) function with smooth constraints, or does he just want the general theory? And what does he mean by "Provide a characterization for the gradient of the function at the minimum"?
I hope someone can help me to get a better understanding. Thanks in advance.
I think that they want you to give an abstract example, like the following. Let $f,g\colon\mathbb{R}^n\to\mathbb{R}$ be differentiable functions and consider
$$\min\limits_{x:g(x)\leq 0} f(x)$$
The first-order optimality condition at a solution $\bar{x}$ of this problem is
$$0\in \nabla f(\bar{x}) + N_{\{g\leq 0\}}(\bar{x})$$
where $N_{\{g\leq 0\}}(\bar{x})=\{u: \forall y\in \{g\leq 0\}, \langle u, y-\bar{x}\rangle \leq 0 \}$ is the normal cone to the sublevel set $\{g\leq 0\} = \{x:g(x)\leq 0\}$.
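To see where this inclusion comes from (assuming $f$ is convex and the feasible set is convex, e.g., $g$ convex): a point $\bar{x}$ minimizes $f$ over $\{g\leq 0\}$ exactly when no feasible direction is a descent direction, i.e.,
$$\langle \nabla f(\bar{x}),\, y-\bar{x}\rangle \geq 0 \quad \text{for all } y\in\{g\leq 0\},$$
which, by the definition of the normal cone above, is precisely $-\nabla f(\bar{x})\in N_{\{g\leq 0\}}(\bar{x})$.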
However, at a boundary point $u$ of the set $\{g\leq 0\}$ (that is, where $g(u)=0$), the outward normal direction is $\nabla g(u)$, since $g$ increases fastest along it; here we assume the constraint qualification $\nabla g(u)\neq 0$. Said another way, at a boundary point $\bar{u}$, the normal cone of the sublevel set of a smooth function $g$ is given by $\{\lambda\nabla g(\bar{u})\colon \lambda \geq0\}$. So we can write
$$N_{\{g\leq 0\}}(x) = \begin{cases}\{0\} & g(x)<0\\ \{\lambda\nabla g(x)\colon \lambda \geq 0\} & g(x)=0\\ \emptyset & g(x)>0 \end{cases}$$
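As a quick sanity check (with a $g$ of my own choosing), take $g(x)=\|x\|^2-1$, so that $\{g\leq 0\}$ is the closed unit ball. At a boundary point $\bar{x}$ with $\|\bar{x}\|=1$ we have $\nabla g(\bar{x})=2\bar{x}$, so
$$N_{\{g\leq 0\}}(\bar{x}) = \{2\lambda\bar{x}\colon \lambda\geq 0\} = \{\mu\bar{x}\colon \mu\geq 0\},$$
the ray pointing radially outward, exactly as the geometry suggests.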
Let $\bar{x}$ be a solution of the optimization problem. Then $\bar{x}$ must be feasible, i.e., $\bar{x} \in \{g\leq 0\}$. Thus, by the optimality condition, we have
$$-\nabla f(\bar{x})\in N_{\{g\leq 0\}}(\bar{x})\implies \begin{cases}\nabla f(\bar{x}) = 0\quad &\mbox{if }\quad g(\bar{x})<0\\ \nabla f(\bar{x})= -\lambda^\star\nabla g(\bar{x})\quad &\mbox{if }\quad g(\bar{x})=0\end{cases}$$
for some $\lambda^\star\geq 0$, which we interpret to mean: if the constraint is active at $\bar{x}$, then $-\nabla f(\bar{x})$ is a nonnegative multiple of $\nabla g(\bar{x})$ (the negative gradient points out of the feasible set, so there is no feasible descent direction); otherwise $\nabla f(\bar{x})=0$.
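For concreteness, here is a small one-dimensional example of my own (not part of your exam question): take $f(x)=x^2$ and $g(x)=1-x$, so that $g(x)\leq 0$ means $x\geq 1$. The minimizer is $\bar{x}=1$, where the constraint is active, and
$$\nabla f(1) = 2, \qquad \nabla g(1) = -1, \qquad \nabla f(1) = -\lambda^\star\nabla g(1) \ \text{ with }\ \lambda^\star = 2\geq 0.$$
Note that $\nabla f(\bar{x})\neq 0$ here: the constraint prevents $f$ from decreasing further, and the multiplier $\lambda^\star$ records how strongly it does so.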
For the unconstrained case, we can use the same problem template but with $g\equiv 0$, which gives $\{g\leq 0\} = \mathbb{R}^n$ and $\nabla g(x)=0$ for all $x\in\mathbb{R}^n$ as well. Thus, in terms of the above optimality conditions, we are always in the case $g(\bar{x})=0$ and we have
$$\nabla f(\bar{x})=-\lambda^\star\nabla g(\bar{x})=0$$
Thus, at a minimizer $\bar{x}$, the gradient of $f$ must be $0$ in the unconstrained case.
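If it is useful for your exam, both cases can be packaged into a single statement, the Karush-Kuhn-Tucker (KKT) conditions:
$$\nabla f(\bar{x}) + \lambda^\star\nabla g(\bar{x}) = 0, \qquad \lambda^\star\geq 0, \qquad \lambda^\star g(\bar{x}) = 0.$$
The last condition (complementary slackness) forces $\lambda^\star=0$ whenever the constraint is inactive, so the unconstrained condition $\nabla f(\bar{x})=0$ is literally the special case $\lambda^\star=0$.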
Depending on what you've seen in class/homeworks, you might also want to analyze the case with equality constraints:
$$\min\limits_{x:g(x)\leq 0, h(x)=0}f(x).$$
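As a sketch of what to expect there (assuming $h$ is smooth and a suitable constraint qualification holds), the first-order conditions at a minimizer $\bar{x}$ become
$$\nabla f(\bar{x}) + \lambda^\star\nabla g(\bar{x}) + \mu^\star\nabla h(\bar{x}) = 0, \qquad \lambda^\star\geq 0, \qquad \lambda^\star g(\bar{x})=0, \qquad h(\bar{x})=0,$$
where $\mu^\star\in\mathbb{R}$ is unrestricted in sign, since an equality constraint can push from either side.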