How to calculate set of subgradients

4.8k Views Asked by At

I am trying to understand question 1(a) here. They calculate a subgradient at a given $x$ for the following convex function:

$$f(x) =\max_{i=1, ..., m} (a_i^Tx + b_i)$$

The solution is to find $k \in \{1,...m\}$ for which $f(x) = (a_k^T x + b_k)$, then the subgradient is $a_k$.

I understand the solution. However, it is only one solution. The size of the set of subgradients is infinite. Therefore, I wonder, what are all the other subgradients? How can I calculate other subgradients from the set?

Thanks.

1

There are 1 best solutions below

0
On

By a theorem of Danskin and Bertsekas (I call it "the Bertsekas-Danskin Theorem for subgradients", see link below), the subgradient of $f$ is the convex hull of all such $a_k$, and corresponds to a face of the polyhedron $\mathcal P_A := \text{conv}\{a_k | 1 \le k \le m\}$. I've proven a more general result here.

Precisely, you deduce that $$ \begin{split} \partial f(x) &= \partial \max_{1 \le i \le m}a_i^Tx_i + b_i = \partial \max_{y \in \Delta_m}y^T(A^Tx + b)\\ &= \text{conv}\{\nabla_x (\hat{y}^T(A^Tx + b))| \hat{y} \in \Delta_m, \hat{y}^T(A^Tx + b) =f(x)\}\\ &= \text{conv}\{A\hat{y}| \hat{y} \in \Delta_m, \hat{y}^T(A^Tx + b) = f(x)\} \\ &= \text{conv}\{a_k | 1 \le k \le m, a_k^Tx + b_k = f(x) \}, \end{split} $$ as claimed.