Show that $-\sum_{i=1}^k\frac{\nabla f_i(x)}{\left\|\nabla f_i(x)\right\|}$ is not necessarily a descent direction if $k>2$

88 Views Asked by At

Let $d,k\in\mathbb N$ and $f:\mathbb R^d\to\mathbb R^k$. Say that $v\in\mathbb R^d$ is a descent direction of $f$ at $x\in\mathbb R^d$ if $${\rm D}f(x)v<0\tag1$$ (component wisely).

Now assume that $x\in\mathbb R^d$ and there is at least one descent direction of $f$ at $x$.$^1$ Let $$v:=-\sum_{i=1}^k\frac{\nabla f_i(x)}{\left\|\nabla f_i(x)\right\|}.$$

If $k=2$, it can be easily shown that $v$ is a descent direction. How can we show that this doesn't need to be the case when $k>2$?

First of all, the crucial identity is \begin{equation}\begin{split}\left\langle v,\nabla f_j(x)\right\rangle&=-\sum_{i=1}^k\frac{\left\langle\nabla f_i(x),\nabla f_j(x)\right\rangle}{\left\|\nabla f_i(x)\right\|}\\&=-\left\|\nabla f_j(x)\right\|\left(1+\sum_{\substack{i=1\\i\ne j}}^k\cos\sphericalangle\left(\nabla f_i(x),\nabla f_j(x)\right)\right)\end{split}\tag2\end{equation} for all $j\in\{1,\ldots,k\}$.

Now, in order to obtain the desired claim, I've read that we simply need to consider any $l>2$ and $$g:\mathbb R^d\to\mathbb R^l\;,\;\;\;y\mapsto(f_1(y),f_2(y),\ldots,f_2(y)).$$ Now it is claimed that \begin{equation}\begin{split}\langle v,\nabla f_1(x)\rangle&=\langle v,\nabla g_1(x)\rangle\\&=-\left\|\nabla f_1(x)\right\|\left(1+(l-1)\cos\sphericalangle\left(\nabla f_1(x),\nabla f_2(x)\right)\right)\end{split}\tag3\end{equation} so that, if $\cos\sphericalangle\left(\nabla f_1(x),\nabla f_2(x)\right)<0$, there is a choice for $l$ which yields $\langle v,\nabla f_1(x)\rangle>0$.

I don't get that. The second identity in $(3)$ is clearly correct, but why does the first identity hold? And if this would be a valid argument, couldn't we apply this argument in general to show that there is never a descent direction for any function (which is clearly nonsense). In particular, why would this argument fail to hold for $k=2$?


$^1$ Note that this implies $\nabla f_i(x)\ne0$ for all $i\in\{1,\ldots,k\}$ and hence $v$ is well-defined.

1

There are 1 best solutions below

0
On BEST ANSWER

I don't get that. The second identity in (3) is clearly correct, but why does the first identity hold?

I do not see how $\langle v,\nabla f_1(x)\rangle =\langle v,\nabla g_1(x)\rangle$ can cause confusion. The way $g$ is constructed, $g_1(x) = f_1(x)$.

And if this would be a valid argument, couldn't we apply this argument in general to show that there is never a descent direction for any function (which is clearly nonsense).

The (valid) argument only shows that you can construct a function $g : \mathbb{R}^d \to \mathbb{R}^l$ that does not have a descent direction for some $l > 2$. The argument is about $g$, not about the function $f$ used in the construction of $g$.

Perhaps a small example provides insight. We should take two functions $f_1$ and $f_2$ such that $\cos\sphericalangle\left(\nabla f_1(x),\nabla f_2(x)\right)<0$. For example, $f_1(x) = x$, $f_2(x)=-x^2$ at $x=2$, so $\cos\sphericalangle\left(\nabla f_1(x),\nabla f_2(x)\right)=-1$. For the function $g$, we have

$$v:=-\sum_{i=1}^l\frac{\nabla g_i(2)}{\left\|\nabla g_i(2)\right\|}=-(1+(l-1)(-1))=l-2.$$ so $$\langle v,\nabla f_1(x)\rangle=\langle l-2,2\rangle = 2l-4.$$ Using (3) we get $$-\left\|\nabla f_1(x)\right\|\left(1+(l-1)\cos\sphericalangle\left(\nabla f_1(x),\nabla f_2(x)\right)\right)=-||2||(1+(l-1)(-1)),$$ which indeed simplifies to $2l-4$, and when you make $l$ sufficiently large, indeed $2l-4>0$ (for example $l=3$). This tells you that $v$ is not a direction of descent for for $g(x) = (x, -x^2, -x^2)$ or for $g(x) = (x, -x^2, -x^2, -x^2, -x^2, -x^2)$.