How to take the derivative of $f(\lambda x + (1-\lambda)y)$ w.r.t. $\lambda$ for real-valued $f$ but $x$ of dimension $n$


$$f(\lambda x + (1-\lambda)y)\le \lambda f(x) + (1-\lambda) f(y)$$

I need to take the derivative with respect to $\lambda$ on both sides. I'm having problems because even though this function is real valued, $x$ and $y$ are of dimension $n$.

If I were to take the derivative with respect to $x$ or $x(\lambda)$ I'd know how to proceed. I know that what I'm actually taking the derivative of is $f(g(\lambda))$, but $g$ is not real-valued.

Best answer

When taking derivatives involving vectors, I find it easiest to write everything out component-wise to see what is going on before learning the shortcuts. To start, define

$$g(\lambda):=f(\lambda\boldsymbol{x}+(1-\lambda)\boldsymbol{y})=f(\lambda x_1+(1-\lambda)y_1,\cdots,\lambda x_n+(1-\lambda)y_n)$$

For simplicity, let $a_i=\lambda x_i+(1-\lambda)y_i$. Assuming $f$ is differentiable, the chain rule for total derivatives gives

$$\frac{dg}{d\lambda}=\frac{\partial f}{\partial a_1}\frac{\partial a_1}{\partial\lambda}+\cdots+\frac{\partial f}{\partial a_n}\frac{\partial a_n}{\partial\lambda}$$

Now that we have an expression for the total derivative in $\lambda$, we can essentially plug and chug. Noting that $\frac{\partial a_i}{\partial\lambda}=x_i-y_i$, we get

$$\frac{dg}{d\lambda}=\frac{\partial f}{\partial a_1}(x_1-y_1) + \cdots + \frac{\partial f}{\partial a_n}(x_n-y_n)$$

The last step is to notice that this is a dot product with the gradient of $f$, evaluated at the point $\lambda\boldsymbol{x}+(1-\lambda)\boldsymbol{y}$:

$$\boxed{\frac{dg}{d\lambda}=\nabla f(\lambda\boldsymbol{x}+(1-\lambda)\boldsymbol{y})\cdot(\boldsymbol{x}-\boldsymbol{y})}$$

On the right-hand side, $f(\boldsymbol{x})$ and $f(\boldsymbol{y})$ are scalars that do not depend on $\lambda$, so the derivative is much simpler:

$$\frac{d}{d\lambda}\left[\lambda f(\boldsymbol{x})+(1-\lambda)f(\boldsymbol{y})\right]=f(\boldsymbol{x})-f(\boldsymbol{y})$$

Remark: This exercise is important in convexity. Also, notice that in one dimension the left-hand derivative is $f'(\lambda x+(1-\lambda)y)\,(x-y)$, so comparing it with the right-hand side relates $f'$ to a difference quotient: $$f'\sim\frac{f(x)-f(y)}{x-y}$$
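
As a quick sanity check, the boxed formula can be compared against a central finite difference in $\lambda$. The sketch below is only illustrative: the test function `f`, its hand-derived gradient `grad_f`, and the sample vectors `x`, `y` are assumptions chosen for the example, not part of the original question.

```python
import math

def f(a):
    # Hypothetical smooth test function f: R^n -> R (an assumption for illustration).
    return sum(ai ** 2 for ai in a) + math.sin(a[0])

def grad_f(a):
    # Gradient of the test function above, derived by hand.
    g = [2.0 * ai for ai in a]
    g[0] += math.cos(a[0])
    return g

def point(lam, x, y):
    # The point a = lam * x + (1 - lam) * y, computed component-wise.
    return [lam * xi + (1.0 - lam) * yi for xi, yi in zip(x, y)]

def dg_chain_rule(lam, x, y):
    # Boxed formula: grad f(lam*x + (1-lam)*y) . (x - y)
    grad = grad_f(point(lam, x, y))
    return sum(gi * (xi - yi) for gi, xi, yi in zip(grad, x, y))

def dg_finite_difference(lam, x, y, h=1e-6):
    # Central difference approximation of dg/dlam, where g(lam) = f(point(lam, x, y)).
    return (f(point(lam + h, x, y)) - f(point(lam - h, x, y))) / (2.0 * h)

# Arbitrary sample vectors in R^3 and an arbitrary lambda.
x = [1.0, -2.0, 0.5]
y = [0.3, 0.7, -1.2]
lam = 0.4

exact = dg_chain_rule(lam, x, y)
approx = dg_finite_difference(lam, x, y)
print(exact, approx)
```

The two printed values should agree up to the finite-difference error. Any other differentiable `f` can be substituted, as long as `grad_f` is updated to match.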