Chain rule with a constant gradient


I recently read a paper that made the following claim. For functions $f$ and $g$ and some variables $\mathbf{x} = (x_1, \cdots, x_N)$, suppose we have the following expression:

$$h(\mathbf{x}) = f\left(\sum_i g(x_i)\right).$$

The paper then used the chain rule to show:

$$\frac{\partial h(\mathbf{x})}{\partial g(x_j)} = \frac{\partial h(\mathbf{x})}{\partial \sum_i g(x_i)}\frac{\partial \sum_i g(x_i)}{\partial g(x_j)} = \frac{\partial h(\mathbf{x})}{\partial \sum_i g(x_i)}.$$

This is fine so far, but next they argued that, since this quantity is the same for all $j$, $h(\mathbf{x})$ has a constant gradient with respect to the value each $x_j$ takes.

This seemed reasonable to me at first, but then I calculated that:

$$\frac{\partial h(\mathbf{x})}{\partial x_j} = \frac{\partial h(\mathbf{x})}{\partial \sum_i g(x_i)}\frac{\partial \sum_i g(x_i)}{\partial g(x_j)}\frac{\partial g(x_j)}{\partial x_j},$$

and this does not seem to be the same across the $x_j$'s, since the last factor $\partial g(x_j)/\partial x_j$ generally depends on the value of $x_j$.
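As a sanity check, here is a minimal numerical sketch of the two derivatives. The choices $f = \tanh$ and $g(x) = x^2$, the test point, and the helper names are my own assumptions for illustration; none of them come from the paper. It estimates both partial derivatives by central finite differences.

```python
import math

def f(u):
    return math.tanh(u)   # hypothetical outer function, not from the paper

def g(x):
    return x ** 2         # hypothetical inner function, not from the paper

def h(xs):
    return f(sum(g(x) for x in xs))

def partial_wrt_x(xs, j, eps=1e-6):
    """Central finite difference of h with respect to x_j."""
    up = list(xs); up[j] += eps
    dn = list(xs); dn[j] -= eps
    return (h(up) - h(dn)) / (2 * eps)

def partial_wrt_g(xs, j, eps=1e-6):
    """Central finite difference of h with respect to the value g(x_j),
    treating the summands u_i = g(x_i) as independent inputs."""
    us = [g(x) for x in xs]
    up = list(us); up[j] += eps
    dn = list(us); dn[j] -= eps
    return (f(sum(up)) - f(sum(dn))) / (2 * eps)

xs = [0.1, 0.5, -0.3]  # arbitrary test point
grad_g = [partial_wrt_g(xs, j) for j in range(len(xs))]
grad_x = [partial_wrt_x(xs, j) for j in range(len(xs))]

print("dh/dg(x_j):", grad_g)  # entries agree up to numerical error
print("dh/dx_j:   ", grad_x)  # entries differ, scaled by g'(x_j) = 2 * x_j
```

With these choices, the derivative with respect to each $g(x_j)$ comes out the same for every $j$, while the derivative with respect to each $x_j$ picks up the extra factor $g'(x_j) = 2x_j$ and so varies with $j$.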

My question is: am I correct to claim that this is an error in the paper?