matrix calculus product rule confusion


According to The Matrix Cookbook,

the gradient of a product is $$\nabla_X(f(X)g(X))=f(X)\nabla_X g(X)+g(X)\nabla_X f(X).$$

But then $$\nabla_X (X^TAX) = \nabla_X(X^TA)\,X+X^TA\,\nabla_X X = AX+X^TA$$

instead of the supposed answer $$(A+A^T)X$$

What's wrong?


There is 1 best solution below


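Assume that $x\in \mathbb{R}^n$ and $A\in\mathbb{R}^{n\times n}$, and let $f:x\mapsto x^TAx$. The derivative at $x$ is the linear map $Df_x:h\mapsto h^TAx+x^TAh=(x^TA^T+x^TA)h$, where the scalar $h^TAx$ has been replaced by its transpose $x^TA^Th$. The gradient is defined by (*) $\langle\nabla f(x),h\rangle=Df_x(h)$ for every $h$, that is, $(\nabla f(x))^Th=(x^TA^T+x^TA)h$. Finally, $\nabla f(x)=(A+A^T)x$.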
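If you want to convince yourself numerically, here is a minimal sketch that compares $(A+A^T)x$ with a central-difference approximation of the gradient. The helper names `f` and `numerical_gradient`, the dimension, the seed, and the step size are my own choices for illustration, not anything from The Matrix Cookbook.

```python
import numpy as np

def f(A, x):
    """Quadratic form f(x) = x^T A x (x is a 1-D array)."""
    return x @ A @ x

def numerical_gradient(A, x, eps=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps  # perturb only the i-th coordinate
        grad[i] = (f(A, x + step) - f(A, x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
n = 4                            # arbitrary dimension for the check
A = rng.standard_normal((n, n))  # generic, non-symmetric matrix
x = rng.standard_normal(n)

analytic = (A + A.T) @ x         # the gradient derived above
numeric = numerical_gradient(A, x)

print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

Since $f$ is quadratic, the central difference has no truncation error, so the two vectors should agree up to floating-point rounding.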

When you write $f=f_1\times f_2$ in order to use the product formula, $f_1$ and $f_2$ must both be real-valued (otherwise you cannot use the scalar product (*) $\langle u,v\rangle=u^Tv$). This is not the case when you write $x^TAx=(x^TA)\times x$: both factors are vector-valued.
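One way to still use the scalar product rule (a sketch, with $e_i$ denoting the $i$-th standard basis vector, a notation not used above) is to split the quadratic form into products of real-valued factors: $$x^TAx=\sum_{i=1}^n x_i\,(Ax)_i .$$ Here $x_i$ and $(Ax)_i$ are both real-valued, with $\nabla x_i=e_i$ and $\nabla (Ax)_i=A^Te_i$, so the product rule applies term by term: $$\nabla(x^TAx)=\sum_{i=1}^n\left[(Ax)_i\,e_i+x_i\,A^Te_i\right]=Ax+A^Tx=(A+A^T)x.$$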