Derivative Fisher's Criterion


I am currently reading "Pattern Recognition and Machine Learning" by Bishop. In Section 4.1.4, Fisher's criterion is stated as: $$J(w)=\frac{w^TS_Bw}{w^TS_Ww}$$ where $w$ is a column vector and $S_B$ and $S_W$ are symmetric matrices. I now want to calculate the derivative of that function, but have only got this far using the quotient rule: \begin{align} \frac{dJ(w)}{dw} &=\frac{(w^TS_Bw)'\,w^TS_Ww-w^TS_Bw\,(w^TS_Ww)'}{(w^TS_Ww)^2} \end{align}

How do I calculate $(w^TS_Bw)'$ and $(w^TS_Ww)'$ ?

I gather the results ought to be $S_B w$ and $S_W w$ (up to a constant factor), but I do not see why, especially because it is stated here: http://www.cs.huji.ac.il/~csip/tirgul3_derivatives.pdf that $\frac{\partial x^T A x}{\partial x} = x^T (A + A^T)$ when $x$ is a column vector and $A$ is a matrix.
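As a quick sanity check (a sketch I put together with NumPy, not from Bishop), both formulas give the same numbers for a general matrix $A$; they differ only in whether the gradient is laid out as a column or a row, as a finite-difference comparison suggests:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # general (non-symmetric) matrix
x = rng.standard_normal(3)

f = lambda v: v @ A @ v           # f(x) = x^T A x

# Central finite-difference gradient of f at x
eps = 1e-6
grad_fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(3)
])

grad_col = (A + A.T) @ x          # column-vector convention (Bishop)
grad_row = x @ (A + A.T)          # row-vector convention (the linked PDF)

# Same entries either way; only the orientation differs.
assert np.allclose(grad_fd, grad_col, atol=1e-5)
assert np.allclose(grad_fd, grad_row, atol=1e-5)
```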

1 Answer

In Bishop, the convention is that the derivative of a scalar function with respect to a vector variable is a column vector. This can be seen in one of the formulas in his appendix on matrix derivatives: $$ {\partial\over \partial {\bf x}}({\bf x}^T{\bf a}) ={\partial\over \partial {\bf x}}({\bf a}^T{\bf x}) ={\bf a}\tag{C.19} $$

By this convention, you can prove that if $\bf A$ is a (conformable) constant matrix, then $$ {\partial\over \partial {\bf x}}({\bf x}^T{\bf Ax})=({\bf A} + {\bf A}^T){\bf x} $$ The formula in the PDF you linked is the transpose of this one, because that PDF lays the gradient out as a row vector; the two conventions agree entry by entry.

To get your result, use the fact that both $S_B$ and $S_W$ are symmetric, so each equals its transpose and the derivatives become $2S_Bw$ and $2S_Ww$. The factors of $2$ cancel when you set the derivative of $J$ to zero, which is why Bishop can work with $S_Bw$ and $S_Ww$.
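Putting the pieces together, the quotient-rule gradient of $J$ can be checked numerically. The following is a minimal sketch (my own, not from Bishop) using random symmetric positive-definite stand-ins for the scatter matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
S_B = M @ M.T                      # symmetric stand-in for between-class scatter
M = rng.standard_normal((3, 3))
S_W = M @ M.T + np.eye(3)          # +I keeps the denominator well away from 0
w = rng.standard_normal(3)

J = lambda v: (v @ S_B @ v) / (v @ S_W @ v)

# Closed-form gradient from the quotient rule, using (v^T S v)' = 2 S v
n, d = w @ S_B @ w, w @ S_W @ w
grad = (2 * S_B @ w * d - n * 2 * S_W @ w) / d**2

# Central finite-difference check
eps = 1e-6
grad_fd = np.array([
    (J(w + eps * e) - J(w - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(grad, grad_fd, atol=1e-4)
```

Setting this gradient to zero reproduces Bishop's stationarity condition $(w^TS_Ww)\,S_Bw = (w^TS_Bw)\,S_Ww$, with all factors of $2$ cancelled.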