In Exercise 4.2 of Pattern Recognition and Machine Learning, the official solution states that the derivative of $$E_D(\mathbf{\tilde W}) = \frac{1}{2}\mathrm{Tr}\{(\mathbf{XW}+\mathbf{1}w_0^T - \mathbf{T})^T(\mathbf{XW}+\mathbf{1}w_0^T - \mathbf{T}) \}$$ with respect to $w_0$ (where $w_0$ is a column vector of bias weights and $\mathbf{1}$ is a column vector of $N$ ones) is $$ 2Nw_0+2(\mathbf{XW-T})^T\mathbf{1} $$ but I do not see how they arrived at this result. Could you explain how to derive it, and point me to a resource that covers this kind of derivative?
2026-04-23 02:13:27.1776910407
Derivative of trace of matrix expression with respect to a matrix
54 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
1
There is 1 best solution below
For typing convenience, define the auxiliary matrix $A$ as
$$\eqalign{ A &= {\tt1}w_0^T + (XW -T) \\ A^T &= w_0{\tt1}^T + (XW -T)^T }$$
and use a colon as an infix product notation for the trace function, i.e.
$$A:B = {\rm Tr}(A^TB) = {\rm Tr}(B^TA) = B:A$$
Rewrite the objective function in a form which makes it easy to calculate the gradient. Only $w_0$ varies, so $dA^T = dw_0\,{\tt1}^T$, and the last step uses ${\tt1}^T{\tt1} = N$:
$$\eqalign{ \phi &= \tfrac 12A:A = \tfrac 12A^T:A^T \\ d\phi &= A^T:dA^T = A^T:dw_0\,{\tt1}^T = A^T{\tt1}:dw_0 \\ \frac{\partial\phi}{\partial w_0} &= A^T{\tt1} = Nw_0 + (XW-T)^T{\tt1} \\ }$$
This is the same result as in your notes except for the factor of ${\tt2}$, which I suspect is a typo: the $\tfrac 12$ in front of the trace cancels the $2$ that comes from differentiating the quadratic form.
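If you want to convince yourself numerically, the gradient formula can be compared against central finite differences. The following is only a sketch, assuming NumPy is available; the sizes `N`, `D`, `K`, the random data, and the helper names `objective`, `analytic_gradient` and `numeric_gradient` are illustrative choices and not part of the exercise.

```python
import numpy as np


def objective(X, W, w0, T):
    """E_D = 1/2 Tr(A^T A) with A = X W + 1 w0^T - T."""
    ones = np.ones((X.shape[0], 1))
    A = X @ W + ones @ w0.T - T
    return 0.5 * np.trace(A.T @ A)


def analytic_gradient(X, W, w0, T):
    """Gradient from the derivation: N w0 + (X W - T)^T 1 (no factor of 2)."""
    N = X.shape[0]
    ones = np.ones((N, 1))
    return N * w0 + (X @ W - T).T @ ones


def numeric_gradient(X, W, w0, T, eps=1e-6):
    """Central finite differences, one component of w0 at a time."""
    grad = np.zeros_like(w0)
    for k in range(w0.shape[0]):
        step = np.zeros_like(w0)
        step[k, 0] = eps
        f_plus = objective(X, W, w0 + step, T)
        f_minus = objective(X, W, w0 - step, T)
        grad[k, 0] = (f_plus - f_minus) / (2 * eps)
    return grad


# Arbitrary small problem: N samples, D features, K outputs (assumed sizes).
rng = np.random.default_rng(0)
N, D, K = 7, 3, 4
X = rng.normal(size=(N, D))
W = rng.normal(size=(D, K))
w0 = rng.normal(size=(K, 1))
T = rng.normal(size=(N, K))

g_analytic = analytic_gradient(X, W, w0, T)
g_numeric = numeric_gradient(X, W, w0, T)
max_abs_diff = float(np.max(np.abs(g_analytic - g_numeric)))
print("max |analytic - numeric| =", max_abs_diff)
```

Because the objective is quadratic in $w_0$, central differences should agree with the formula up to floating-point rounding, which supports the version without the factor of 2.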
The standard reference for this subject is Matrix Differential Calculus by Magnus and Neudecker.