Partial derivative of $\beta^TX^TX\beta$ with respect to a component of $\beta$


Suppose $X$ is an $n \times p$ matrix and $\beta$ is a $p$-dimensional vector. Is the partial derivative of $\beta^TX^TX\beta$ with respect to a component $\beta_i$ of $\beta$ given by $2 \sum_{j=1}^{n}(X_{j,i})^2\beta_i$?

The approach I took assumes that $\frac{\partial Z}{\partial \beta_1} = \frac{\partial Z}{\partial \beta } \cdot \frac{\partial \beta}{\partial \beta_1}$, where $Z = \beta^TX^TX\beta$. However, I am not sure whether the chain rule applies this way in the context of matrix calculus.

Any help is appreciated!


BEST ANSWER

You can compute this by expanding the product:
$$
\begin{align*}
\frac{\partial}{\partial \beta_i} \left[\beta^\top X^\top X \beta\right]
&= \frac{\partial}{\partial \beta_i} \sum_{j, k} \beta_j [X^\top X]_{j, k} \beta_k \\
&= \sum_{j, k} \left( \delta_{i, j} [X^\top X]_{j, k} \beta_k + \beta_j [X^\top X]_{j, k} \delta_{i, k} \right) \\
&= \sum_k [X^\top X]_{i, k} \beta_k + \sum_j \beta_j [X^\top X]_{j, i} \\
&= 2 \, [X^\top X \beta]_i,
\end{align*}
$$
since $[X^\top X]_{j, i} = [X^\top X]_{i, j}$ by the symmetry of $X^\top X$.
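The componentwise result $2[X^\top X\beta]_i$ can be checked numerically with a small finite-difference sketch (NumPy assumed; the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 3
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)

def f(b):
    # beta^T X^T X beta as a scalar
    return b @ X.T @ X @ b

# Analytic partials: 2 [X^T X beta]_i for each component i
analytic = 2 * X.T @ X @ beta

# Central finite differences in each coordinate beta_i
eps = 1e-6
numeric = np.empty(p)
for i in range(p):
    e = np.zeros(p)
    e[i] = eps
    numeric[i] = (f(beta + e) - f(beta - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))  # True
```

Since the objective is quadratic in $\beta$, the central difference agrees with the analytic gradient up to floating-point rounding.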


You can use matrix/vector calculus, in particular the identity
$$
\frac{\partial(\mathbf{x}^T\mathbf{A}\mathbf{x})}{\partial \mathbf{x}} = \left(\mathbf{A} + \mathbf{A}^T\right)\mathbf{x}.
$$
Setting $\mathbf{x} = \beta$ and $\mathbf{A} = \mathbf{X}^T\mathbf{X}$ gives
$$
\frac{\partial}{\partial \beta}\,\beta^TX^TX\beta = \left(X^TX + (X^TX)^T\right)\beta.
$$
Since $X^TX$ is symmetric, $(X^TX)^T = X^TX$, so
$$
\frac{\partial}{\partial \beta}\,\beta^TX^TX\beta = 2X^TX\beta,
$$
and the $i$-th component of this gradient is the partial derivative with respect to $\beta_i$.
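A quick numerical sanity check of the general identity (NumPy assumed; the matrices are random and illustrative), first with a non-symmetric $A$ and then with the symmetric $A = X^TX$ special case:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
A = rng.normal(size=(p, p))   # generally non-symmetric
x = rng.normal(size=p)

def g(v):
    # x^T A x as a scalar
    return v @ A @ v

# Identity: d/dx (x^T A x) = (A + A^T) x
analytic = (A + A.T) @ x

# Central finite differences in each coordinate
eps = 1e-6
numeric = np.array([
    (g(x + eps * np.eye(p)[i]) - g(x - eps * np.eye(p)[i])) / (2 * eps)
    for i in range(p)
])
assert np.allclose(analytic, numeric, atol=1e-5)

# Symmetric case A = X^T X: (A + A^T) x collapses to 2 X^T X x
X = rng.normal(size=(6, p))
S = X.T @ X
assert np.allclose((S + S.T) @ x, 2 * S @ x)
```

The first assertion verifies the $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$ form for an arbitrary $A$; the second confirms that symmetry of $X^TX$ reduces it to $2X^TX\beta$.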