How does dot product work in matrix algebra?


I am working on a weighted minimization problem. Without the weights, the error function can be expressed as $e^T e$. With weights, $e$ first needs to be multiplied element-wise by $w$, and then the same formula applies: $(w \circ e)^T (w \circ e)$. How do I express this in pure matrix form (without the $\circ$)? The $\circ$ operation is giving me a lot of trouble when I try to differentiate a chained function with respect to a set of parameters. It would be better to have a matrix whose diagonal is $w_i e_i$ with 0 elsewhere, or a vector with entries $w_i e_i$.

For the weighted minimization problem, I have $$g = e^T e, \; e_i = w_i u_i, \; u = h(X)$$ where $$u, w, e \in \mathbb{R}^{m}, \; X \in \mathbb{R}^{n}, \; g: \mathbb{R}^{m} \rightarrow \mathbb{R}, \; h: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m} $$ I want to find $\frac{dg(X)}{dX}$. I think this should be a row vector in $\mathbb{R}^{1 \times n}$. Applying the chain rule in the single-variable manner, $$ \frac{dg(X)}{dX} = 2 e \frac{de}{du} \frac{du}{dX} $$ $$ \frac{dg(u)}{du} \in \mathbb{R}^{1 \times m}, \; \frac{du}{dX} \in \mathbb{R}^{m \times n} $$ The sizes of the matrices don't match because $e \frac{de}{du}$ should be $e \circ \frac{de}{du}$.


There are 2 best solutions below

On BEST ANSWER

One nice way to write your function as a matrix product is to define the diagonal matrix $$ W = \pmatrix{w_1&&\\&\ddots&\\&&w_m} $$ We then have $$ (w \circ e)^T(w \circ e) = (We)^T (We) = e^T W^TWe = e^T \pmatrix{|w_1|^2&&\\&\ddots&\\&&|w_m|^2} e $$
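As a quick numerical sanity check of this identity, here is a minimal sketch in plain Python. The helper names (`hadamard`, `diag`, `matvec`, `dot`) and the sample values of `w` and `e` are made up for illustration and do not come from the question.

```python
def hadamard(w, e):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    return [wi * ei for wi, ei in zip(w, e)]

def diag(v):
    """Square matrix with v on the diagonal and zeros elsewhere."""
    m = len(v)
    return [[v[i] if i == j else 0.0 for j in range(m)] for i in range(m)]

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Inner product x^T y."""
    return sum(xi * yi for xi, yi in zip(x, y))

# Arbitrary example data (assumed values, m = 3).
w = [0.5, 2.0, -3.0]
e = [1.0, -4.0, 2.0]

# (w o e)^T (w o e)
lhs = dot(hadamard(w, e), hadamard(w, e))

# (We)^T (We) with W = diag(w)
We = matvec(diag(w), e)
mid = dot(We, We)

# e^T diag(w_i^2) e
rhs = dot(e, matvec(diag([wi ** 2 for wi in w]), e))

print(lhs, mid, rhs)
```

All three printed values should agree, which is the point of the identity: the Hadamard product with $w$ is the same as left-multiplication by the diagonal matrix $W$.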


Note that $$ (w \circ e)^\top (w \circ e) = e^\top W e$$ where $W = \mathrm{diag}(w_1^2, \ldots, w_m^2)$ is the diagonal matrix with entries $w_i^2$, and $\circ$ denotes the Hadamard (or Schur) product.
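For the chain-rule part of the question, the same diagonal trick removes the $\circ$ from the derivative. The following is a sketch under two assumptions not stated in the question: $h$ is differentiable, and $J$ (a symbol introduced here) denotes its $m \times n$ Jacobian $\frac{du}{dX}$. Writing $D = \mathrm{diag}(w_1, \ldots, w_m)$, so that $e = Du$ and $\frac{de}{du} = D$, we get $$ g = e^\top e = u^\top D^\top D u = u^\top D^2 u $$ $$ \frac{dg}{dX} = \frac{dg}{de} \frac{de}{du} \frac{du}{dX} = 2 e^\top D J = 2 u^\top D^2 J $$ The sizes now match: $2e^\top$ is $1 \times m$, $D$ is $m \times m$, and $J$ is $m \times n$, so the result is a row vector of length $n$. The factor that looked like $e \circ \frac{de}{du}$ in the single-variable form is the ordinary matrix product $e^\top D$, whose entries are $w_i e_i$.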