Need a basic reminder regarding a simple vector calculus question

Hello, I was recently working through https://www.deeplearningbook.org/ and, in the section on Principal Component Analysis, the authors write the following:

Suppose $x=\{x^{(1)},x^{(2)},\dots,x^{(m)}\}$ with each $x^{(i)} \in \mathbb{R}^{n}$. For each $x^{(i)}$ we want a corresponding code vector $c^{(i)} \in \mathbb{R}^{l}$ with $l \le n$. We want an encoder $f(x)=c$ and a decoder $g$ so that $x \approx g(f(x))$.

For $D \in \mathbb{R}^{n \times l}$ and $c \in \mathbb{R}^{l}$, let $g(c)=Dc$ define the decoding. We also constrain the columns of $D$ to be orthogonal to each other and to have unit norm, so that $D^{T}D=I_{l}$.

$$\nabla_{c}(-2x^{T}Dc+c^{T}c)=0$$

implies $c=D^{T}x$

My question is: how does it go from $-2x^{T}Dc$ to $D^{T}x$ instead of $x^{T}D$?

Best answer

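Notice that $x$ is not a matrix: $x$ is a set of vectors, and inside the gradient it stands for a single one of them, so $D^{T}x$ and $x^{T}D$ contain exactly the same entries, since $(x^{T}D)^{T}=D^{T}x$. Pedantically, the first is a column vector and the second is a row vector. The gradient $\nabla_{c}$ is conventionally written with the same shape as $c$, a column in $\mathbb{R}^{l}$, which is why the book writes $D^{T}x$: setting $-2D^{T}x+2c=0$ gives $c=D^{T}x$. Being a row or a column is usually only relevant when you are thinking of the vectors as matrices. Vectors qua vectors don't care about row-ness or column-ness.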
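
If it helps to see this numerically, here is a minimal NumPy sketch, not taken from the book. It assumes small illustrative sizes ($n=5$, $l=3$), a random $D$ with orthonormal columns built from a QR factorization, and two hypothetical helper names, `objective` and `numerical_gradient`. It compares a finite-difference gradient of $-2x^{T}Dc+c^{T}c$ with $-2D^{T}x+2c$, and checks that the gradient vanishes at $c=D^{T}x$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 5, 3  # illustrative sizes only (assumption)

# D has orthonormal columns (D^T D = I), obtained here from a QR factorization.
D, _ = np.linalg.qr(rng.standard_normal((n, l)))
x = rng.standard_normal(n)

def objective(c):
    # The c-dependent part of ||x - Dc||^2 when D^T D = I.
    return -2.0 * x @ D @ c + c @ c

def numerical_gradient(f, c, eps=1e-6):
    # Central differences, one coordinate of c at a time.
    grad = np.zeros_like(c)
    for j in range(c.size):
        step = np.zeros_like(c)
        step[j] = eps
        grad[j] = (f(c + step) - f(c - step)) / (2.0 * eps)
    return grad

c = rng.standard_normal(l)
analytic = -2.0 * D.T @ x + 2.0 * c  # gradient written as a vector shaped like c

# As 1-D arrays, x @ D and D.T @ x hold the same l numbers.
same_entries = np.allclose(x @ D, D.T @ x)
gradient_matches = np.allclose(numerical_gradient(objective, c), analytic, atol=1e-5)

c_star = D.T @ x  # the claimed minimizer
gradient_at_c_star = numerical_gradient(objective, c_star)

print(same_entries, gradient_matches, np.allclose(gradient_at_c_star, 0.0, atol=1e-5))
```

NumPy's 1-D arrays have no row or column orientation, so `x @ D` and `D.T @ x` come out as the same array, which is the "vectors qua vectors" point in code form.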