Why transpose in gradient descent?


I've been following Andrew Ng's Machine Learning course. In the gradient descent formula shown in the linked image there is no transpose sign.

However, in the implementation, the answer key shows this:

theta = theta - sum((h - y) .* X)'

why the transpose sign in the answer key?

Best answer:

In general, if you call

$$ {\bf x} = \pmatrix{x_1 \\ x_2 \\ \vdots \\ x_n} ~~~\mbox{and}~~~ {\bf y} = \pmatrix{y_1 \\ y_2 \\ \vdots \\ y_n} $$

then

$$ {\bf x}^T{\bf y} = \pmatrix{x_1 & x_2 &\cdots & x_n}\pmatrix{y_1 \\ y_2 \\ \vdots \\ y_n} = x_1y_1 + x_2y_2 + \cdots + x_n y_n = \sum_{k=1}^nx_k y_k $$
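The identity can be checked numerically. A minimal sketch in plain Python (the course itself uses Octave, but the arithmetic is the same):

```python
# x^T y: the matrix product of a row vector with a column vector
# equals the sum of componentwise products.
def dot(x, y):
    assert len(x) == len(y)
    return sum(xk * yk for xk, yk in zip(x, y))

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(dot(x, y))  # 1*4 + 2*5 + 3*6 = 32.0
```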

The difference is then just whether you write it with components (rightmost expression) or with vectors (leftmost expression); both are the same number. In the assignment code, `(h - y) .* X` multiplies each row of `X` by the corresponding residual, and `sum(...)` adds down the rows, producing a `1 x n` row vector whose entries are exactly these sums, one per feature. The transpose turns that row vector into an `n x 1` column vector so it matches the shape of `theta`. The course formula has no transpose because it is written component by component, where orientation doesn't matter.
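To see the two forms agree in the gradient-descent setting, here is a plain-Python sketch with hypothetical toy numbers (the course uses Octave): it compares the component form, summing `(h_i - y_i) * x_ij` over examples for each feature, with the vectorized form `X^T (h - y)`.

```python
# m = 3 examples, n = 2 features (hypothetical toy data)
X = [[1.0, 2.0],
     [1.0, 3.0],
     [1.0, 4.0]]
h = [0.5, 0.7, 0.9]   # predictions
y = [1.0, 0.0, 1.0]   # targets

m, n = len(X), len(X[0])
r = [h[i] - y[i] for i in range(m)]   # residuals h - y

# Component form: for each feature j, sum_i (h_i - y_i) * X[i][j]
comp = [sum(r[i] * X[i][j] for i in range(m)) for j in range(n)]

# Vector form: X^T (h - y), i.e. row j of X^T dotted with the residuals
vect = [sum(X[i][j] * r[i] for i in range(m)) for j in range(n)]

print(comp == vect)  # True: the transpose only lines the shapes up
```

Both expressions compute the same `n`-vector of sums; the transpose in the Octave one-liner exists purely so that the result is a column vector of the same orientation as `theta`.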