I have a problem with vectorization, specifically with the update below:
Repeat { $$\theta_j := \theta_j - \frac{\alpha}{m} \sum\limits_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)})x_j^{(i)}$$ }
Vectorized to:
$$\theta := \theta - \frac{\alpha}{m}X^T (g(X\theta)-\vec{y})$$
I can't seem to figure out how to go from the top equation to the bottom. More specifically, why does the expression $h_\theta(x^{(i)})$ change into $g(X\theta)$? Furthermore, I assume that the summation and $x_j^{(i)}$ are absorbed into $X^\top$.
Let H denote the hypothesis matrix,
$$ H_{m \times 1} := \begin{pmatrix} h_{\theta}\left(x^{(1)}\right) \\ h_{\theta}\left(x^{(2)}\right) \\ \vdots \\ h_{\theta}\left(x^{(m)}\right) \end{pmatrix} = g(X\theta) $$
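This is why $h_\theta(x^{(i)})$ becomes $g(X\theta)$: each row of $X$ is one example $x^{(i)\top}$, so $X\theta$ stacks all the inner products $\theta^\top x^{(i)}$ at once. A minimal NumPy sketch, with made-up data for $X$ and $\theta$:

```python
import numpy as np

def g(z):
    # Logistic (sigmoid) function, applied elementwise
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical design matrix: m = 3 examples, bias column plus one feature
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
theta = np.array([0.5, -0.25])

# H stacks h_theta(x^(i)) = g(theta^T x^(i)) for every example at once
H = g(X @ theta)

# Same values computed one example at a time
H_loop = np.array([g(X[i] @ theta) for i in range(X.shape[0])])
assert np.allclose(H, H_loop)
```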
Let E denote the error matrix,
$$ E_{m \times 1} := \begin{pmatrix} e_{1} \\ e_{2} \\ \vdots \\ e_{m} \end{pmatrix} = \begin{pmatrix} h_{\theta}\left(x^{(1)}\right) - y^{(1)} \\ h_{\theta}\left(x^{(2)}\right) - y^{(2)} \\ \vdots \\ h_{\theta}\left(x^{(m)}\right) - y^{(m)} \end{pmatrix} = H - \vec{y} $$
Let $ \delta $ denote the summation term,
$$ \begin{aligned} \delta_{j} &:= \frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)} \\ &= \frac{1}{m}\left(e_{1} x_{j}^{(1)}+e_{2} x_{j}^{(2)}+\ldots+e_{m} x_{j}^{(m)}\right) \\ &= \frac{1}{m} x_{j}^{\top} E \end{aligned} $$
where $x_j$ denotes the $j$-th column of $X$, i.e. the vector of the $j$-th feature across all $m$ examples. Stacking the $\delta_j$ for all $j$ then gives
$$ \begin{array}{c}{\delta:=\left(\begin{array}{c}{\delta_{0}} \\ {\delta_{1}} \\ {\vdots} \\ {\delta_{n}}\end{array}\right)=\frac{1}{m} X^{\top} E}\end{array} $$
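Stacking the per-coordinate dot products $x_j^\top E$ is exactly the matrix product $X^\top E$. A quick NumPy check with made-up numbers (the sizes and data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
X = rng.normal(size=(m, n))
E = rng.normal(size=m)  # stands in for the error vector H - y

# delta_j = (1/m) x_j^T E, computed one column of X at a time
delta_per_j = np.array([X[:, j] @ E / m for j in range(n)])

# All j at once: delta = (1/m) X^T E
delta = X.T @ E / m
assert np.allclose(delta_per_j, delta)
```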
$$ \theta := \theta - \alpha \delta $$
When you expand $ \delta $, you obtain the second equation. Note that this vectorised form applies to linear regression too, since it has the same gradient descent formula with a different hypothesis function. Thus for linear regression, simply substitute $ g(X\theta) $ with $ X\theta $.
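To tie the two forms together, here is a sketch that performs one gradient step both ways, element by element from the summation form and in one shot from the vectorized form, and checks that they agree (all data and sizes are made up for illustration):

```python
import numpy as np

def g(z):
    # Logistic (sigmoid) function
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n = 5, 3                      # hypothetical numbers of examples and parameters
X = rng.normal(size=(m, n))
y = rng.integers(0, 2, size=m).astype(float)
theta = rng.normal(size=n)
alpha = 0.1

# Element-wise update, straight from the summation form
# (note the simultaneous update: the hypothesis uses the old theta)
theta_loop = theta.copy()
for j in range(n):
    s = sum((g(X[i] @ theta) - y[i]) * X[i, j] for i in range(m))
    theta_loop[j] = theta[j] - (alpha / m) * s

# Vectorized update: theta := theta - (alpha/m) X^T (g(X theta) - y)
theta_vec = theta - (alpha / m) * X.T @ (g(X @ theta) - y)

assert np.allclose(theta_loop, theta_vec)
```

For linear regression, replacing `g(X @ theta)` with `X @ theta` in both updates gives the same agreement.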