I was reading the Wikipedia page on gradient descent (section: Solution of a non-linear system) when I came across this formula:
$\nabla F(\mathbf {x} ^{(0)})=J_{G}(\mathbf {x} ^{(0)})^{\mathrm {T} }G(\mathbf {x} ^{(0)})$
How did this equation come about? Sorry for my shaky calculus if this sounds stupid, but this is the only point in this section that I don't understand.
From the article:
$$ F({\bf x}) = \frac{1}{2}G^T({\bf x}) G({\bf x}) = \frac{1}{2}\sum_j G_j({\bf x}) G_j({\bf x}) \tag{1} $$
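Take the derivative with respect to $x_i$. By the product rule each term contributes $2\, G_j\, \partial G_j / \partial x_i$, which cancels the factor $\frac{1}{2}$:
\begin{eqnarray} \frac{\partial F}{\partial x_i} &=& \frac{1}{2}\sum_j \frac{\partial}{\partial x_i}\left[G_j({\bf x}) G_j({\bf x}) \right] = \sum_j \color{blue}{\frac{\partial G_j({\bf x})}{\partial x_i}} G_j({\bf x}) \\ &=& \sum_j \color{blue}{[J_G({\bf x})]_{ji}} G_j({\bf x}) = \sum_j [J_G({\bf x})^T]_{ij} G_j({\bf x}) = [J_{G}({\bf x})^T G({\bf x})]_i \tag{2} \end{eqnarray}
where the $(j,i)$-th component of the Jacobian matrix $J_G$ is defined in the usual way (rows index the components of $G$, columns index the variables):
$$ [J_G({\bf x})]_{ji} = \frac{\partial G_j({\bf x})}{\partial x_i} \tag{3} $$
Note that the sum in Eq. (2) runs over the row index of $J_G$, which is why the transpose appears. Eq. (2) gives the $i$-th component of the gradient. Putting all the components together, you get
$$ \nabla F({\bf x}) = J_G({\bf x})^T G({\bf x}) \tag{4} $$
which, evaluated at ${\bf x} = {\bf x}^{(0)}$, is the formula from the article.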
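If you want to convince yourself numerically, here is a minimal sketch in plain Python. It assumes the standard Jacobian convention $[J_G]_{ji} = \partial G_j / \partial x_i$, under which the gradient of $F = \frac{1}{2} G^T G$ is $J_G^T G$. The function `G` below is a made-up example (it is not the system from the article), chosen to map $\mathbb{R}^2 \to \mathbb{R}^3$ so that the Jacobian is not square and the transpose is needed for the shapes to match. The helper names are hypothetical.

```python
# Made-up example system G: R^2 -> R^3 (an assumption, not from the article).
def G(x):
    x0, x1 = x
    return [x0**2 + x1 - 1.0, x0 - x1**2, x0 * x1]

# Hand-derived Jacobian of G: row j holds the partial derivatives of G_j,
# i.e. jacobian(x)[j][i] = dG_j/dx_i (a 3x2 matrix here).
def jacobian(x):
    x0, x1 = x
    return [[2.0 * x0, 1.0],
            [1.0, -2.0 * x1],
            [x1, x0]]

# Objective F(x) = 1/2 * sum_j G_j(x)^2.
def F(x):
    return 0.5 * sum(g * g for g in G(x))

# Gradient from the formula: (J^T G)_i = sum_j J[j][i] * G_j.
def grad_via_jacobian(x):
    g = G(x)
    J = jacobian(x)
    return [sum(J[j][i] * g[j] for j in range(len(g))) for i in range(len(x))]

# Independent check: central finite differences applied directly to F.
def grad_finite_difference(x, h=1e-6):
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((F(x_plus) - F(x_minus)) / (2.0 * h))
    return grad

x = [0.7, -1.3]
analytic = grad_via_jacobian(x)
numeric = grad_finite_difference(x)
print("J^T G             :", analytic)
print("finite differences:", numeric)
```

The two printed vectors should agree to several decimal places. Note that the product without the transpose would not even be defined for this example, since $J_G$ is $3 \times 2$ and $G$ has 3 components.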