I have trouble understanding the proof that computes the Hessian of $J$ to check whether the optimisation problem is convex. Why is the least-squares cost function for linear regression convex?
The proof claims that the matrix $X^T X$ is positive semidefinite. It's obvious that the product is symmetric, but I am not able to see why it is positive semidefinite.
You can find the definition of positive semi-definiteness at
https://en.wikipedia.org/wiki/Definiteness_of_a_matrix
We have to prove that, for every vector $v$:
$v^T X^T X v \ge 0$
Notice that the above expression can be rewritten as:
$v^T X^T X v = (Xv)^T (Xv) = \| Xv \|_2^2$ (the squared Euclidean norm of $Xv$)
Since the squared Euclidean norm of a vector is a sum of squares, it is non-negative, and the result follows.
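If it helps, here is a small numerical sanity check of the argument (the matrix shape and random data are just illustrative assumptions): it verifies that $v^T X^T X v$ equals $\| Xv \|_2^2$ and is non-negative, and that the eigenvalues of $X^T X$ are non-negative, as positive semi-definiteness requires.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # arbitrary design matrix
A = X.T @ X                   # the Gram matrix from the proof

v = rng.normal(size=3)        # arbitrary test vector
quad = v @ A @ v              # the quadratic form v^T X^T X v

# v^T X^T X v = (Xv)^T (Xv) = ||Xv||_2^2, a sum of squares
assert np.isclose(quad, np.linalg.norm(X @ v) ** 2)
assert quad >= 0

# equivalently, every eigenvalue of a PSD symmetric matrix is >= 0
eigs = np.linalg.eigvalsh(A)
assert np.all(eigs >= -1e-12)  # tolerance for floating-point error
```

Running the same check for any choice of $X$ and $v$ will pass, since the identity holds for all real matrices, which is exactly the content of the proof.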