I'm taking the Coursera Machine Learning course, so anyone who has taken it may be able to help with this problem.
This is the Octave code that performs the gradient descent update:
theta = theta - alpha / m * ((X * theta - y)' * X)';  % this is the answer key provided
First question) As I understand gradient descent, theta(0) and theta(1) should each be updated separately, as follows:
theta(0) = theta(0) - alpha / m * ((X * theta(0) - y)')';  % my answer
theta(1) = theta(1) - alpha / m * ((X * theta(1) - y)')';  % my answer
But I'm not sure why the answer key shows only this one equation:
theta = theta - alpha / m * ((X * theta - y)' * X)';
Second question) What is the ' (apostrophe) doing in the Octave code?
theta = theta - alpha / m * ((X * theta - y)' * X)';
'* X)'  % what does the ' do here?
The ' is the transpose operator. It is used here so that the matrix dimensions agree for multiplication. Example: size of X = 97x2; y = 97x1; theta = 2x1.
The first calculation is X * theta, which gives a 97x1 matrix. Subtracting y works because both matrices are 97x1. Next, this result has to be multiplied with X, but the sizes do not match: (97x1) * (97x2).
Transposing the first matrix to 1x97 makes the multiplication possible. The result is a new matrix of size 1x2 (a row vector). But theta is of size 2x1 (a column vector), hence the final transpose.
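As for the first question, the single vectorized line appears to update both parameters at once, because each column of X contributes one entry of the 1x2 result. Below is a minimal sketch that checks the shapes and compares the vectorized update with a per-parameter loop. The toy data (4 examples instead of 97) and the names err, grad_row, theta_vec and theta_loop are made up for illustration and are not from the course. Note that Octave indices start at 1, so the course's theta(0) and theta(1) correspond to theta(1) and theta(2) in code.

```octave
% Hypothetical toy data (not from the course): m = 4 examples, 2 columns
X = [1 1; 1 2; 1 3; 1 4];   % first column of ones is the intercept term
y = [2; 4; 6; 8];
theta = [0; 0];             % 2x1 column vector
alpha = 0.1;
m = length(y);

err = X * theta - y;        % (4x2) * (2x1) - (4x1) -> 4x1
grad_row = err' * X;        % (1x4) * (4x2) -> 1x2 row vector
disp(size(err));
disp(size(grad_row));

% Vectorized update, same form as the answer key
theta_vec = theta - alpha / m * (err' * X)';

% Same update written one parameter at a time
theta_loop = zeros(2, 1);
for j = 1:2
  % multiply each error by the j-th feature, then sum over all examples
  theta_loop(j) = theta(j) - alpha / m * sum(err .* X(:, j));
end

% Largest difference between the two results
disp(max(abs(theta_vec - theta_loop)));
```

In the loop, every parameter uses the same error vector X * theta - y, computed from the full theta. The only thing that changes per parameter is the column of X it is multiplied by, which is exactly what the matrix product err' * X does in one step.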