How to solve this partial derivative equation?


Using the following model

$$ h_\theta(X)=\theta^TX, $$ where $\theta$ is a vector of parameters.

The cost function is $$ J(\theta)=\dfrac{1}{2m}\sum_{i=1}^{m}(h_\theta(X^i)-y^i)^2 $$ Now, by setting $$ \dfrac{\delta J(\theta)}{\delta\theta} = 0, $$

show that

$$ \theta=(X^TX)^{-1}X^Ty $$

Can anyone show me how to derive this?


This is simply the standard solution to a linear least squares problem, the solution to the normal equations.

Let $X\in\mathbb{R}^{m\times n}$ be the matrix whose rows are the samples, $y\in \mathbb{R}^{m\times 1}$, and $\Theta\in\mathbb{R}^{n\times 1}$. The cost (energy) is $$ J(\Theta) = \frac{1}{2m}\|X\Theta - y\|_2^2=\frac{1}{2m}\left[ \Theta^TX^TX\Theta - 2y^TX\Theta + y^Ty \right] $$

The $y^Ty$ term does not depend on $\Theta$, so the first variation is $$ \frac{\delta J}{\delta \Theta} = \frac{1}{2m} \left[ \frac{\delta}{\delta \Theta}(\Theta^TX^TX\Theta) - 2 \frac{\delta}{\delta \Theta}(y^TX\Theta) \right] $$

Let's compute these inner vector derivatives using components, writing each gradient as a row vector: \begin{align} \frac{\partial}{\partial \Theta_j}(y^TX\Theta) &= \frac{\partial}{\partial \Theta_j}\sum_i \sum_k y_iX_{ik}\Theta_k=\sum_i y_iX_{ij}\\ \therefore \frac{\delta}{\delta \Theta}(y^TX\Theta) &= y^TX\\ \frac{\partial}{\partial \Theta_j}(\Theta^TX^TX\Theta) &= \frac{\partial}{\partial \Theta_j}\sum_k\left[ \sum_i X_{ki}\Theta_i \right]^2\\ &= \sum_k 2\left[ \sum_i X_{ki}\Theta_i \right] X_{kj}\\ &=2\sum_k [X\Theta]_k X_{kj}\\ &= 2(X\Theta)^TX^j\\ \therefore \frac{\delta}{\delta \Theta}(\Theta^TX^TX\Theta) &= 2(X\Theta)^TX \end{align}

Here $X^j$ denotes the $j$-th column of $X$.

Now equate the variation to zero: $$ \frac{\delta J}{\delta \Theta} =\frac{1}{2m} \left[ 2(X\Theta)^TX - 2 y^TX \right] =0 $$

From here the equation is straightforward to solve. Multiplying by $m$ and using $(X\Theta)^T=\Theta^TX^T$ gives $$ \Theta^TX^TX=y^TX $$ Transposing both sides yields the normal equations $$ X^TX\Theta = X^Ty $$ and, provided $X^TX$ is invertible (that is, $X$ has full column rank), $$ \Theta = (X^TX)^{-1}X^Ty $$
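As a quick sanity check on the derivation, here is a minimal sketch in plain Python that solves the normal equations $X^TX\Theta = X^Ty$ and confirms that the gradient $\frac{1}{m}X^T(X\Theta - y)$ vanishes at the result. The helper names (`transpose`, `matmul`, `solve`, `normal_equation`, `gradient`) and the toy data are made up for illustration and are not part of the question. The sketch assumes $X^TX$ is invertible.

```python
def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Plain matrix product A @ B for lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is square and invertible (illustrative helper only).
    """
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def normal_equation(X, y):
    """Return theta solving X^T X theta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


def gradient(X, y, theta):
    """Gradient of J(theta) = ||X theta - y||^2 / (2m), i.e. X^T (X theta - y) / m."""
    m = len(X)
    residual = [sum(x * t for x, t in zip(row, theta)) - yi
                for row, yi in zip(X, y)]
    return [sum(x * r for x, r in zip(col, residual)) / m
            for col in transpose(X)]


# Toy data (made up): each row is (1, x), so theta = (intercept, slope),
# and y = 1 + 2x holds exactly.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

theta = normal_equation(X, y)
grad = gradient(X, y, theta)
print("theta:", theta)
print("gradient at theta:", grad)
```

Because the toy data lie exactly on the line $y = 1 + 2x$, `theta` should come out close to `[1.0, 2.0]` and every gradient component should be zero up to rounding. For real data, a library least-squares routine such as `numpy.linalg.lstsq` is usually a better choice than solving the normal equations directly, since forming $X^TX$ squares the condition number of the problem.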