I understand gradient descent for $z=wx+b$. But how do I implement the derivatives with respect to $w$ and $b$ in Python? I have seen examples like
```python
derivative_weight = np.dot(x_train, (y_head - y_train).T) / x_train.shape[1]
derivative_bias = np.sum(y_head - y_train) / x_train.shape[1]
```
which in mathematical notation correspond to
$\frac{\partial J}{\partial w}=\frac{1}{m}x(y_{head}-y)^T$
$\frac{\partial J}{\partial b}=\frac{1}{m}\sum_{i=1}^{m}(y_{head}^{(i)}-y^{(i)})$
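For context, here is a sketch of the chain-rule steps I believe are involved, assuming a sigmoid activation $y_{head}=\sigma(z)$ with $z=w^Tx+b$ and the cross-entropy cost:

$J=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log y_{head}^{(i)}+(1-y^{(i)})\log(1-y_{head}^{(i)})\right]$

Using $\sigma'(z)=\sigma(z)(1-\sigma(z))$, the per-sample derivative simplifies to

$\frac{\partial J}{\partial z^{(i)}}=\frac{1}{m}(y_{head}^{(i)}-y^{(i)})$

and then $\frac{\partial J}{\partial w}=\sum_{i}x^{(i)}\frac{\partial J}{\partial z^{(i)}}$ and $\frac{\partial J}{\partial b}=\sum_{i}\frac{\partial J}{\partial z^{(i)}}$ should give the two formulas above, but I am not sure this is the intended derivation.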
How are these formulas derived? Any help would be appreciated.
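Here is a minimal runnable sketch of what I think the code computes, assuming logistic regression with cross-entropy loss, `x_train` of shape `(n_features, m)` with samples as columns, and `y_train` of shape `(1, m)` (the shapes and random data are my own assumptions, not from the original example). It also checks the bias gradient numerically with finite differences:

```python
import numpy as np

np.random.seed(0)
m = 5                                           # number of samples (assumed)
x_train = np.random.randn(3, m)                 # features as rows, samples as columns
y_train = np.random.randint(0, 2, (1, m)).astype(float)
w = np.zeros((3, 1))                            # weights
b = 0.0                                         # bias

z = np.dot(w.T, x_train) + b                    # (1, m)
y_head = 1.0 / (1.0 + np.exp(-z))               # sigmoid predictions

# The gradient formulas from the question
derivative_weight = np.dot(x_train, (y_head - y_train).T) / x_train.shape[1]  # (3, 1)
derivative_bias = np.sum(y_head - y_train) / x_train.shape[1]                 # scalar

# Numerical check: J = -(1/m) * sum(y*log(y_head) + (1-y)*log(1-y_head))
def cost(b_val):
    p = 1.0 / (1.0 + np.exp(-(np.dot(w.T, x_train) + b_val)))
    return -np.mean(y_train * np.log(p) + (1 - y_train) * np.log(1 - p))

eps = 1e-6
numeric_bias_grad = (cost(b + eps) - cost(b - eps)) / (2 * eps)
print(abs(numeric_bias_grad - derivative_bias) < 1e-6)  # analytic matches numeric
```

The finite-difference check agreeing with `derivative_bias` is what convinces me the formula is the gradient of the cross-entropy cost, but I would still like to see the algebraic derivation.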