We want to find the line $y = mx + c$ that best "fits" the list of points $$(x_1, y_1), (x_2, y_2), \dots, (x_i, y_i), \dots, (x_n, y_n).$$ For each point there is no uncertainty in $x$, and each $y_i$ has uncertainty $\sigma_i$.
By minimizing the sum of squared differences
$$\sum_i [y_i - (mx_i + c)]^2$$
first with respect to $m$ and then with respect to $c$, and solving the resulting system of equations, we can find $m$ and $c$. This I understand.
My doubt is: how can we find the uncertainties of $m$ and $c$?
My textbook just says that we can sum the squared partial derivatives of $m$ with respect to each $y_i$, multiplied by $\sigma_i^2$ (the squared uncertainty of the given $y_i$), and it performs the calculation in just one line.
Could anyone explain in more detail how to calculate such errors? I think seeing an example with 3 points would help me understand the concept without the trouble of too much notation.
Here is the calculation from my book that gives me trouble:
$$\operatorname{Var}[m] = \sum \left(\frac{\partial m}{\partial Y_i}\right)^2 \sigma_{Y_i}^2 =$$ $$= \left(\frac{x_i}{\sigma_{Y_i}^2} - \frac{\overline{x}}{\sigma_{Y_i}^2}\right)^2 \cdot \frac{1}{\operatorname{Var}[x]^2} \cdot \frac{1}{\sum \frac{1}{\sigma_{Y_i}^2}} =$$ $$= \frac{1}{\operatorname{Var}[x] \cdot \sum \frac{1}{\sigma_{Y_i}^2}}$$
At the start of the second line, my book uses $x_i$ outside of any summation symbol; I think that is an error.



Let $X$ denote the matrix with 1s in the first column and $x_i$ in the second column. Then $(c, m) = (X^TX)^{-1}X^Ty$. Let $r$ denote the second row of $(X^TX)^{-1}X^T$ (the row corresponding to the slope $m$); then $$m = r^T y = \sum_i r_i y_i.$$ Assuming the uncertainties of the $y_i$ are uncorrelated, you get $$\sigma^2(m) = \sum_i r_i^2 \sigma^2(y_i).$$ You can do the same for $c$ using the first row of $(X^TX)^{-1}X^T$.
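Since you asked for a 3-point example: here is a minimal numerical sketch of the matrix recipe above, with made-up values for $x$, $y$, and $\sigma_i$. Note this propagates the $\sigma_i$ through the ordinary (unweighted) least-squares estimator; your book's formula additionally weights each point by $1/\sigma_i^2$, which gives a different (smaller-variance) estimator when the $\sigma_i$ differ.

```python
import numpy as np

# Hypothetical 3-point data set (values chosen for illustration only)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])
sigma = np.array([0.1, 0.2, 0.1])  # uncertainty of each y_i

# Design matrix: 1s in the first column, x_i in the second,
# so the coefficient vector is (c, m).
X = np.column_stack([np.ones_like(x), x])

# R = (X^T X)^{-1} X^T; each row of R expresses one coefficient
# as a linear combination of the y_i.
R = np.linalg.inv(X.T @ X) @ X.T
c, m = R @ y

# Uncorrelated y_i: Var[coefficient] = sum_i R_ij^2 * sigma_i^2
var_c, var_m = (R ** 2) @ sigma ** 2

print(f"m = {m:.3f} +/- {np.sqrt(var_m):.4f}")
print(f"c = {c:.3f} +/- {np.sqrt(var_c):.4f}")
```

For these numbers the second row of $R$ works out to $(-1/2,\ 0,\ 1/2)$, so the middle point does not affect the slope at all, and $\sigma^2(m) = \tfrac14\sigma_1^2 + \tfrac14\sigma_3^2$: with only three points, the propagation is small enough to check by hand.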