Minimizing a summation?

4.1k Views Asked by At

I have absolutely no idea how to approach this problem. I've been looking through notes, and I think I missed this when my professor discussed this in class.

$$ \text{Consider the data}\\ i\: x_i\: y_i\\ 1\:2\:1\\ 2\:3\:2\\ 3\:3\:3\\ 4\:4\:6\\ 5\:5\:5\\ \text{As discussed in class, compute the line } y=p(x)mx+b \text{ that minimizes}\\ F(m,b) = \sum_{i=0}^{5}(y_i-(mx_i+b))^2 \\ \text{You find that}\\ m=\text{____}\\ b=\text{____}\\ F(m,b)=\text{____} $$

We've been working on maximizing and minimizing functions by both partial derivatives testing for critical points and using lagrange multipliers to find max and min values, but I have no idea what to do with this to get my initial equations to work with. Can someone give me some kind of hint or instructions on what to do?

2

There are 2 best solutions below

0
On BEST ANSWER

I found where it was in my notes. We went over it very briefly, so it was a small section. Here's what you need to do.

$$ \frac{\partial{f}}{\partial{m}} = \sum_{i=1}^{5}2(y_i-mx_i-b)*(-x_i)\\ = -2(\sum x_iy_i-m\sum x_i^2-b \sum 1) = 0\\ \frac{\partial{f}}{\partial{b}} = -2(\sum y_i -mx_i-b)\\ -2(\sum y_i -m\sum x_i-b\sum i) = 0\\ \text{now we do some algebraic manipulations}\\ m\sum x_i^2 + b\sum x_i = \sum x_iy_i\\ m\sum x_i+b\sum1=\sum y_i\\ \text{and now we can solve for parts of these equations}\\ \sum_{i = 1}^{5} 1=5\\ \sum x_i = 17\\ \sum x_i^2 = 63\\ \sum y_i = 17 \\ \sum x_iy_i = 66\\ \text{We now get two equations with two variables we can solve for}\\ Eqn 1: 63m + 17b = 66\\ Eqn 2: 17m + 5b = 17\\ \text{Solve for b in Eqn 2 yields}\\ b = \frac{17}{5}(1-m)\\ \text{Plug into Eqn 1 and solve for m}\\ m = \frac{41}{26} \\ \text{Which makes b}\\ b = \frac{17}{5}-\frac{17*41}{5*26} $$

For the final portion, $F(m,b)$ you just need to plug in all the values and do the addition.

2
On

Let our best fit line be described as $mx + b$

We would hope that we could solve for $m$ and $b$ without any contradictions using the system of equations:

$\begin{cases} 2m+b = 1\\ 3m + b = 2\\ 3m + b = 3\\ 4m + b = 6\\ 5m + b = 5\end{cases}$

Which can be expressed as the following matrix equation:

$\begin{bmatrix} 2 & 1\\3&1\\3&1\\4&1\\5&1\end{bmatrix}\begin{bmatrix}m\\b\end{bmatrix} = \begin{bmatrix}1\\2\\3\\6\\5\end{bmatrix}$

You will unfortunately find that this system is inconsistent and no exact solution can be found ($3m+b=2$ and $3m+b=3$ simultaneously is impossible). However, a least squares fit can be found. To do so, multiply on the left by the transpose of the first matrix.

$\begin{bmatrix}2&3&3&4&5\\1&1&1&1&1\end{bmatrix}\begin{bmatrix}2&1\\3&1\\3&1\\4&1\\5&1\end{bmatrix}\begin{bmatrix}m^*\\b^*\end{bmatrix} = \begin{bmatrix}2&3&3&4&5\\1&1&1&1&1\end{bmatrix}\begin{bmatrix}1\\2\\3\\6\\5\end{bmatrix}$

This system, in the form $A^TAx=A^Tb$, will be consistent and the solution (or solution set in the case of infinitely many solutions) will be the best fit line. Solve the system either by row-reduction, or if $A^TA$ is invertible as $x=(A^TA)^{-1}A^Tb$, or using your other favorite technique.

(Note: it is as mentioned possible for infinitely many possibilities to occur, in particular when $(A^TA)$ is non-invertible.)

In this case it will be invertible, so we take the easy route and complete the problem as:

$\begin{bmatrix}m^*\\b^*\end{bmatrix}=\left(\begin{bmatrix}2&3&3&4&5\\1&1&1&1&1\end{bmatrix}\begin{bmatrix}2&1\\3&1\\3&1\\4&1\\5&1\end{bmatrix}\right)^{-1}\begin{bmatrix}2&3&3&4&5\\1&1&1&1&1\end{bmatrix}\begin{bmatrix}1\\2\\3\\6\\5\end{bmatrix}$

$=\begin{bmatrix}\frac{5}{26}&-\frac{17}{26}\\ -\frac{17}{26}&\frac{63}{26}\end{bmatrix}\begin{bmatrix}66\\17\end{bmatrix}=\begin{bmatrix}\frac{41}{26}\\-\frac{51}{26}\end{bmatrix}$

Giving us that our best fitting line will be of the form $\frac{41}{26}x-\frac{51}{26}$


Alternatively, approaching via another method, expand $F(m,b)$ and try to calculate its minimum value using derivatives.

$F(m,b)=(1-2m-b)^2 + (2-3m-b)^2 + (3-3m-b)^2 + (6-4m-b)^2 + (5-5m-b)^2$

$=5b^2+34bm-34b+63m^2-132m+75$

We try to calculate critical points where $F_{m}=F_{b}=0$, and test $D=F_{m,m}F_{b,b} - F_{m,b}^2$ to see if they are maximums, minimums, or saddle points.

$F_m = 34b + 126m - 132$

$F_b = 10b + 34m - 34$

Setting each equal to zero, we see that each forms a line in $\mathbb{R}^2$ (and they are not parallel) and so there will be only one critical point. Find the intersection of the line and verify that it is indeed a minimum.

These intersect at the point $m=\frac{41}{26}, b=-\frac{51}{26}$. After some tedious arithmetic, we confirm that $D>0$ and $F_{mm}>0$, implying that this point is indeed a minimum for the multivariable function $F(m,b)$.

Giving us that our best fitting line will be of the form $\frac{41}{26}x-\frac{51}{26}$


(double checked, both methods did in fact yield the same answer)