I need to fit straight lines sharing three parameters to a set of data. An example data set is displayed here. (The actual values of the data don't matter so much as the algorithm to solve the problem, since I need to apply this method to multiple data sets.)
The form of the fit must be:
- For the circles: $y = ax + b$
- For the squares: $y = ax + b + c$
- For the triangles: $y = ax + b - c$
What is the best way to perform this linear regression with the added $c$ constraint?
As iiivooo commented, you need to add a categorical variable to describe the full data set.
The problem statement almost tells you what to do: the data points being $(x_i,y_i)$, add to each data point a variable $z_i$ such that $z_i=0$ if the data point corresponds to a circle, $z_i=+1$ if it corresponds to a square, and $z_i=-1$ if it corresponds to a triangle.
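As a minimal sketch of this encoding in Python: the shape labels (`"circle"`, `"square"`, `"triangle"`), the names `Z_CODE` and `encode_shapes`, and the sample points are all assumptions made up for illustration; none of them come from the question.

```python
# Hypothetical mapping from shape label to the indicator z:
# 0 for circles, +1 for squares, -1 for triangles.
Z_CODE = {"circle": 0, "square": 1, "triangle": -1}

def encode_shapes(points):
    """Turn (x, y, shape) records into (x, y, z) triples."""
    return [(x, y, Z_CODE[shape]) for x, y, shape in points]

# Made-up records, only to show what the encoding produces.
sample = [(0.0, 1.0, "circle"), (1.0, 3.5, "square"), (2.0, 4.5, "triangle")]
encoded = encode_shapes(sample)
print(encoded)
```

An unknown label raises a `KeyError`, which is probably what you want when a data set contains a shape the model does not cover.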
So, the data points are now $(x_i,y_i,z_i)$ and you need to perform the multilinear regression $$y=a +bx+cz$$ Note that in this form $a$ is the intercept and $b$ is the slope, so the roles of $a$ and $b$ are swapped relative to the question. If you use matrix calculations, the problem is simple. If you prefer the so-called normal equations, they are $$\sum_{i=1}^n y_i=na+b\sum_{i=1}^n x_i+c\sum_{i=1}^n z_i$$ $$\sum_{i=1}^n x_iy_i=a\sum_{i=1}^n x_i+b\sum_{i=1}^n x_i^2+c\sum_{i=1}^n x_iz_i$$ $$\sum_{i=1}^n z_iy_i=a\sum_{i=1}^n z_i+b\sum_{i=1}^n x_i z_i+c\sum_{i=1}^n z_i^2$$ which you have to solve for $a,b,c$.
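The three normal equations form a $3\times 3$ linear system in $(a,b,c)$, so one way to solve them is plain Gaussian elimination. The following is only a sketch under stated assumptions: the function name `fit_normal_equations`, the singularity tolerance `1e-12`, and the noiseless synthetic data (generated from $a=1$, $b=2$, $c=0.5$) are illustrative choices, not part of the original answer.

```python
def fit_normal_equations(data):
    """Least-squares fit of y = a + b*x + c*z from (x, y, z) triples."""
    n = len(data)
    Sx = sum(x for x, _, _ in data)
    Sy = sum(y for _, y, _ in data)
    Sz = sum(z for _, _, z in data)
    Sxx = sum(x * x for x, _, _ in data)
    Sxy = sum(x * y for x, y, _ in data)
    Sxz = sum(x * z for x, _, z in data)
    Szy = sum(z * y for _, y, z in data)
    Szz = sum(z * z for _, _, z in data)

    # Augmented matrix of the normal equations, unknowns ordered (a, b, c).
    M = [
        [n, Sx, Sz, Sy],
        [Sx, Sxx, Sxz, Sxy],
        [Sz, Sxz, Szz, Szy],
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:  # crude tolerance, an assumption
            raise ValueError("normal equations are singular for this data")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]

    return tuple(M[i][3] / M[i][i] for i in range(3))

# Synthetic, noiseless data from y = 1 + 2x + 0.5z (made up for the check).
data = [(x, 1 + 2 * x + 0.5 * z, z)
        for x, z in [(0, 0), (1, 1), (2, -1), (3, 0), (4, 1), (5, -1)]]
a, b, c = fit_normal_equations(data)
print(a, b, c)
```

On noiseless data the fit should recover the generating parameters up to rounding. The system becomes singular when, for example, only one shape is present, since $c$ cannot then be separated from the intercept.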
Changing notation to $\text{Sx}=\sum x_i$, $\text{Sxx}=\sum x_i^2$, $\text{Sxy}=\sum x_iy_i$ and so on (all sums over $i=1,\dots,n$), the solutions are:
$$c=\frac{-n \,\text{Sxx}\, \text{Szy}+n \,\text{Sxy}\, \text{Sxz}+\text{Sx}^2\, \text{Szy}-\text{Sx}\,\text{Sxy}\, \text{Sz}-\text{Sx}\, \text{Sxz}\, \text{Sy}+\text{Sxx}\, \text{Sy}\, \text{Sz}}{-n \,\text{Sxx} \,\text{Szz}+n \, \text{Sxz}^2+\text{Sx}^2\, \text{Szz}-2 \text{Sx}\, \text{Sxz} \,\text{Sz}+\text{Sxx}\, \text{Sz}^2}$$ $$b=\frac{c \,n \,\text{Sxz}-c \,\text{Sx} \,\text{Sz}-n\, \text{Sxy}+\text{Sx}\, \text{Sy}}{\text{Sx}^2-n\, \text{Sxx}}$$ $$a=\frac{-b\, \text{Sx}-c\, \text{Sz}+\text{Sy}}{n}$$
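The closed-form expressions above translate directly into code. The sketch below evaluates them in the stated order ($c$, then $b$, then $a$) and maps the result back to the three lines of the question; the function name `fit_closed_form` and the synthetic data (noiseless, generated from $a=1$, $b=2$, $c=0.5$) are assumptions for illustration only.

```python
def fit_closed_form(data):
    """Evaluate the closed-form solution of y = a + b*x + c*z from (x, y, z) triples."""
    n = len(data)
    Sx = sum(x for x, _, _ in data)
    Sy = sum(y for _, y, _ in data)
    Sz = sum(z for _, _, z in data)
    Sxx = sum(x * x for x, _, _ in data)
    Sxy = sum(x * y for x, y, _ in data)
    Sxz = sum(x * z for x, _, z in data)
    Szy = sum(z * y for _, y, z in data)
    Szz = sum(z * z for _, _, z in data)

    c_num = (-n * Sxx * Szy + n * Sxy * Sxz + Sx**2 * Szy
             - Sx * Sxy * Sz - Sx * Sxz * Sy + Sxx * Sy * Sz)
    c_den = (-n * Sxx * Szz + n * Sxz**2 + Sx**2 * Szz
             - 2 * Sx * Sxz * Sz + Sxx * Sz**2)
    c = c_num / c_den
    b = (c * n * Sxz - c * Sx * Sz - n * Sxy + Sx * Sy) / (Sx**2 - n * Sxx)
    a = (-b * Sx - c * Sz + Sy) / n
    return a, b, c

# Synthetic, noiseless data from y = 1 + 2x + 0.5z (made up for the check).
data = [(x, 1 + 2 * x + 0.5 * z, z)
        for x, z in [(0, 0), (1, 1), (2, -1), (3, 0), (4, 1), (5, -1)]]
a, b, c = fit_closed_form(data)

# Back to the question's three lines: common slope b, intercepts a, a + c, a - c.
lines = {
    "circle": (b, a),
    "square": (b, a + c),
    "triangle": (b, a - c),
}
print(lines)
```

A zero denominator in `c` or `b` raises `ZeroDivisionError`; that happens when the data cannot identify all three parameters, for instance when every point has the same $x$.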