I am trying to use gradient descent to find two straight lines that fit a discrete set of 2D points. One obvious approach is a brute-force search (which I have already implemented), but I am trying to improve the performance of the code; really, at this point, I am trying to better understand the underlying math.
My cost function is defined as follows:
$$ J(k) = \frac{1}{k}\sum_{i=1}^{k}(pred_i - y_i)^2+\frac{1}{n-k}\sum_{i=k}^{n}(pred_i - y_i)^2 $$
I want to find the point $k$ where the two lines are connected; the next step would be to take the derivative of the cost function, but the parameter $k$ appears in the summation limits. In this formulation, $pred_i$ comes from two linear models: the first line is fit to points $1$ through $k$, and the second to points $k$ through $n$.
Is this even possible? Any ideas? Some preliminary tests showed that this cost function is not convex in $k$ (at least for some discrete sets of points).
You want to minimize $$J(k)=\frac 1 k \sum_{i=1}^k(\hat y_i-y_i )^2+\frac 1{n-k}\sum_{i=k}^n(\hat y_i-y_i )^2$$ If you use ordinary least squares for straight lines, you have the symbolic values for each of the sums.
Now, use any integer programming solver to find $k$.
I have been working on this kind of problem for years and have never faced any serious difficulty.
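A minimal sketch of this approach in Python, assuming NumPy is available (`np.polyfit` stands in for the symbolic OLS solution of each sum; function names here are illustrative, not from the original code):

```python
import numpy as np

def cost(k, x, y):
    """J(k): fit one OLS line to points 1..k and another to points k..n
    (1-based, sharing point k as in the question), then combine the
    normalized sums of squared residuals."""
    n = len(x)
    # first segment: points 1..k (indices 0..k-1)
    a1, b1 = np.polyfit(x[:k], y[:k], 1)
    # second segment: points k..n (indices k-1..n-1)
    a2, b2 = np.polyfit(x[k - 1:], y[k - 1:], 1)
    r1 = (a1 * x[:k] + b1) - y[:k]
    r2 = (a2 * x[k - 1:] + b2) - y[k - 1:]
    # normalizations 1/k and 1/(n-k) taken directly from the formula
    return r1 @ r1 / k + r2 @ r2 / (n - k)

def best_breakpoint(x, y):
    """Exhaustive search over the integer breakpoint k
    (each segment needs at least two points for a line fit)."""
    n = len(x)
    return min(range(2, n), key=lambda k: cost(k, x, y))
```

Since there are only $n-3$ admissible values of $k$ and each OLS fit is cheap (or can be updated incrementally as $k$ grows), this exhaustive integer search is usually fast enough in practice, and it sidesteps the non-convexity entirely.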