I am a mathematics noob with a data science problem, and I have been having trouble understanding enough about curve fitting to even be able to search for a solution.
I am trying to find a curve that best fits a data set comprising coordinates with overlapping X values. I have about 5 million data points of the following type:
{{0, 1}, {0, 2}, {1, 4}, {2, 8}, ...}
The first value of these data points represents a timestamp in milliseconds, and can recur in the data set many times. How would I go about finding a curve for this type of data set?
The first step is to plot the data and guess the form of the time dependence (there may be some physical meaning behind it as well). If you post the plot here and explain what the data means, someone may be able to give you a hint.
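With about 5 million points and many repeated timestamps, a raw scatter plot may be hard to read. One possible way to see the trend before plotting is to average the $y$ values at each timestamp. The sketch below is only an illustration: the helper name `mean_by_timestamp` is made up for this example, and the sample points are the four from the question, not the real data set.

```python
from collections import defaultdict


def mean_by_timestamp(points):
    """Return sorted (x, mean of y) pairs, one per distinct x value."""
    sums = defaultdict(float)   # running sum of y for each x
    counts = defaultdict(int)   # number of points seen for each x
    for x, y in points:
        sums[x] += y
        counts[x] += 1
    return sorted((x, sums[x] / counts[x]) for x in sums)


# Sample points from the question; x = 0 occurs twice.
points = [(0, 1), (0, 2), (1, 4), (2, 8)]
summary = mean_by_timestamp(points)
for x, mean_y in summary:
    print(x, mean_y)
```

The averaged points are only meant for looking at the shape of the curve. The fit itself can still use every raw point.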
Your function $f$ has some parameters $P_j$, with $j$ from 1 to $n$. If you use a least-squares approach, all you need to do is minimize $\sum_{i=1}^N (y_i-f(x_i,P_1,P_2,\ldots,P_n))^2$ with respect to $P_1, P_2, \ldots, P_n$, where $N$ is the number of data points. It does not matter that you have multiple $y$ values for the same $x$: each data point simply contributes its own term to the sum.
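As a concrete instance of this minimization, here is a minimal sketch that assumes the simplest possible model, a straight line $f(x, P_1, P_2) = P_1 + P_2 x$. The straight line is an assumption made for illustration; your data may well need a different $f$. For this model the minimum of the sum of squares has a closed form, so no numerical optimizer is needed. The function names `fit_line` and `sum_squared_residuals` are hypothetical, and the points are the four from the question.

```python
def fit_line(points):
    """Least-squares fit of f(x, P1, P2) = P1 + P2 * x to (x, y) pairs.

    Repeated x values need no special handling: every point adds
    its own term to the sums below.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    p2 = (n * sxy - sx * sy) / denom  # slope
    p1 = (sy - p2 * sx) / n           # intercept
    return p1, p2


def sum_squared_residuals(points, p1, p2):
    """The quantity being minimized: sum of (y_i - f(x_i, P1, P2))^2."""
    return sum((y - (p1 + p2 * x)) ** 2 for x, y in points)


# Sample points from the question; x = 0 occurs twice.
points = [(0, 1), (0, 2), (1, 4), (2, 8)]
p1, p2 = fit_line(points)
print(p1, p2, sum_squared_residuals(points, p1, p2))
```

For a model that is not linear in its parameters, the same sum of squares would be minimized numerically instead, but the handling of repeated $x$ values stays the same.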