Given the following:
Data $(x_{data}, y_{data})$, where the same $x$ value may be measured multiple times.
For example,
$$ x_{data} = ( 1, 2, 2, 3,3,3, 4 ) $$
$$ y_{data} = ( 8, 10, 9, 1, 2, 1.5, -7 ) $$
And a fitting function that is linear in the fit parameters.
For example, $$f(x) = a + b \cos(x) + c \sin(x)$$
This function is not necessarily linear in the variable $x$ (i.e., it is not necessarily of the form $g(x) = m x + c$).
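To make "linear in the fit parameters" concrete, here is one way to write it. The basis functions $\phi_j$ and the parameter vector $\beta$ are my own notation, introduced only for illustration:

$$ f(x) = \sum_{j=1}^{p} \beta_j \, \phi_j(x) $$

For the example above, $p = 3$, $\beta = (a, b, c)$, and $(\phi_1, \phi_2, \phi_3) = (1, \cos(x), \sin(x))$. Each $\phi_j$ can be any fixed function of $x$; only the dependence on $\beta$ has to be linear.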
I could perform the fit in a few ways:
1. Directly on the data $(x_{data},y_{data})$.
2. On the average of the $y$ values for each distinct $x$. In the example: $$x_{avg} = (1,2,3,4)$$ and $$y_{avg} = (8,9.5,1.5,-7)$$
3. On the averaged data, repeated so that the number of points is unchanged: $x_{data}$ and $$y_{avgSpaced} = (8,9.5,9.5,1.5,1.5,1.5,-7)$$
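For reference, here is a minimal sketch of how the averaged data sets for the second and third approaches could be built from the example data. It assumes NumPy; the variable names (`inverse`, `counts`, `y_avg_spaced`) are my own.

```python
import numpy as np

x_data = np.array([1, 2, 2, 3, 3, 3, 4], dtype=float)
y_data = np.array([8, 10, 9, 1, 2, 1.5, -7], dtype=float)

# Distinct x values, the group index of every data point, and the group sizes.
x_avg, inverse, counts = np.unique(
    x_data, return_inverse=True, return_counts=True
)

# Mean of y within each group: per-group sum divided by per-group count.
y_avg = np.bincount(inverse, weights=y_data) / counts

# Same length as the original data: every point is replaced by its group mean.
y_avg_spaced = y_avg[inverse]

print(x_avg, y_avg)
print(x_data, y_avg_spaced)
```

The `counts` array records how many measurements went into each average, which is the information that plain averaging throws away.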
My Guess
I feel like method 2 will result in a different fit from methods 1 and 3, because the weighting of the points is different. In the example, $x=3$ matters less under method 2 than under method 1 or 3, since its three measurements collapse into a single point. However, I imagine that a weighted least-squares regression (for instance, weighting each averaged point by its number of measurements) would make method 2 equivalent to method 3. And my intuition tells me that method 3 should be equivalent to method 1.
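One way I could test this guess numerically is sketched below. It assumes NumPy and ordinary least squares via `np.linalg.lstsq`; the helpers `design` and `fit` are hypothetical names of mine, and the weights are assumed to be the number of measurements at each $x$.

```python
import numpy as np

x_data = np.array([1, 2, 2, 3, 3, 3, 4], dtype=float)
y_data = np.array([8, 10, 9, 1, 2, 1.5, -7], dtype=float)


def design(x):
    """Design matrix with columns 1, cos(x), sin(x)."""
    return np.column_stack([np.ones_like(x), np.cos(x), np.sin(x)])


def fit(x, y, weights=None):
    """Least-squares estimate of (a, b, c).

    If weights are given, each squared residual is multiplied by its weight,
    which is done by scaling rows with the square root of the weight.
    """
    A = design(x)
    if weights is not None:
        s = np.sqrt(weights)
        A, y = A * s[:, None], y * s
    params, *_ = np.linalg.lstsq(A, y, rcond=None)
    return params


# Group the repeated x values and average y within each group.
x_avg, inverse, counts = np.unique(
    x_data, return_inverse=True, return_counts=True
)
y_avg = np.bincount(inverse, weights=y_data) / counts

p1 = fit(x_data, y_data)                   # method 1: raw data
p2 = fit(x_avg, y_avg)                     # method 2: averages, unweighted
p2w = fit(x_avg, y_avg, weights=counts)    # method 2 with count weights
p3 = fit(x_data, y_avg[inverse])           # method 3: repeated averages

print("method 1 vs 3:          ", np.allclose(p1, p3))
print("method 1 vs 2 (weighted):", np.allclose(p1, p2w))
print("method 1 vs 2 (plain):   ", np.allclose(p1, p2))
```

The reasoning behind the check: in the normal equations $A^T A \beta = A^T y$, the right-hand side depends on the data only through the sum of the $y$ values at each distinct $x$, so replacing each $y$ by its group mean leaves it unchanged.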