Starting to read a book and the author goes into least squares etc. and shows this pic.
Thought I'd do it myself. Using the linear-algebra formula, I noticed I was getting slightly different results, and after experimenting a little I found that, depending on the constraints/parameters I chose, the fits were sometimes completely wrong.
Could it be that the formula $\text{param} = (A^TA)^{-1}A^Tb$ only really works on paper, or rather, not in a digital domain where I am using a finite set of samples? And on a PC one has to go back to taking the derivative of the error function and then using something like Newton's method to find the optimal parameters?
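(For context, here is a minimal sketch of what I suspect is going on; this is my own toy example, not the book's code. The formula is exact in exact arithmetic, but for a polynomial design matrix the columns $1, x, x^2, \dots$ are nearly linearly dependent, so $A^TA$ is badly conditioned, and its condition number is roughly the square of $A$'s.)

```python
import numpy as np

# Polynomial design matrix on the same window as my code below
x = np.linspace(-10, 10, 50)
A = np.vander(x, 10, increasing=True)  # columns x^0 .. x^9
b = np.sin(x)

# cond(A) is already large; cond(A^T A) is roughly its square,
# which pushes the explicit inverse toward machine-precision garbage
print(np.linalg.cond(A))
print(np.linalg.cond(A.T @ A))

# np.linalg.lstsq solves the same least-squares problem via an SVD,
# never forming A^T A, so it is the numerically stable route on a PC
param, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
```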
I could post my code (Python), but I'm fairly sure I'm doing everything right; it's just that the functions I get are weird, and the only explanation I can see is that it's an analog-vs-digital thing.
import numpy as np
import matplotlib.pyplot as plt
from random import random

x_axis = np.linspace(-10, 10, 50)
#x_axis = np.linspace(-50, 50, 1000)

def create_data(num_param):
    v_x = x_axis
    y = np.sin(v_x)
    # noise drawn uniformly from multiples of 1/15 in [-1/3, 1/3] (not Gaussian)
    err = np.array([(int(10 * random()) - 5) / 15 for i in range(len(y))])
    y += err
    # design matrix: row [1, x, x^2, ..., x^(num_param-1)] for each x
    A = []
    for x in v_x:
        A.append([x**i for i in range(num_param)])
    A = np.matrix(A)
    return A, y

def poly_func(x, param):
    return sum(param[i] * (x**i) for i in range(len(param)))

def main():
    A, y = create_data(10)
    plot_old_y = y
    # normal equations: param = (A^T A)^{-1} A^T y
    param = (A.T * A).I * A.T * np.matrix(y).T
    param = np.ravel(np.array(param))
    print(param)
    v_x = np.ravel(np.array(A[:, 1]))  # second column of A holds x^1
    new_y = poly_func(v_x, param)
    plt.figure(1)
    plt.plot(x_axis, np.sin(x_axis), color='r')
    plt.plot(v_x, new_y, color='b')
    plt.scatter(v_x, plot_old_y)
    plt.show()

main()
Two example images; note that if I zoomed out on the 'good' fit, it would turn into what looks like a logistic function.
The first thing that tipped me off was that, unlike his picture, I was getting completely different fits at the orders he used; that led me to experiment more.
Your implementation is correct. Indeed, your pictures even show that it is regressing on sin. Some things I noticed, however:
As one commenter already pointed out, the window you're using is just too wide for you to see the results you are looking for. Try keeping things to np.linspace(0, 1, 100), and adjust your sine so you fit $\sin(2\pi x)$; that way the sine goes through a whole period in $[0,1]$.
You are regressing on different data than the book does. Your data comes from sin plus/minus your error formula, and you have one such data point for every single point in your space. Don't get me wrong, there is no such thing as too much data, but since the space is saturated with your generated points, and since your error formula isn't actually Gaussian noise, you end up learning the shape of sin plus your pseudo-random generator instead of sin. Try having fewer points to learn from, say just 10.
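Putting those suggestions together, a sketch might look like this (my own rewrite, not the book's setup: a $[0,1]$ window, $\sin(2\pi x)$, genuinely Gaussian noise, only 10 training points, and numpy's lstsq in place of the explicit inverse):

```python
import numpy as np

rng = np.random.default_rng(0)

# Only 10 noisy training points on [0, 1], with Gaussian noise
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, size=x_train.shape)

# Degree-3 polynomial design matrix: columns x^0 .. x^3
A = np.vander(x_train, 4, increasing=True)

# Same least-squares problem, solved stably without forming A^T A
param, *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Evaluate the fitted polynomial on a dense grid for plotting
x_dense = np.linspace(0, 1, 100)
y_fit = np.vander(x_dense, 4, increasing=True) @ param
```

On this narrow, well-scaled window the low-degree fit stays tame, which is exactly the regime the book's pictures are drawn in.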