Notation for residual sum of squares $\sum_{i=1}^{n} (y_i - f(x_i))^2$


I’m reading *The Data Science Design Manual* by Skiena, and I need help with the notation on page 270.

Linear regression seeks the line $y = f(x)$ which minimizes the sum of the squared errors over all the training points, i.e., the coefficient vector $w$ that minimizes \begin{align} \sum_{i=1}^{n} &(y_i - f(x_i))^2 \tag{1} \\ \textrm{where} \quad f(x) &= w_0 + \sum_{i=1}^{m-1}w_ix_i \tag{2} \end{align} Suppose we are trying to fit a set of $n$ points, each of which is $m$ dimensional. The first $m-1$ dimensions of each point is the feature vector $(x_1, \dots, x_{m-1})$, with the last value $y = x_m$ serving as the target or dependent variable.
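To check my reading of (1)–(2), I tried translating them literally into code (the data and coefficient values below are toy examples of my own, and I use `j` for the feature index where the book reuses $i$):

```python
import numpy as np

# Toy data (my own example values, not from the book):
# n = 2 points, m = 3 dimensions (m - 1 = 2 features plus the target y).
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # rows are feature vectors (x_1, ..., x_{m-1})
y = np.array([1.0, 2.0])        # targets y_i
w = np.array([0.5, 1.0, -2.0])  # coefficient vector (w_0, w_1, ..., w_{m-1})

def f(x):
    # Equation (2): f(x) = w_0 + sum_j w_j * x_j, for a generic feature vector x.
    return w[0] + w[1:] @ x

# Equation (1): evaluate f at each training point x_i and sum the squared errors.
rss = sum((y[i] - f(X[i])) ** 2 for i in range(len(y)))
print(rss)  # 54.5
```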

My question: what is the difference between $f(x)$ and $f(x_i)$? Since (2) is introduced with “where”, shouldn’t it define $f(x_i)$ rather than $f(x)$?

I have seen notation similar to (1) before, and I understand it when it is written as: \begin{align} \sum_{i=1}^{n} & (y_i - \hat y_i)^2 \tag{3} \\ \textrm{where} \quad \hat y_i &= a + b x_i \tag{4} \end{align} But I don’t follow the notation $f(x_i)$ versus $f(x)$.
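Numerically, (3)–(4) are clear to me: $\hat y_i$ is just $a + b x_i$ for the $i$-th point. A quick sketch of how I read them (example values of $a$, $b$, and the data are mine):

```python
a, b = 1.0, 2.0          # example intercept and slope (values mine)
xs = [0.0, 1.0, 2.0]     # data points x_i
ys = [1.5, 2.5, 6.0]     # observed values y_i

def f(x):
    # f is a rule defined for a generic input x ...
    return a + b * x

# ... and f(x_i) is that rule applied to the i-th point, i.e. y_hat_i:
rss_f    = sum((yi - f(xi)) ** 2 for xi, yi in zip(xs, ys))
rss_yhat = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))
print(rss_f == rss_yhat)  # True
```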