I'm a bit confused about the implication of the following:
Suppose we are given a set of data points $(X_i, Y_i), i=1,2,...,n$, where $X_i$ is the predictor variable, and $Y_i$ is the response variable.
- In the context of statistical modelling, we are interested in expressing $Y$ as $$Y_i = s(X_i)+\epsilon_i$$
- This implies the assumption $$\mathbb{E}(Y_i|X_i=x_i)=s(x_i)$$
I understand that $\epsilon$ is a constant and that the expected value of a constant is zero. However, I do not understand why the expectation of $s(x)$ would be $s(x)$.
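$$E[\epsilon_i|X_i]=0$$ is a common assumption in regression. Here $\epsilon_i$ denotes the noise: it is a random variable, not a constant (and the expected value of a constant would be that constant itself, not zero).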
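\begin{align}
E(Y_i|X_i=x_i) &= E[s(X_i)+\epsilon_i |X_i=x_i] \\
&=E[s(X_i)|X_i=x_i] +E[\epsilon_i |X_i=x_i] \\
&=E[s(x_i)|X_i=x_i] +0 \\
&=s(x_i)
\end{align}
The second line uses linearity of conditional expectation. In the third line, conditioning on $X_i=x_i$ fixes $s(X_i)$ at the number $s(x_i)$, and the expectation of a fixed number is that number, which gives the last line.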
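If a numerical check helps, here is a minimal simulation sketch of the same identity. The function `s`, the Gaussian noise, and the helper name `conditional_mean` are assumptions made purely for illustration; nothing in the question specifies them. The idea is to hold $X_i$ fixed at a value $x$, draw many zero-mean noise terms, and see that the average of $Y_i = s(x) + \epsilon_i$ lands close to $s(x)$.

```python
import math
import random


def s(x):
    # Hypothetical regression function, chosen only for illustration.
    return math.sin(x) + 0.5 * x


def conditional_mean(x, n_draws=200_000, noise_sd=1.0, seed=0):
    """Monte Carlo estimate of E[Y | X = x] when Y = s(x) + eps.

    eps is assumed Gaussian with mean 0, so E[eps | X = x] = 0.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        eps = rng.gauss(0.0, noise_sd)  # zero-mean noise, a random variable
        total += s(x) + eps             # s(x) is a fixed number once x is fixed
    return total / n_draws


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.5):
        print(f"x={x}: s(x)={s(x):.4f}, estimated E[Y|X=x]={conditional_mean(x):.4f}")
```

The two printed columns should agree up to Monte Carlo error, which shrinks like $1/\sqrt{n}$ in the number of draws. Any other zero-mean noise distribution would serve equally well here.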