First of all, a definition of interpolation:

> Interpolation is a type of estimation, a method of constructing (finding) new data points based on the range of a discrete set of known data points.
Over the course of my university years, I have used interpolation, especially Lagrange interpolation polynomials, a lot. But I cannot stop thinking: isn't it somehow arbitrary? We go out and make real-world observations, and when we do not have enough data points, or some are missing, we create new ones by means of pre-defined deterministic algorithms. We take the existing data points, multiply some of them together, add and subtract them, and obtain new data. But once we have done that, aren't we solving a different problem, with different parameters, than the one we actually had in the first place?

The assumption here is that our observations are in some sense continuous, so that we can get very close to a missing data point simply by looking at the other points. Is that necessarily the case? I know things generally do not change abruptly, but they do not have to be continuous either. Maybe you'll say that this is the best we can do when we don't have enough observations.
Then the question becomes: aren't our interpolation techniques somewhat arbitrary? Couldn't I come up with another technique (simply take other data points, or use higher-degree polynomials) to construct the new points? And is there rigorous mathematical work showing how good these techniques are, how close they can get to the real data points, and how meaningful these calculations are?
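To make the "arbitrary choice" concrete, here is a small sketch (with made-up observations, not data from the question) showing that two standard techniques applied to the same four points give different answers at the same query point:

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_eval(xs, ys, x):
    """Piecewise-linear interpolation at x (xs assumed sorted, x inside range)."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            w = (x - xs[i]) / (xs[i + 1] - xs[i])
            return (1 - w) * ys[i] + w * ys[i + 1]
    raise ValueError("x outside data range")

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 0.0, 1.0]   # made-up observations

print(lagrange_eval(xs, ys, 0.5))  # cubic through all four points, ~1.0
print(linear_eval(xs, ys, 0.5))    # straight line between neighbours, 0.5
```

Both methods reproduce the observed points exactly, yet at $x = 0.5$ they disagree by a factor of two; nothing in the data alone says which "new point" is right.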
Interpolation makes sense if you have reason to assume that the function relating the data to the parameters varies "sufficiently smoothly". The "knowledge" that the function is smooth is in itself additional information that one has about the problem. E.g., interpolating the position of an object at some time, given its position at other times, can give good results if there is no "jitter" in the motion at a timescale shorter than the sampling interval.
Depending on how regular (smooth) the underlying function is, different interpolation techniques are more or less appropriate. Measurement noise is another aspect that sometimes needs to be taken into account.
In the absence of noise, for Lagrange interpolation on nodes $x_0 < x_1 < \dots < x_k$, consider the remainder formula: the error is bounded by $\frac{(x_k-x_0)^{k+1}}{(k+1)!}\max_{x_0 \leq \xi \leq x_k} |f^{(k+1)}(\xi)|$, so it depends on the maximum absolute value of the $(k+1)$-th derivative. "Sufficiently smooth" in this case means that the function should have its derivatives bounded by a constant up to that order. This is rarely the case in practice for large $k$, which means that the quality of Lagrange interpolation can then be poor.
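An illustrative sketch of that failure mode (my own example, not from the answer above) is Runge's function $f(x) = 1/(1+25x^2)$ on $[-1,1]$: its higher derivatives grow rapidly, the remainder bound becomes useless, and the maximum error of equispaced Lagrange interpolation grows with the degree instead of shrinking:

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, xi in enumerate(xs):
        term = ys[i]
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def max_error(degree, samples=200):
    """Max |f - p_degree| over a fine grid, equispaced nodes on [-1, 1]."""
    xs = [-1.0 + 2.0 * i / degree for i in range(degree + 1)]
    ys = [runge(x) for x in xs]
    grid = [-1.0 + 2.0 * t / samples for t in range(samples + 1)]
    return max(abs(lagrange_eval(xs, ys, x) - runge(x)) for x in grid)

for k in (4, 8, 16):
    print(k, max_error(k))  # error grows with the degree: the Runge phenomenon
```

More sample points therefore make the polynomial fit the data and the approximation worse at once, which is one rigorous sense in which "higher degree" is not automatically better.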