I have a dataset that only takes integer values ($x$ and $y$ coordinates). For example, my data are $x = (1,2,2,3,3)$, $y = (1,2,3,3,4)$. A linear regression through the data gives $y = -0.07+1.21x$. However, this function does not make sense here, as its values are not integers. Instead I want an integer-valued "linear" function that best approximates the data, i.e. a function of the form $$f(x) = [m\cdot x+q],$$ where $[\cdot]$ is the nearest-integer function. Of all these functions I want the one that minimizes $r^2 = \sum_i \left(y_i -f(x_i)\right)^2$.
For my dataset $f(x) = [x]$ is an optimal solution. However, the underlying linear function is not unique: e.g. $f(x) = [x+0.1]$ does equally well. Furthermore, the function $f(x) = [1.4x]$, which is genuinely different, has the same $r^2$.
Is there a (hopefully analytical) way to find one function that minimizes the residuals?
In case the solution is not unique, it would be interesting to know, but it is not really necessary. Note that the functions $f_1(x) = [x]$ and $f_2(x) = [x+0.1]$ are the same if their domain is $\mathbb{Z}$.
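For reference, the ordinary least-squares fit quoted above can be reproduced numerically; a quick sketch in Python (using numpy's polynomial fit):

```python
import numpy as np

x = np.array([1, 2, 2, 3, 3], dtype=float)
y = np.array([1, 2, 3, 3, 4], dtype=float)

# Ordinary least-squares fit y = q + m*x (polyfit returns highest degree first)
m, q = np.polyfit(x, y, 1)
print(m, q)  # roughly 1.2143 and -0.0714, i.e. y = -0.07 + 1.21x
```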
As already pointed out by @TZakrevskiy, the solution is not unique.
I numerically found that
$$f(x) = [0.9x+b] $$
with $0\leq b<0.6$ has a total squared error of $2$.
Another solution is $$g(x) = [0.8x+b]$$
with $0.1\leq b \leq 0.6$, which again attains the squared error of $2$.
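Both claims can be checked directly; a small sketch in Python, where I take $[t] = \lfloor t+0.5\rfloor$ (round halves up) as the nearest-integer function:

```python
import math

x = [1, 2, 2, 3, 3]
y = [1, 2, 3, 3, 4]

def nearest(t):
    # nearest-integer function [.], rounding halves up
    return math.floor(t + 0.5)

def sq_error(m, b):
    # total squared error of f(x) = [m*x + b] on the data
    return sum((yi - nearest(m * xi + b)) ** 2 for xi, yi in zip(x, y))

# f(x) = [0.9x + b] for 0 <= b < 0.6
assert all(sq_error(0.9, b) == 2 for b in (0.0, 0.3, 0.59))
# g(x) = [0.8x + b] for 0.1 <= b <= 0.6
assert all(sq_error(0.8, b) == 2 for b in (0.1, 0.3, 0.6))
```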
By inspection, $2$ is the lowest value we can achieve: the data contain two points at $x=2$ with different $y$-values ($2$ and $3$), and likewise two points at $x=3$ ($3$ and $4$). A function assigns a single value to each $x$, so an integer-valued $f$ must miss at least one of the two $y$-values at $x=2$ by at least $1$, and similarly at $x=3$. Hence, whatever we do, we always get a squared error of at least $2$ as long as we want integer values of $f(x)$.
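Lacking an analytical formula, one numerical option is a brute-force grid search over $(m,q)$, which confirms that $2$ is the minimum on this data (a sketch; the grid bounds and step size are ad-hoc choices, and $[t]=\lfloor t+0.5\rfloor$ is assumed):

```python
import math

x = [1, 2, 2, 3, 3]
y = [1, 2, 3, 3, 4]

def nearest(t):
    return math.floor(t + 0.5)  # round-half-up nearest integer

def sq_error(m, q):
    return sum((yi - nearest(m * xi + q)) ** 2 for xi, yi in zip(x, y))

# Exhaustive search over m, q in [-2, 2] with step 0.01
best = min(
    ((sq_error(i / 100, j / 100), i / 100, j / 100)
     for i in range(-200, 201) for j in range(-200, 201)),
    key=lambda t: t[0],
)
print(best[0])  # minimal squared error found: 2
```

The grid search also recovers the whole family of optimal $(m,q)$ pairs if you collect all entries attaining the minimum instead of just one.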
Noting the previous fact, we could also simply pick $f(x)=x$: this function is simple and still attains the minimal squared error of $2$ (formal coefficient of determination $R^2 \approx 0.615$; formal because no intercept is included). Compared with ordinary linear least squares, which achieves a squared error of $\approx 1.0714$ ($R^2\approx 0.794$), this seems acceptable.
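The quoted errors and $R^2$ figures can be reproduced as follows (a sketch; $R^2 = 1 - \mathrm{SS}_{\mathrm{res}}/\mathrm{SS}_{\mathrm{tot}}$ is assumed):

```python
x = [1, 2, 2, 3, 3]
y = [1, 2, 3, 3, 4]

mean_y = sum(y) / len(y)
ss_tot = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares

# Integer-valued candidate f(x) = x
ss_f = sum((yi - xi) ** 2 for xi, yi in zip(x, y))
r2_f = 1 - ss_f / ss_tot
print(ss_f, round(r2_f, 3))  # 2 and about 0.615

# Ordinary least-squares fit y = -0.0714 + 1.2143*x
ss_ols = sum((yi - (-0.0714 + 1.2143 * xi)) ** 2 for xi, yi in zip(x, y))
print(round(ss_ols, 4), round(1 - ss_ols / ss_tot, 3))  # about 1.0714 and 0.794
```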
Remark: if your data come from a dynamical experiment, you could also consider investigating hysteresis effects, which could be used to obtain a dynamical function that matches all data values.