Every continuous function $f : \mathbb R \to \mathbb R$ can be uniformly approximated by step functions. For a proof, consider an interval $[a,b]$. Then $f$ is bounded on this compact interval, i.e. we have $f([a,b]) \subseteq [c,d]$ for some $c, d \in \mathbb R$. Now for $n \in \mathbb N$ consider the inverse images $$ f^{-1}\left[ \left( c + \frac{i}{n} ( d - c) - \frac{1}{n}, c + \frac{i+1}{n} (d - c) + \frac{1}{n} \right)\right], \quad i = 0,\ldots, n-1. $$ Each such set is open, as $f$ is continuous, and hence a union of open intervals. Further, these sets cover $[a,b]$ (the $\pm 1/n$ terms ensure this), so by compactness we can choose a finite subcover $(a_1, b_1), \ldots, (a_m, b_m)$ from these open intervals. We may also assume that they are disjoint: the intersection of two overlapping open intervals is again an open interval, so their union can be written as three disjoint open intervals (up to their finitely many endpoints), and induction over the overlapping intervals gives the claim. Now suppose $x,y \in (a_j, b_j)$ for some $j$. Then $$ x,y \in (a_j, b_j) \subseteq f^{-1} \left[ \left( c + \frac{k}{n} ( d - c) - \frac{1}{n}, c + \frac{k+1}{n} (d - c) + \frac{1}{n} \right)\right] $$ for some $k \in \{ 0,\ldots, n-1 \}$. After rearranging the inequalities, this gives $$ |f(x) - f(y)| < \frac{1}{n} (d - c) + \frac{2}{n}. $$ Now define a step function $\varphi_n : [a,b] \to \mathbb R$ by $$ \varphi_n(x) = f\left( \frac{a_j + b_j}{2} \right) \textrm{ for the unique $a_j, b_j$ with } x \in (a_j, b_j) $$ and $\varphi_n(a_i) := f(a_i)$ for $i = 1,\ldots, m$. The above shows that $\| \varphi_n - f \|_{\infty} \le \frac{1}{n} (d - c) + \frac{2}{n}$, which tends to $0$ as $n \to \infty$.
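The rearrangement behind the last two estimates can be spelled out as follows. This is only a sketch: the symbols $I_k$ (the $k$-th target interval) and $\xi_j$ (the point of $(a_j, b_j)$ at which $\varphi_n$ samples $f$) are introduced here for illustration and are not part of the proof above. Set $$ I_k := \left( c + \frac{k}{n} (d - c) - \frac{1}{n},\; c + \frac{k+1}{n} (d - c) + \frac{1}{n} \right), \qquad |I_k| = \frac{1}{n} (d - c) + \frac{2}{n}. $$ If $x, y \in (a_j, b_j) \subseteq f^{-1}[I_k]$, then $f(x), f(y) \in I_k$, and two points of an open interval differ by less than its length, so $$ |f(x) - f(y)| < |I_k| = \frac{1}{n} (d - c) + \frac{2}{n}. $$ Taking $y = \xi_j$ gives $$ |f(x) - \varphi_n(x)| = |f(x) - f(\xi_j)| < \frac{1}{n} (d - c) + \frac{2}{n} \quad \text{for every } x \in (a_j, b_j), $$ while the difference is $0$ at the points where $\varphi_n$ is defined to agree with $f$. Assuming, as in the proof, that every point of $[a,b]$ is of one of these two kinds, taking the supremum over $x$ turns the strict inequality into a weak one: $\| \varphi_n - f \|_{\infty} \le \frac{d - c + 2}{n}$.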
Using the $\sigma$-finiteness of $\mathbb R$ (i.e. writing $\mathbb R$ as a countable union of compact intervals), we can extend this result from compact intervals to the whole line $\mathbb R$. $\square$
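One possible reading of the extension step, given as a sketch and not necessarily the intended argument: assume that a step function on $\mathbb R$ may have countably many steps, with only finitely many on each bounded interval (with finitely many steps in total the statement fails, e.g. for $f(x) = x$, since such a step function is bounded). The symbols $\varepsilon$, $c_k$, $d_k$, $n_k$ and $\psi_k$ are introduced here for illustration only. Fix $\varepsilon > 0$. For each $k \in \mathbb Z$ apply the compact case on $[k, k+1]$ with $f([k,k+1]) \subseteq [c_k, d_k]$, choose $n_k$ so large that $$ \frac{1}{n_k} (d_k - c_k) + \frac{2}{n_k} \le \varepsilon, $$ and let $\psi_k$ be the resulting step function on $[k, k+1]$. Gluing these together, $$ \varphi(x) := \psi_k(x) \quad \text{for } x \in [k, k+1),\ k \in \mathbb Z, $$ gives $$ \sup_{x \in \mathbb R} |\varphi(x) - f(x)| = \sup_{k \in \mathbb Z} \, \sup_{x \in [k, k+1)} |\psi_k(x) - f(x)| \le \varepsilon. $$ Note that $n_k$ has to depend on $k$, because $d_k - c_k$ need not be bounded in $k$.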
Now I am trying to extend the above proof principle (i.e. looking at inverse images and then using Heine-Borel/compactness of $[a,b]$) to functions with the property that for each $x \in \mathbb R$ the left and right limits exist (but, unlike in the continuous case, do not have to be equal and may differ from $f(x)$). It turns out that the only discontinuities such functions have are removable ones or jump discontinuities. If $f$ has a jump at $x_0$ and $c < f(x_0) < d$ with $f(x_0 - 0) < f(x_0 + 0)$, then $$ f^{-1}((c,d)) \supseteq f^{-1}((c, f(x_0 - 0))) \cup f^{-1}((f(x_0 + 0), d)) \cup \{ x_0 \}. $$ My idea is that by applying this iteratively, and taking into account that there are at most countably many such bad points (i.e. at all other points the function is continuous), the above proof could be adapted to this setting, but I am stuck. So my question: am I on the right track, and is this possible? Or how would you prove this?