I need a little help in finishing and polishing this proof. My proof goes as follows:
Let $C$ and $D$ be $min(f(x))$ and $max(f(x))$ respectively, where $x \in [a,b]$.. Given $\epsilon > 0$, we can divide $[C,D]$ into partitions $C = y_0 < y_1 < Y_2 < ... < y_n = D$ s.t $y_i - y_{i-1} = \epsilon$ where $i=1,...,n$
We also divide $[a,b]$ into partitions in the following way. Let $a = x_0$. We define $x_1$ to be the smallest $x$ s.t $f(x) = y_i$, and define $x_2$ to be the smallest $x>x_1$ s.t $f(x)$ is equal to either $y_i, y_{i+1}, y_{i-1}$. This is done up to $x_n = b$.
Vi know that $f$ is continuous on $[x_{i-1}, x_i]$. The way we have constructed the interval assures us that $|f(x_{i-1}) - f(x)| < \epsilon$ where $x \in [x_{i-1}, x_i] $ When we then take the linear segment between the two end points $x_{i-1}, x_i$ to get our polygon function, it then has to be the case that $|f(x) - g(x)| < \epsilon$. 
This is an illustration of what i am trying to formalise. I know it is a bit sloopy in the end, thus I am looking for some tips?
(In the following answer, I'll make use of the fact that all closed-and-bounded subsets of $\mathbb R$ have least elements.)
Most of the construction you have done is valid (in the sense that it makes sense/ can be justified).
By the extreme value theorem, there will be a minimum and maximum ($C$ and $D$) of $f$ attained in $[a,b]$.
The partition of $[C,D]$ has a slight issue, in that you require the difference between consecutive $y_i$s to be exactly equal to $\epsilon$. This can result in the partition being 'uneven' - for instance, try partitioning $[0,1]$ into segments of length $0.6$. But, if all you care about is being able to make $\epsilon$ arbitrarily small instead of equal to any positive real, you can remedy this by supposing $\epsilon$ is always of the form $\frac{D-C}{n}$ for $n\in \mathbb Z^+$.
Since $f$ is continuous, for any $y_i\in [C,D]$, $f^{-1}(y_i)$ will be (by the intermediate value theorem) a nonempty closed subset of $[a,b]$ so it will have a least element. This means the union of all $f^{-1}(y_i)$s will have a least element in $[a,b]$. So the construction of $x_1$ is valid.
However, the recursive definition of the general $x_i$ causes issues.
Say for instance $[a,b]=[1,10]$ and $x_1=2$ with $y_1=3$ (so $f(2)=3$). If $f$ equals $3$ on some closed interval, $[2,a]$, $a\in (2,10]$ (effectively, if $f$ flattens out at a height of $3$ over some interval after reaching $2$), then $x_2$ will be undefined. Because, for any $x$ between $2$ and $a$, we'll have $f(x)=3$. So, for instance, $f(2.01)$ may equal $3$. But so will $f(2.001)$ and $f(2.0001)$ and so on. In this situation, there is no minimum value we can assign $x_2$, so it is undefined.
To fix this, we must tweak the recursive definition of the $x_i$. You could say instead that $x_i$ is the smallest $x\geq x_{i-1}$ for which $f(x)=y_{i-1},y_i,$ or $y_{i+1}$, but this would, of course make all the $x_i$s equal $x_1$, so we wouldn't get a partition.
The only other option is to instead change the definition of the $x_i$s so that, for any $i>1$, $x_i$ is the smallest $x>x_{i-1}$ for which $f(x)=y_{i-1}$ or $y_{i+1}$. So the possibility of $f(x_{i-1})=f(x_i)$ must be excluded if the construction is to work for a general continuous $f$.
But this still poses an issue, for what if $f$ never waivers beyond the interval $(f(x_{i-1})-\epsilon,f(x_{i-1})+\epsilon)$ after reaching $x_{i-1}$ (this could happen if, again, $f$ becomes constant after reaching $x_{i-1}$). In this case, $x_i$ will again be undefined.
So, we make one final adjustment to the definition.
For any $i>1$, suppose $x_{i-1}<b$ has been defined. Define $x_i$ as follows: if $f^{-1}([f(x_{i-1})+\epsilon,\infty]\cup [-\infty, f(x_{i-1})-\epsilon])\cap [x_i,b]$ is nonempty, then define $x_i$ to be the smallest element of $f^{-1}([f(x_{i-1})+\epsilon,\infty]\cup [-\infty, f(x_{i-1})-\epsilon])\cap (x_{i-1},b]$ ; else, set $x_i=b$.
Now, we must check that this definition makes sense, and allows us to construct our desired $g$.
We know that the construction of $x_1$ is valid. Suppose we have constructed $x_n<b$ for some $n$. We wish to show that $x_{n+1}$ is well-defined and that $|f(x_n)-f(x)|<\epsilon$ for $x\in [x_n,x_{n+1})$ (note that $|f(x_n)-f(x_{n+1})|$ will generally equal $\epsilon$ as the $y_i$s have been defined to be spaced apart by a distance of exactly $\epsilon$ - so we consider $[x_n,x_{n+1})$ instead of $[x_n,x_{n+1}]$).
If $|f(x_n)-f(x)|<\epsilon$ for all $x\in [x_n,b]$, then $x_{n+1}=b$, and it is automatically the case that $|f(x_n)-f(x)|<\epsilon$ for $x\in [x_n,x_{n+1})$.
If $|f(x_n)-f(x)|\geq\epsilon$ for some $x\in [x_n,b]$, then there exists some $z\in (x_n,b]$ s.t. $f(z)\geq f(x_n)+\epsilon$ or $f(z)\leq f(x_n)-\epsilon$. So, by the intermediate value theorem, we conclude there exists some $w\in (x_n,b]$ s.t. $f(w)=f(x_n)+\epsilon=y_{n+1}$ or $f(w)=f(x_n)-\epsilon=y_{n-1}$.
This means the set $f^{-1}(\{f(x_n)+\epsilon\}\cup \{f(x_n)-\epsilon\})\cap [x_n,b]$ is a nonempty closed subset of $[x_n,b]$ and all its members are $>x_n$. So $f^{-1}(\{f(x_n)+\epsilon\}\cup \{f(x_n)-\epsilon\})\cap (x_n,b]$ has a least element. This least element is, by definition, $x_{n+1}$. So $x_{n+1}$ is well-defined. Now, if there was any $p\in [x_n,x_{n+1})$ s.t. $|f(x_n)-f(p)|\geq \epsilon$, then by the intermediate value theorem again, there would exist a $q\in [x_n,p]\subset [x_n,x_{n+1})$ s.t. either $f(q)=f(x_n)+\epsilon$ or $f(q)=f(x_n)-\epsilon$. But this contradicts the minimality of $x_{n+1}$.
So we conclude that, for all pairs of consecutive points, $x_n, x_{n+1}$, whenever $x\in [x_n,x_{n+1})$, $|f(x_n)-f(x)|<\epsilon$ as desired.
There is one last technical point that needs addressing - that the sequence of $x_i$s is finite. If the sequence of $x_i$s is infinite, the resulting function $g$ wouldn't really be a true polygonal approximation. So suppose by way of contradiction that the sequence of $x_i$s was infinite. We use the fact that any closed interval in $\mathbb R$ is limit point compact. What this means is that any infinite subset of a closed interval has a limit point in the closed interval. So let $\gamma$ be a limit point of the set of $x_i$s. Then, for any $\delta>0$, we can find a pair of points $x_{k-1}$ and $x_k$ that lie within a distance $\delta$ of $\gamma$. But, since $|f(x_{k-1})-f(x_k)|\geq \epsilon$, we cannot have both $|f(\gamma)-f(x_{k-1})|<\frac{\epsilon}{2}$ and $|f(\gamma)-f(x_k)|<\frac{\epsilon}{2}$. Since we can find such a pair, $x_{k-1},x_k$, within any $\delta$ radius of $\gamma$, $f$ cannot be continuous at $\gamma$, a contradiction.
So the sequence of $x_i$s is always finite and now we can finally conclude the existence of the polygonal approximation, $g$.
If we wanted to, we could also show that for each $y_i$ there exists a corresponding $x_i$ under the method of construction outlined above. If we suppose this is not the case, we may pick a $y_m$ that is not the image of any $x_i$. By the IVT, we have $y_m=f(z)$ for some $z$. There exists at least one $x_i<z$, as otherwise, since $f(z)=y_m$, $z$ would equal $x_1$. So there also exists some largest $x_i$, $x_k$, that is $<z$. Clearly $f(x_k)\neq y_m= f(z)$, by assumption. But then, $f(x_k)=y_j$ for some $j\neq m$, so $|f(x_k)-f(z)|=|y_j-y_m|\geq \epsilon$, so, by the IVT, there exists some $w\in (x_k,z]$ s.t. $f(w)=y_{j+1}$ or $y_{j-1}$. It then follows that $x_{k+1}<z$, contradicting the maximality of $x_k$.
Lastly, if you just loosen up the requirements on the construction technique a little, there is an easier way one can build $g$. A property of continuous functions on closed intervals is that they are also uniformly continuous. This allows for a much more straightforward construction of such a $g$.
Also, the tweaking of the definition of the general $x_i$ is necessary even if the function $f$ you are dealing with is differentiable, continuously differentiable, or even smooth.