This question is from Abbott's Understanding Analysis:
"If $f$ is not continuous, it may not be possible to find tags for which $R(f,P)=U(f,P)$. Show, however, that given an arbitrary $\epsilon>0$, it is possible to pick tags (assuming that $f$ satisfies the $\epsilon$-$\delta$ definition of Riemann integrability) for a partition $P$ such that $U(f,P)-R(f,P)< \epsilon$."
Now, we are given that $f:[a,b]\rightarrow \mathbb{R}$ is bounded and Riemann integrable, but not continuous on $[a,b]$.
Proof (Outline).
Let $\epsilon > 0$ be arbitrary but fixed.
Assume that $f$ is not continuous, and that there is no choice of tags $\{\zeta_{k}\}$ for which $U(f,P)=R(f,P)$.
[ ...
... details which are not relevant to my question
... ]
We can choose $\delta >0$, and let $P_0$ be a partition of $[a,b]$ with mesh$(P_0)<\delta$.
[...]
since $M_k - f(\zeta_k)<\dfrac{\epsilon}{n\delta}$ (We assumed that the $\epsilon$-$\delta$ definition is satisfied, so we can find this, but I left these details out) for all $\zeta_k\in I_k$, then taking the sum over all $n$ intervals in $P_0$ yields:
$$\sum_{k=1}^n (M_k-f(\zeta_k))<\frac{\epsilon}{\delta}$$
$$\Rightarrow \sum_{k=1}^n (M_k-f(\zeta_k))\delta<\epsilon$$
$$\sum_{k=1}^n (M_k-f(\zeta_k))\Delta x_k <\sum_{k=1}^n (M_k-f(\zeta_k))\delta<\epsilon$$
So $U(f,P_0)-R(f,P_0)<\epsilon$.
QUESTION
Alright, so suppose that for some $z\in I_k$ we have $f(z)=M_k$.
Now by assumption there are no tags $\zeta_k$ for which $f(\zeta_k)=f(z)$ (and so $f$ is discontinuous at $z$?).
So since $f(z)$ is the supremum of all values of $f$ in $I_k$, and no $\zeta_k$ satisfies $f(\zeta_k)=f(z)$, we get $f(z)>f(\zeta_k)$ for all $\zeta_k\in I_k$. But $z\in I_k$... Isn't this a contradiction? Why can't I just set $\zeta_k=z$? More specifically, why is $f(z)>f(\zeta_k)$ for every $\zeta_k$? Most importantly, am I looking at this the wrong way? I can't even think of any examples that fit this situation.
You should not worry about properly chosen tags, but make sure that for any admissible choice of tags you obtain the same Riemann sum up to $\pm\epsilon$. If $f$ is continuous on $[a,b]$ then uniform continuity takes care of that. If $f$ is not continuous throughout, you have to be able to fence in the discontinuities into small intervals. Your partition then has good intervals, where $f$ is continuous, and bad intervals, where your only control over $f$ is that $|f(x)|\leq M$. You then have to set up things in such a way that the total error accumulating on the good intervals (of total length $\leq b-a$) is $<{\epsilon\over2}$, and the total error accumulating on the bad intervals ($\leq2M\cdot$ small total length of these intervals) is $<{\epsilon\over2}$ as well.
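As a rough sketch of that bookkeeping (the index sets $G$ for good and $B$ for bad intervals and the second tag choice $\zeta_k'$ are notation introduced here for illustration, and $M>0$ is assumed): suppose the partition is arranged so that $|f(x)-f(y)|\leq\frac{\epsilon}{2(b-a)}$ whenever $x,y$ lie in the same good interval, and so that the bad intervals have total length $<\frac{\epsilon}{4M}$. Then for any two admissible choices of tags $\{\zeta_k\}$ and $\{\zeta_k'\}$,
$$\Bigl|\sum_{k=1}^n f(\zeta_k)\Delta x_k-\sum_{k=1}^n f(\zeta_k')\Delta x_k\Bigr| \leq \sum_{k\in G}|f(\zeta_k)-f(\zeta_k')|\,\Delta x_k+\sum_{k\in B}|f(\zeta_k)-f(\zeta_k')|\,\Delta x_k$$
$$\leq \frac{\epsilon}{2(b-a)}\sum_{k\in G}\Delta x_k+2M\sum_{k\in B}\Delta x_k<\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon.$$
The first term uses $\sum_{k\in G}\Delta x_k\leq b-a$, and the second uses $|f(\zeta_k)-f(\zeta_k')|\leq 2M$ together with the bound on the total length of the bad intervals.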
(I have the impression that you are chasing the wrong hare. Proving a certain theorem is one thing, but when this is over we just set up sufficiently fine, but otherwise arbitrary Riemann sums and then can be sure that their value is not far off the value of the integral.)
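For a concrete situation of the kind asked about (an example chosen here for illustration, not taken from the book), consider $f(x)=x$ for $0\leq x<1$ and $f(1)=0$ on $[0,1]$. On the last subinterval $I_n=[x_{n-1},1]$ of any partition we have $M_n=\sup_{I_n}f=1$, yet $f(x)<1$ for every $x\in I_n$, so there is no $z\in I_n$ with $f(z)=M_n$ and hence no tag giving $R(f,P)=U(f,P)$. What the definition of the supremum does guarantee, on any $[a,b]$, is a tag $\zeta_k\in I_k$ with $f(\zeta_k)>M_k-\frac{\epsilon}{b-a}$ for each $k$, and with such tags
$$U(f,P)-R(f,P)=\sum_{k=1}^n\bigl(M_k-f(\zeta_k)\bigr)\Delta x_k<\frac{\epsilon}{b-a}\sum_{k=1}^n\Delta x_k=\epsilon.$$
So the inequality $M_k>f(\zeta_k)$ for every tag is not contradictory: it only says that the supremum on $I_k$ is approached but not attained.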