I'm confused about a very specific detail in the following solution
The second condition of the pumping lemma states that |xy| <= p.
We also know that w = xyz . In this case , w = $0^p1^p2^p $
Then how can 'y' ever reach the $1$s or the $2$s ?
The complete word has to be xyz and xy can't be bigger than p. I don't understand how 'y' can contain more than just zeros.
In other words, the only valid seperation of this word that I see is
- xy being only zeroes (because xy <= p and w = xyz)
- z being the rest (zeroes or not, $1$s and $2$s)

You are absolutely correct. The proof is badly written. The two thirds of the first case that you have marked and the second case are indeed impossible and therefore do not need to be treated, but just confuse the reader. Consequently, we see that already the language $\{0^n1^n: n\geq 0\}$ is non-regular, because exactly the same argument applies.
Maybe the authors have taken a proof via the pumping lemma for context-free languages that $\{0^n1^n2^n: n\geq 0\}$ is not context-free (while $\{0^n1^n: n\geq 0\}$ is), and have deleted the parts for the second pumping factor. This would lead pretty much to what you have posted here, I think.