Kunen's Set Theory (2011) raised some uncomfortable questions regarding my understanding of the metatheory. Let me present my questions in in terms of two perspectives, in both of which we use $\mathsf{ZFC}$ as the working example.
Perspective 1. The intuitive construction. We take metatheoretic notions such as structural induction on proof trees and inductive definition of well-formed formulas etc. as given. We also take basic facts about how models behave as given. We then syntactically prove as many $\varphi$s such that $\vdash_\mathsf{ZFC} \varphi$ as we can. Then, we observe that the model-theoretic interpretations of these can simulate the aforementioned basic metatheoretic assumptions (induction, facts about models, etc.). This is the justification that our metatheory is in fact $\mathsf{ZFC}$.
Perspective 2. The big picture. Now, instead of following the process, we try to elucidate what precisely we are dealing with all at once. To preserve simplicity, I will assume Platonism (although you are more than welcome to add another perspective, like formalism, in your answer; that would be very interesting). So, $\mathsf{ZFC}$ is true in the real world; this is our metatheory. In fact, the metatheory is all there is. In it, we can represent things like strings, formulas, inference rules, the relations $\vdash$, $\vDash$, models etc.
Now comes my problem. I don't know how to picture in what way our "really true" $\mathsf{ZFC}$ represents (or, more suggestively, simulates) things like sequents, models, etc. What kind of objects precisely am I dealing with here? Metatheoretic sentences? I currently have no idea.
The main motivation for finding this out is that if I have a more solid grasp of what's actually going on when working in the metatheory, I might be able to work in it more confidently. Thank you for all the help. :)