I have seen the following statements of Gödel's Incompleteness Theorems:
Gödel's First Incompleteness Theorem (v1). If $T$ is a recursively axiomatized consistent theory extending PA, then $T$ is incomplete.
Gödel's Second Incompleteness Theorem (v1). If $T$ is a recursively axiomatized consistent theory extending PA, then $T \nvdash \text{Con}(T)$.
However, these theorems are often applied to ZFC, and ZFC is not an extension of PA. For one, ZFC and PA do not have the same language, since ZFC's language is $\mathcal{L}_{\text{ZFC}} = \{ \in \}$, and PA's language is $\mathcal{L}_{\text{PA}} = \{ 0 , S \}$. So, I thought it might make sense to revise these statements as follows:
Gödel's First Incompleteness Theorem (v2). If $T$ is a recursively axiomatized consistent theory such that an expansion of $T$ extends an expansion of PA, then $T$ is incomplete.
Gödel's Second Incompleteness Theorem (v2). If $T$ is a recursively axiomatized consistent theory such that an expansion of $T$ extends an expansion of PA, then $T \nvdash \text{Con}(T)$.
My idea here was to expand ZFC and PA to a language $\mathcal{L}_{\text{ZFC, PA}} = \{ \in, 0, S \}$, which includes all of the symbols of $\mathcal{L}_{\text{ZFC}}$ and $\mathcal{L}_{\text{PA}}$, allowing us to compare the two theories. However, these statements of the theorems still do not apply to ZFC, since the expansion of ZFC to $\mathcal{L}_{\text{ZFC, PA}}$ is not an extension of PA's expansion to $\mathcal{L}_{\text{ZFC, PA}}$, as the axioms of PA only apply to natural numbers, not to arbitrary sets.
So, what are precise statements of Gödel's Incompleteness Theorems? All I can find are statements involving terms like "containing PA" or "at least as strong as PA," and it's not clear to me what these terms mean. I have also seen the word "interpret" used, but I only know what it means for a structure to be interpreted in another structure, not for a theory to be interpreted in another theory.
Your last sentence is the key. Setting aside the issue of optimizing for strength (e.g. replacing $\mathsf{PA}$ by a weaker theory of arithmetic such as $\mathsf{Q}$), the following is a language-independent presentation:
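One standard way to phrase it, sketched here in terms of the notion of interpretation between theories that the question asks about (the label G1IT is chosen only to match the abbreviation G2IT; the wording is a common formulation, not a quotation):

$$\textbf{G1IT.}\quad \text{If } T \text{ is a consistent, recursively axiomatizable theory that interprets } \mathsf{PA}, \text{ then } T \text{ is incomplete.}$$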
G2IT is a bit messier since we have to talk about how consistency is expressed, but we can still do that:
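A sketch of one standard formulation, assuming that $\text{Con}(T)$ denotes the arithmetic sentence built from a fixed recursive presentation of the axioms of $T$, and writing $\sigma^{\Phi}$ (notation introduced here) for the translation of an arithmetic sentence $\sigma$ along an interpretation $\Phi$:

$$\textbf{G2IT.}\quad \text{If } T \text{ is consistent and recursively axiomatizable, and } \Phi \text{ is an interpretation of } \mathsf{PA} \text{ in } T, \text{ then } T \nvdash \text{Con}(T)^{\Phi}.$$

The point of the superscript is that $T$ need not have arithmetic symbols at all: what $T$ fails to prove is the image of the consistency statement in $T$'s own language.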
So what's an interpretation of one theory in another?
Given theories $T,S$ in relational (purely for simplicity) languages $\Sigma,\Pi$ respectively, an interpretation of $T$ in $S$ is basically a uniform way of defining, inside each model of $S$, a model of $T$. Precisely, an ($n$-dimensional) interpretation is a tuple of $\Pi$-formulas $\Phi$ consisting of:
a domain formula $\delta(x_1,...,x_n)$ and an equivalence formula $\eta(x_1,...,x_{2n})$, and
for each $k$-ary relation symbol $R$ in $\Sigma$, a $kn$-ary formula $\varphi_R(x_1,...,x_{kn})$,
such that
$S$ proves that $\delta$ defines a nonempty set, that $\eta$ defines an equivalence relation on that set, and that each $\varphi_R$ respects $\eta$, and
for each sentence $\tau$ in $T$, $S$ proves the "translation" of $\tau$ into the language of $S$ obtained by $(i)$ replacing each variable with an $n$-tuple of variables, $(ii)$ relativizing quantifiers over such $n$-tuples to $\delta$, $(iii)$ replacing $=$ with $\eta$, and $(iv)$ replacing each $R$ with $\varphi_R$.
(Incidentally, much of the time we can get away with $n=1$ and $\eta(x_1,x_2)\equiv x_1=x_2$. So feel free to drop those bits on first read.)
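To make the translation concrete, here is a small worked instance in the simplest case $n=1$ with $\eta$ being equality. The binary relation symbol $R$ and the sentence $\tau$ are illustrative choices, not taken from any particular theory, and $\tau^{\Phi}$ is notation introduced here for the translation:

$$\tau \;=\; \forall x\,\exists y\,R(x,y) \qquad\longmapsto\qquad \tau^{\Phi} \;=\; \forall x\,\bigl(\delta(x)\rightarrow \exists y\,(\delta(y)\wedge \varphi_R(x,y))\bigr).$$

Universal quantifiers are relativized with an implication and existential ones with a conjunction; with a nontrivial $\eta$, any occurrence of $x=y$ in $\tau$ would additionally become $\eta(x,y)$.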
For example, the "usual" interpretation $\Theta$ of (relationalized) $\mathsf{PA}$ into $\mathsf{ZFC}$ has
$n=1$,
$\delta(x)$ is the usual definition of membership in $\omega$ ("$x$ is a finite ordinal"),
$\eta$ is just equality, and
the $\varphi$s corresponding to the graphs of $+$ and $\times$ are rather messy formulas built by recursion (think about how we define ordinal addition and multiplication).
For instance, $\mathsf{ZFC}$ proves that finite ordinal addition is commutative, which is to say that, as hoped, the $\Theta$-translation of the $\mathsf{PA}$-axiom "$\forall x,y[x+y=y+x]$" is $\mathsf{ZFC}$-provable.
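Spelled out as a sketch: suppose the relationalized language uses a ternary symbol $A(x,y,z)$ for the graph of $+$ (the name $A$ is an assumption made here for illustration). Commutativity can then be written as $\forall x,y,z\,[A(x,y,z)\leftrightarrow A(y,x,z)]$, and its $\Theta$-translation is

$$\forall x\,\forall y\,\forall z\,\bigl(\delta(x)\wedge\delta(y)\wedge\delta(z)\;\rightarrow\;(\varphi_A(x,y,z)\leftrightarrow\varphi_A(y,x,z))\bigr),$$

which is a sentence in the language $\{\in\}$ alone, saying that for all finite ordinals $x,y,z$ we have $x+y=z$ exactly when $y+x=z$.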
This should all be very familiar, and in particular we have trivially that if $T$ is interpretable in $S$ then for each $\mathcal{M}\models S$ there is some $\mathcal{N}\models T$ with $\mathcal{N}$ interpretable in $\mathcal{M}$ in the sense you're already familiar with.
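Concretely, the model $\mathcal{N}$ can be sketched as follows in the one-dimensional case (the notation $\delta^{\mathcal{M}}$ and $[a]_{\eta}$ is introduced here for illustration, and the description assumes the formulas $\varphi_R$ respect $\eta$, so that the relations are well defined on classes):

$$N \;=\; \delta^{\mathcal{M}}/\eta^{\mathcal{M}}, \qquad \delta^{\mathcal{M}}=\{a\in M : \mathcal{M}\models\delta(a)\}, \qquad R^{\mathcal{N}} \;=\; \bigl\{([a_1]_{\eta},\dots,[a_k]_{\eta}) : \mathcal{M}\models\varphi_R(a_1,\dots,a_k)\bigr\}.$$

Since $S$ proves the translation of every sentence of $T$, each such translation holds in $\mathcal{M}$, and unwinding the translation shows that the original sentence holds in $\mathcal{N}$.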