This post was moved from MathOverflow because it is not quite appropriate within the context of research mathematics.
My previous question on MathOverflow, "A weakening of cardinal compactness - is it equivalent?", was about an interesting property of strongly compact and weakly compact cardinals, in which theories with a bunch of small models imply the existence of a big model (roughly speaking).
The answer given by Yair Hayut was quite informative, and for the most part very intuitive and understandable. However, there was one part where I got lost.
"Let us assume now that $κ$ is measurable. Let $T$ be an $\mathcal{L}_{κ,κ}$-theory. Let us assume that for every $λ<κ$ there is a model $M_λ$ of size $≥$$λ$. Let $U$ be a $κ$-complete normal ultrafilter on $κ$ and let $j:V→Ult(V,U)$ be the ultrapower embedding." - This is Hayut's setup.
Here is the part of Hayut's answer which I am concerned with:
"Let us consider $M=j(\langle M_\alpha∣\alpha<\kappa\rangle)(\kappa)$. $M$ is a model of $j(T)$ in the model $Ult(V,U)$. Let $\mathcal{L}$ be the language of the theory $T$. For every $\varphi\in T$, $j(\varphi)$ is an $\mathcal{L}_{\kappa,\kappa}$-sentence in $j"\mathcal{L}$. By the closure of $Ult(V,U)$ under $κ$-sequences, since $Ult(V,U)⊨"M⊨j(φ)"$, we conclude that $V⊨"M⊨j(φ)"$... this is the part which I don't understand. $\varphi$ is an $\mathcal{L}_{\kappa,\kappa}$-sentence of any language.
Why does the ultrapower witnessing $0$-hugeness imply that if $Ult(V,U)\models(M\models j(\varphi))$ then $V\models(M\models j(\varphi))$? I understand why this is true for first-order finitary $\varphi$, but here $\varphi$ is an $\mathcal{L}_{\kappa,\kappa}$-sentence.
A simple proof shows that this cannot be generalized to $\mathcal{L}_{\lambda,\lambda}$-sentences for any $\lambda>\kappa$. What is special about $\mathcal{L}_{\kappa,\kappa}$ that implies it has this property?
Could anybody perhaps give a proof which goes through the steps a bit more slowly?
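For reference, the objects in the quoted passage can be unpacked as follows. This is only a sketch of the setup, relying on standard facts about normal ultrapowers (Łoś's theorem and $[\mathrm{id}]_U=\kappa$). It identifies $Ult(V,U)$ with its transitive collapse, and the function name $f$ is introduced here for illustration; it does not appear in Hayut's answer.

$$
\begin{aligned}
&\operatorname{crit}(j)=\kappa, \qquad {}^{\kappa}Ult(V,U)\subseteq Ult(V,U), \qquad [\mathrm{id}]_U=\kappa \ \text{(by normality of } U\text{)},\\
&j(f)(\kappa)=[f]_U \quad \text{for every function } f \text{ with domain } \kappa,\\
&M=j(\langle M_\alpha\mid\alpha<\kappa\rangle)(\kappa)=[\alpha\mapsto M_\alpha]_U,\\
&Ult(V,U)\models "M\models j(T)" \iff \{\alpha<\kappa : M_\alpha\models T\}\in U,\\
&Ult(V,U)\models "|M|\geq\kappa" \iff \{\alpha<\kappa : |M_\alpha|\geq\alpha\}\in U.
\end{aligned}
$$

The last two equivalences are instances of Łoś's theorem, and both sets on the right are all of $\kappa$ by the assumption that each $M_\alpha$ is a model of $T$ of size at least $\alpha$. What remains, and what the question asks about, is the passage from satisfaction inside $Ult(V,U)$ to satisfaction in $V$.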