Setting
Definition: A theory $\pmb{T}$ has a $\forall\exists$-axiomatization if it can be axiomatized by sentences of the form $$\forall v_1\ldots \forall v_n \exists w_1 \ldots \exists w_m ~~ \phi(\bar{v},\bar{w})$$ where $\phi$ is a quantifier-free $\mathcal{L}$-formula.
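As a standard illustration of the shape of such sentences (not part of the exercise, and assuming the language contains a binary relation symbol $<$), density and the absence of endpoints in a linear order are both expressed by $\forall\exists$-sentences:

$$\forall x \forall y \exists z ~~ \big(x < y \rightarrow (x < z \wedge z < y)\big), \qquad \forall x \exists y \exists z ~~ (y < x \wedge x < z).$$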
I am working through a multi-step proof to show that $\pmb{T}$ has a $\forall\exists$-axiomatization.
Step one: assuming $\pmb{T}$ has a $\forall\exists$-axiomatization, I showed that whenever $(\mathcal{M}_i : i \in \mathbb{I})$ is a chain of models of $\pmb{T}$, $$\mathcal{M} = \bigcup \mathcal{M}_i \models \pmb{T}.$$
Looking ahead: now I am showing that the converse also holds. Suppose that whenever $(\mathcal{M}_i : i \in \mathbb{I})$ is a chain of models of $\pmb{T}$, the union $\bigcup \mathcal{M}_i$ is a model of $\pmb{T}$. Let $\Gamma = \{ \phi : \phi \text{ is a $\forall\exists$-sentence and $\pmb{T} \models \phi$}\}$. Let $\mathcal{M} \models \Gamma$. I will show $\mathcal{M} \models \pmb{T}$.
Step two: Now I want to show that there is $\mathcal{N} \models \pmb{T}$ such that if $\psi$ is an $\exists\forall$-sentence and $\mathcal{M} \models \psi$, then $\mathcal{N} \models \psi$.
My Attempt at step two
We construct the theory $\pmb{T}^{*} = diag_{\forall}(\mathcal{M}) \cup \pmb{T}$, where $diag_{\forall}(\mathcal{M})$ is the set of universal sentences satisfied by $\mathcal{M}$. If $\pmb{T}^{*}$ is satisfiable, then let $\mathcal{N} \models \pmb{T}^{*}$.
Now we show $\pmb{T}^{*}$ is consistent. By compactness, it suffices to show that every finite subset of $\pmb{T}^{*}$ is satisfiable. Let $\pmb{T}^{*'} = diag_{\forall}' \cup \pmb{T}'$, where $diag_{\forall}'$ and $\pmb{T}'$ are finite subsets of $diag_{\forall}$ and $\pmb{T}$. Next let: $$\gamma = \forall \bar{v} \exists \bar{w} \phi(\bar{v},\bar{w}) \in \pmb{T}', \quad \theta = \forall \bar{x} \rho(\bar{x}) \in diag_{\forall}'.$$
Since $\pmb{T}$ is consistent, we can assume $\mathcal{N} \models \pmb{T}$, therefore $\mathcal{N} \models \gamma(\bar{a},\bar{b})$ for some $\bar{a},\bar{b} \in \mathbb{N}$. Now we show for each $\theta \in diag_{\forall}'$:
$$\mathcal{N} \models \gamma \Rightarrow \mathcal{N} \models \theta.$$
So we have:
$$\mathcal{N} \models \gamma ~~\Rightarrow~~ \mathcal{N} \models \forall \bar{v} \exists \bar{w} \phi(\bar{v},\bar{w}) ~~\Rightarrow~~ \mathcal{N} \models \forall \bar{v} \phi(\bar{v},\bar{b}) ~~~ \text{for some } \bar{b} \in \mathbb{N} ~~\Rightarrow~~\\ \mathcal{N} \models \forall \bar{x} \rho(\bar{x}) ~~\Rightarrow~~ \mathcal{N} \models \theta.$$
Here we simply renamed $\phi$ as $\rho$ and $\bar{v}$ as $\bar{x}$. Thus each $\pmb{T}^{*'}$ is consistent, so $\pmb{T}^{*}$ is consistent, and hence $\mathcal{N} \models \pmb{T}^{*}$.
Next we show if $\mathcal{N} \models \pmb{T}$ and $\mathcal{M} \models \psi$, then $\mathcal{N} \models \psi$. Let $\psi = \exists \bar{v} \forall \bar{w} \sigma(\bar{v},\bar{w})$ where $\sigma$ is quantifier free. Then: $$\mathcal{M} \models \exists \bar{v} \forall \bar{w} \sigma(\bar{v},\bar{w}) ~~\Rightarrow~~ \mathcal{M} \models \forall \bar{w} \sigma(\bar{a},\bar{w}) \text{ for some } \bar{a} \in \mathbb{M}.$$
And since $\mathcal{N} \models diag_{\forall}(\mathcal{M})$, we have:
$$\mathcal{N} \models \forall \bar{w} \sigma (\bar{a},\bar{w}) ~~\Rightarrow~~ \mathcal{N} \models \exists \bar{v} \forall \bar{w} \sigma (\bar{v},\bar{w}) ~~\Rightarrow~~ \mathcal{N} \models \psi.$$
My Problem
I let $\mathcal{N} \models \pmb{T}$ by assumption, using the fact that $\pmb{T}$ is consistent. Is this justified and proper in this context?
When unraveling the implication of $\mathcal{N} \models \gamma$, is it correct to evaluate it at some $\bar{b} \in \mathbb{N}$?
Here is my attempt. Fix a theory $T$, let $\Gamma = \{\phi \ : \ \phi \in \Pi_2 \cap Cl(T)\}$, and suppose $M\models \Gamma$. Then $T$ is consistent, since otherwise $M\models \forall x \exists y (x\neq x)$.
Let $S = \Sigma_2 \cap Th(M)$. We want to show that $T \cup S$ is consistent. It suffices to show that $T \cup \Delta$ is consistent for every finite $\Delta \subseteq S$. Write $$\Delta = \{\phi_1,\ldots,\phi_n\} = \{\exists \bar{x}\forall\bar{y}\psi_1(\bar{x},\bar{y}),\ldots, \exists \bar{x}\forall\bar{y}\psi_n(\bar{x},\bar{y})\}$$ where the $\psi_i$ are quantifier-free. If $T\cup \Delta$ is consistent, we are done. Otherwise $$ T \models \bigwedge_{i\leq n} \phi_i \rightarrow \bot $$ so $T \models \bigvee \neg \phi_i$, or more explicitly $T\models \bigvee \forall \bar{x} \exists \bar{y} \neg \psi_i(\bar{x},\bar{y})$. Hence $T\models\forall \bar{x} \exists \bar{y} \bigvee \neg \psi_i(\bar{x},\bar{y})$. This last sentence is a $\Pi_2$ consequence of $T$, hence $M$ is a model of it. Therefore $M\models \forall \bar{x} \bigvee \neg \forall \bar{y} \psi_i$, so finally $M \models \neg \exists \bar{x} \bigwedge \forall \bar{y} \psi_i$.
However $M\models \bigwedge \phi_i$ so $M\models \bigwedge\exists \bar{x} \forall \bar{y} \psi_i(\bar{x},\bar{y})$,
Edit: the following step is incorrect. Thus $M \models \exists\bar{x} \bigwedge \forall \bar{y}\psi_i(\bar{x},\bar{y})$, but this contradicts the previous paragraph.
This establishes that $T \cup S$ is consistent.
So at this point I am not sure how to proceed. I will leave this here in case I want to edit it later.
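One possible way around the flawed step, offered only as a sketch: rename the bound variables so that each $\phi_i$ uses its own pairwise disjoint tuples $\bar{x}_i,\bar{y}_i$ (this indexing is notation introduced here, not used above), and assume as before that $T \cup \Delta$ is inconsistent. The chain of implications would then read:

$$\begin{aligned}
T \models \bigvee_{i\leq n} \neg\phi_i
&~~\Rightarrow~~ T \models \bigvee_{i\leq n} \forall \bar{x}_i \exists \bar{y}_i \neg\psi_i(\bar{x}_i,\bar{y}_i) \\
&~~\Rightarrow~~ T \models \forall \bar{x}_1 \ldots \forall \bar{x}_n \exists \bar{y}_1 \ldots \exists \bar{y}_n \bigvee_{i\leq n} \neg\psi_i(\bar{x}_i,\bar{y}_i) \\
&~~\Rightarrow~~ M \models \forall \bar{x}_1 \ldots \forall \bar{x}_n \exists \bar{y}_1 \ldots \exists \bar{y}_n \bigvee_{i\leq n} \neg\psi_i(\bar{x}_i,\bar{y}_i) \\
&~~\Rightarrow~~ M \models \bigvee_{i\leq n} \neg\phi_i .
\end{aligned}$$

The third line uses that the sentence is a $\Pi_2$ consequence of $T$ and so lies in $\Gamma$. The last line uses that, with disjoint tuples, the prenex sentence is logically equivalent to the disjunction: if every disjunct $\forall \bar{x}_i \exists \bar{y}_i \neg\psi_i$ failed in $M$, there would be witnesses $\bar{a}_i$ with $M \models \forall \bar{y}_i \psi_i(\bar{a}_i,\bar{y}_i)$ for each $i$, and substituting $\bar{x}_i = \bar{a}_i$ would falsify the prenex sentence. If this is right, the conclusion contradicts $M \models \bigwedge \phi_i$ directly, without needing a single common witness $\bar{x}$ for all the $\phi_i$.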