Consider a finitely axiomatized theory $T$ with axioms $\phi_1,...,\phi_n$ over a first-order language with relation symbols $R_1,...,R_k$ of arities $\alpha_1,...,\alpha_k$. Consider the atomic formulas written in the form $(x_1,...,x_{\alpha_j})\ \varepsilon R_j$.
Translate this theory into a (finite) set-theoretic definition
$T(X) :\equiv (\exists R_1)\dots(\exists R_k)\,\big(R_1 \subseteq X^{\alpha_1} \wedge \dots \wedge R_k \subseteq X^{\alpha_k} \wedge \phi'_1 \wedge \dots \wedge \phi'_n\big)$
where $\phi'_i$ is $\phi_i$ with $(\forall x)$ replaced by $(\forall x \in X)$ and $(x_1,...,x_{\alpha_j})\ \varepsilon R_j$ replaced by $(x_1,...,x_{\alpha_j})\ \in R_j$ with $(x_1,...,x_{\alpha_j})$ an abbreviation for ordered tuples.
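For instance, take strict partial orders as a purely illustrative choice of $T$ (this particular theory is an assumption made for the example): one binary relation symbol $R$, with axioms $\phi_1 :\equiv (\forall x)\,\neg\,(x,x)\ \varepsilon R$ and $\phi_2 :\equiv (\forall x)(\forall y)(\forall z)\,\big((x,y)\ \varepsilon R \wedge (y,z)\ \varepsilon R \rightarrow (x,z)\ \varepsilon R\big)$. The translation then reads

$$T(X) :\equiv (\exists R)\,\Big(R \subseteq X^{2} \;\wedge\; (\forall x \in X)\,\neg\,(x,x) \in R \;\wedge\; (\forall x \in X)(\forall y \in X)(\forall z \in X)\,\big((x,y) \in R \wedge (y,z) \in R \rightarrow (x,z) \in R\big)\Big)$$

In this example every set $X$ satisfies $T(X)$, witnessed by $R = \emptyset$.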
To show that $T$ has a model — i.e. to show that $T$ is consistent — is to prove the statement $(\exists x) T(x)$ from the axioms of set theory.
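When $X$ is finite, the quantifier over the relations in $T(X)$ ranges over finitely many candidates, so $T(X)$ can be checked by exhaustive search. The following is only a minimal sketch under that finiteness assumption. The theory used (strict partial orders, with a single binary relation) and the names `subsets`, `irreflexive`, `transitive` and `witnesses` are illustrative assumptions, not part of the setup above.

```python
from itertools import combinations, product


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# Hypothetical example axioms, already relativized to X (the phi'_i).
def irreflexive(X, R):
    # (forall x in X) not (x, x) in R
    return all((x, x) not in R for x in X)


def transitive(X, R):
    # (forall x, y, z in X) ((x, y) in R and (y, z) in R -> (x, z) in R)
    return all(
        (x, z) in R
        for x, y, z in product(X, repeat=3)
        if (x, y) in R and (y, z) in R
    )


def witnesses(X, axioms):
    """Return every R that is a subset of X^2 and satisfies all axioms at once."""
    X = list(X)
    return [
        R
        for R in subsets(product(X, repeat=2))
        if all(axiom(X, R) for axiom in axioms)
    ]


if __name__ == "__main__":
    found = witnesses(range(3), [irreflexive, transitive])
    print(len(found) > 0)  # T(X) holds for X = {0, 1, 2} iff some witness exists
```

Note that all axioms are tested against the same candidate `R`, which mirrors the requirement that the conditions hold simultaneously. The search is exponential in $|X|^2$, so it only serves to illustrate the shape of $T(X)$ on very small sets.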
It is essential that the relations fulfill the conditions $\phi_i$ simultaneously. Thus it is not clear at first sight how the existence of a model of a theory that is not finitely axiomatizable can be proved (or even stated set-theoretically), since such a theory cannot be translated into a single finite sentence.
Some other things are not clear (to me):
In this setting, doesn't the consistency of every theory depend on the consistency of the chosen set theory? (If the set theory isn't consistent, it proves that every theory has a model.)
Furthermore, doesn't the consistency of a theory depend on the choice of the set theory in which $(\exists x) T(x)$ is proved? (In some set theories $(\exists x) T(x)$ can be proved, in others maybe not.)
What conditions does a theory have to fulfill in order to play the role of set theory in this setting? [It doesn't have to be the element relation $\in$ that $\varepsilon$ is mapped to. But one needs to be able to build ordered tuples of arbitrary length. What else? Something like powersets (since $R_i \subseteq X^{\alpha_i}$ is $R_i \in \mathcal{P}(X^{\alpha_i})$)? Is extensionality necessary? What is the general framework for discussing such questions?]
The method you describe for making "set-theoretical sense" of consistency of first-order theories is correct, but only applicable to finitely axiomatisable theories. Another possibility is to encode formulas by sets. Something along these lines (let $\phi_\alpha$ denote the formula encoded by $\alpha$):
Then you can define a formula $models(X, R_1, ..., R_k, \alpha)$ expressing "the structure $\langle X, R_1, ..., R_k \rangle$ is a model of $\phi_\alpha$". If you have a theory $T$ (i.e. a set of encoded formulas), then its consistency can be expressed by the formula $\exists X, R_1, ..., R_k\ \forall \alpha \in T~models(X, R_1, ..., R_k, \alpha)$.
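To make the idea of a $models$ predicate over encoded formulas concrete, here is a rough sketch for finite structures only. The encoding of formulas as nested tuples, the tags `"atom"`, `"not"`, `"and"`, `"imp"`, `"forall"`, and the helper names `holds` and `models_theory` are all assumptions made for this illustration. Inside set theory the corresponding definition would be given by recursion on the code $\alpha$ rather than by running a program.

```python
def holds(X, rels, phi, env=None):
    """Evaluate the encoded formula `phi` in the structure (X, rels).

    `rels` maps an index j to the relation R_j (a set of tuples over X);
    `env` assigns elements of X to the free variables of `phi`.
    """
    env = env or {}
    tag = phi[0]
    if tag == "atom":  # ("atom", j, (v_1, ..., v_n)) encodes (v_1, ..., v_n) in R_j
        _, j, variables = phi
        return tuple(env[v] for v in variables) in rels[j]
    if tag == "not":  # ("not", psi)
        return not holds(X, rels, phi[1], env)
    if tag == "and":  # ("and", psi, chi)
        return holds(X, rels, phi[1], env) and holds(X, rels, phi[2], env)
    if tag == "imp":  # ("imp", psi, chi)
        return (not holds(X, rels, phi[1], env)) or holds(X, rels, phi[2], env)
    if tag == "forall":  # ("forall", v, psi), with the quantifier bounded by X
        _, v, body = phi
        return all(holds(X, rels, body, {**env, v: a}) for a in X)
    raise ValueError(f"unknown formula tag: {tag!r}")


def models(X, rels, alpha):
    """The structure (X, rels) is a model of the sentence encoded by `alpha`."""
    return holds(X, rels, alpha)


def models_theory(X, rels, T):
    """(forall alpha in T) models(X, rels, alpha), for a finite collection T."""
    return all(models(X, rels, alpha) for alpha in T)


if __name__ == "__main__":
    # Hypothetical example: axioms of a strict partial order, encoded as data.
    irrefl = ("forall", "x", ("not", ("atom", 1, ("x", "x"))))
    trans = ("forall", "x", ("forall", "y", ("forall", "z",
             ("imp",
              ("and", ("atom", 1, ("x", "y")), ("atom", 1, ("y", "z"))),
              ("atom", 1, ("x", "z"))))))
    X = [0, 1, 2]
    less_than = {1: {(0, 1), (1, 2), (0, 2)}}
    print(models_theory(X, less_than, [irrefl, trans]))
```

The point of the design is that the theory `T` is now ordinary data that the predicate quantifies over, so nothing in `models_theory` depends on `T` being a single finite sentence. This sketch still iterates over `T`, which is why it only handles finite collections of axioms.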
However, one can also take the syntactic approach (i.e. a theory is consistent if and only if $\bot$ is not provable) and express the consistency in arithmetic. Now to your questions:
It is not strictly correct to say that the consistency of a theory depends on set theory. However, if your set theory is inconsistent, it can prove everything, including the consistency of every theory.
Yes, apart from trivial examples. For instance, ZFC does not prove $con(\rm ZFC)$ (assuming ZFC is consistent), but ZFC + "there is an inaccessible cardinal" does.
As previously noted, any theory that includes arithmetic can express consistency.