I know that this is a beginner's question that has been asked too many times, but I still haven't found an answer that lets me stop asking:
Given that a model/interpretation of a theory (in the Tarskian sense) is a set with some structure, how can there be models of set theory, since we know that the class of all sets (the range of the quantifiers of set theory) is not a set?
I especially wonder why
a) it is stressed so often and so strongly that models must be sets?
b) nevertheless sometimes proper classes are allowed? (see Wikipedia's Inner Model Theory: "models are transitive subsets or subclasses")
There are several reasons why model theory texts only look at models that are sets. These reasons are all related to the fact that model theory is itself studied using some (usually informal) set theory.
One benefit of sticking with set-sized models is that it makes it possible to perform algebraic operations on models, such as taking products and ultrapowers, without any set-theoretic worries.
Another benefit of requiring models to be sets is that it makes it possible to define the satisfaction relation $\vDash$ for each model. In other words, given a model $M$ in a language $L(M)$, we want to form $T(M) = \{ \phi \in L(M) : M \vDash \phi\}$. This can be done when $M$ is a set, by going through Skolem normal form. But it cannot be done, in general, when $M$ is a proper class, because of Tarski's undefinability theorem. In particular, if we let $M$ be the class-sized model $V$ of the language of set theory, then Tarski's theorem shows that $T(M)$ is not definable in $V$. We can define the truth of each individual formula (using the formula itself), but in general there may be no global definition of truth in a proper-class-sized model.
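To make the contrast concrete, here is a sketch of the standard recursive (Tarski-style) definition of satisfaction for a set-sized structure. Note that this is the textbook recursion on formulas, not the Skolem-normal-form route mentioned above, and the notation $(M,E)$, $s$, and $s(x/a)$ is introduced here only for illustration. For the language of set theory, with $E$ the interpretation of $\in$ and $s$ an assignment of elements of $M$ to variables:

$$
\begin{aligned}
(M,E) \vDash (x \in y)[s] &\iff s(x) \mathrel{E} s(y),\\
(M,E) \vDash (\neg\phi)[s] &\iff (M,E) \nvDash \phi[s],\\
(M,E) \vDash (\phi \wedge \psi)[s] &\iff (M,E) \vDash \phi[s] \text{ and } (M,E) \vDash \psi[s],\\
(M,E) \vDash (\exists x\,\phi)[s] &\iff \text{there is } a \in M \text{ with } (M,E) \vDash \phi[s(x/a)].
\end{aligned}
$$

When $M$ is a set, the collection of pairs $(\phi, s)$ with $(M,E) \vDash \phi[s]$ is a subset of a set, so the recursion produces it as a single object. When $M = V$, the assignments range over a proper class, and Tarski's theorem says there is no formula $\mathrm{Tr}(x)$ of set theory such that, for every sentence $\phi$,

$$
V \vDash \phi \leftrightarrow \mathrm{Tr}(\ulcorner \phi \urcorner).
$$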
Moreover, in model theory, there is no real need to look at proper-class-sized models, because there is already enough interesting behavior from set-sized models. The motivating examples are all sets (algebraic structures, partial orders, etc). And the completeness theorem shows that any consistent theory has a set-sized model (this includes ZFC). So model theorists generally restrict themselves to set-sized models.
Generally, people are only interested in proper-class-sized models in the context of set theory. The reason for the interest is that ZFC can't prove that there is a set model of ZFC (because, assuming ZFC is consistent, it can't prove Con(ZFC)), but it is possible to form proper-class-sized models of ZFC from a given proper-class-sized model of ZFC (e.g. the inner model $L$). This allows for some model-theoretic results about set theory, but many things that are taken for granted in model theory have to be re-checked when we move to proper-class-sized models. The re-checking is often routine, and it only comes up in advanced settings, where an author is not likely to make a big fuss about it. The benefit of this labor is that we can sometimes avoid having to assume Con(ZFC) as a hypothesis for a theorem about models of set theory.
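One common way to make sense of a proper-class-sized model without a global satisfaction relation is relativization. The following is only a sketch: $M(x)$ stands for a formula defining the class $M$ (an assumed notation, not taken from the text above), and $\phi^M$ is defined by recursion on the formula $\phi$ by bounding every quantifier to $M$:

$$
\begin{aligned}
(x \in y)^M &\;\text{is}\; x \in y,\\
(\neg\phi)^M &\;\text{is}\; \neg\,\phi^M,\\
(\phi \wedge \psi)^M &\;\text{is}\; \phi^M \wedge \psi^M,\\
(\exists x\,\phi)^M &\;\text{is}\; \exists x\,\bigl(M(x) \wedge \phi^M\bigr).
\end{aligned}
$$

Saying that the class $M$ is a model of ZFC then means that ZFC proves $\phi^M$ for each axiom $\phi$ of ZFC: a separate theorem for each axiom, rather than a single statement "$M \vDash \mathrm{ZFC}$". For $M = L$ this yields a relative consistency result (if ZFC is consistent, then so is ZFC together with $V = L$) without ever needing a set model.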
In summary, in any non-set-theoretic context, "model" will mean "set-sized model". In the context of set theory, this is still what "model" usually means; authors usually say "inner model" or "class model" for a proper-class-sized model. But some attention to context is needed when you are working with "models" of set theory, to make sure you read the term the way the author intended.