I want to understand why empty L-structures are illegal/not allowed in classical FOL. I read here but didn't understand the answer. It seems that there is some inference rule in FOL (as described in these notes) that becomes unsound when the underlying set of the L-structure is empty, but I don't understand why.
Is it because of some inference rule with quantifiers that empty L-structures are illegal, or is there a different reason why they are disallowed?
I also asked:
Is the reason that vacuous statements are True because empty L-structures are illegal?
as a suggestion that perhaps vacuous statements are the reason, but I still don't have an answer that explains it. Perhaps it's the inference rule; perhaps it's something else. I don't know yet.
If we do a careful analysis of the rules of inference for quantifiers in natural deduction or the sequent calculus, we see that the rules for quantifiers don't require the domain to be non-empty. (So this is one answer for what [primitive] rules are valid in an empty domain.)
I'll focus on the universal elimination rule. (Dually, the existential introduction rule has the same issue.) It is often written as $$\frac{\Gamma\vdash\forall x.\varphi(x)}{\Gamma\vdash\varphi(t)}\rlap{\small{\forall E,t}}$$ where $\Gamma$ is a finite set of formulas and $t$ is some term. Written this way, this rule seems to warrant deriving $\bot$ from $\forall x.\bot$, i.e. proving $\neg\forall x.\bot$, which is equivalent to $\exists x.\top$, i.e. the domain being non-empty.
The "trick" is that to apply this rule of inference, we need to have a term, $t$, even if it doesn't occur in the resulting formula. Here's where a more careful analysis can be helpful. Write $\Gamma\vdash_V \psi$ to mean $\Gamma\vdash\psi$ except that all formulas are only allowed to contain free variables in the set $V$. (Really, $\Gamma\vdash_V\psi$ is the more primitive notion and is useful for defining the universal introduction rule.) More cleanly, we can define a set of terms with free variables in a given set $V$, and (thus) a set of formulas with free variables in $V$. $\vdash_V$ is then a (meta-)relation between sets of formulas with free variables in $V$ and a formula with free variables in $V$. A (conditional) proof is then a derivation of a formula with no free variables, i.e. $\Gamma\vdash_\varnothing\psi$.
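To make the indexing by free-variable sets concrete, here is a minimal Python sketch. The `Var`/`Const`/`App` syntax and the helper names are illustrative choices for this answer, not from any standard library:

```python
from dataclasses import dataclass

# Toy first-order term syntax: a term is a variable, a constant,
# or a function symbol applied to a tuple of terms.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    name: str

@dataclass(frozen=True)
class App:
    fn: str
    args: tuple

def free_vars(t):
    """Set of free variables occurring in a term."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Const):
        return set()
    return set().union(*(free_vars(a) for a in t.args)) if t.args else set()

def is_term_over(t, V):
    """Is t a term with free variables contained in the set V?"""
    return free_vars(t) <= V

# With V = ∅, only closed terms qualify:
assert is_term_over(Const("c"), set())
assert is_term_over(App("f", (Const("c"),)), set())
assert not is_term_over(Var("x"), set())
```

The point of the indexing is visible in the last assertion: a variable is a perfectly good term over $V=\{x\}$, but it is not a term over $V=\varnothing$, so it cannot be used to instantiate a universal in a derivation of $\vdash_\varnothing$.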
To (indirectly) show that the domain is non-empty, we'd want to show $\vdash_\varnothing \neg\forall x.\bot$. Here's an attempted derivation: $$\dfrac{\dfrac{\dfrac{}{\forall x.\bot\vdash_\varnothing\forall x.\bot}\rlap{\small{Ax}}}{\forall x.\bot\vdash_\varnothing\bot}\rlap{\small{\forall E,?}}}{\vdash_\varnothing\neg\forall x.\bot}\rlap{\small{{\to}I}}$$
This derivation fails because, assuming there are no constants in our signature, we have no terms (with no free variables) at all to use to apply universal elimination.
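We can check this failure mechanically: enumerate the closed terms of a signature up to some nesting depth. (A hypothetical helper for this answer; the signature is given as a set of constants and a map from function symbols to arities.)

```python
from itertools import product

def closed_terms(constants, functions, max_depth):
    """All closed terms of a signature, as strings, up to a nesting depth.

    constants: set of constant-symbol names.
    functions: dict mapping each function symbol to its arity.
    """
    terms = set(constants)
    for _ in range(max_depth):
        new = set(terms)
        for f, arity in functions.items():
            for args in product(terms, repeat=arity):
                new.add(f + "(" + ",".join(args) + ")")
        terms = new
    return terms

# With a constant available, universal elimination always has a witness term:
assert closed_terms({"c"}, {"s": 1}, 2) >= {"c", "s(c)", "s(s(c))"}

# With no constants, there are no closed terms at any depth, so the
# ∀E step in the attempted derivation has nothing to instantiate with:
assert closed_terms(set(), {"s": 1}, 5) == set()
```

Function symbols alone never help: applying a function requires arguments, and with no constants there is nothing to start from, so the set of closed terms stays empty.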
What happens in many contexts, particularly for Hilbert-style systems, is that the set of terms always contains all variables. In that context, it is always possible to find a term: just pick some free variable. While indexing terms and formulas by sets of free variables is slightly more complicated than simply having a single set of terms/formulas, I find it to be significantly cleaner and clearer.
We get the semantic analogue of the above as follows. If the set of terms always includes all free variables, then to interpret terms we must say what is done with free variables. This means our interpretation function must specify the values of (typically countably infinitely many) free variables. This forces our semantic domain to be non-empty and is rather awkward. For example, we have uncountably many effectively equivalent interpretations even when we pick a finite semantic domain. If we index our interpretation functions by sets of free variables, then we only need to give values for the finite set of free variables we actually use. In particular, when that set is empty, we only need to give values for closed terms. Since there may be no closed terms, this allows the domain to be empty.
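A minimal sketch of this indexed semantics, with hypothetical names chosen for this answer: an environment supplies values only for the finitely many free variables actually in play, so an interpretation over the empty domain is unproblematic as long as there are no constants and we only evaluate with the empty environment.

```python
def eval_term(t, interp, env):
    """Evaluate a term under an interpretation and a finite environment.

    t is a nested tuple: ("var", x), ("const", c), or ("app", f, args).
    interp gives meanings for constant and function symbols; env gives
    values ONLY for the free variables of t (indexed by a finite set V),
    rather than for countably many variables up front.
    """
    tag = t[0]
    if tag == "var":
        return env[t[1]]  # only variables actually used need values
    if tag == "const":
        return interp["consts"][t[1]]
    f, args = t[1], t[2]
    return interp["funcs"][f](*(eval_term(a, interp, env) for a in args))

# An interpretation over the EMPTY domain: no constants, no functions,
# and with V = ∅ there is nothing we are ever forced to evaluate.
empty_interp = {"consts": {}, "funcs": {}}

# Contrast: a non-empty domain interpreting a constant and a function symbol.
nat_interp = {"consts": {"zero": 0}, "funcs": {"succ": lambda n: n + 1}}
t = ("app", "succ", (("app", "succ", (("const", "zero"),)),))
assert eval_term(t, nat_interp, {}) == 2  # closed term, empty environment
```

Note that `empty_interp` never has to name an element of the domain; the awkwardness only arises in the unindexed setup, where every interpretation must assign values to all variables at once.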
So, as Eric Wofsey said in a comment on your other question, it's a bit of a "historical artifact", quite possibly one that arose somewhat accidentally. The core invalid rule, at least from the perspective of natural deduction/the sequent calculus, is applying universal elimination/existential introduction in a context where you have no terms. There are, of course, infinitely many rules derivable from this invalid rule that would also be invalid.