I'm studying from Introduction to Lambda Calculus book, and working on proving this substitution lemma from one of the exercise:
2.2. Prove the following substitution lemma. Let $x \not \equiv y$ and $x \not \in FV(L) $. Then $M[x := N][y := L] \equiv M[y := L][x := N[y := L]]$.
I wonder if I can substitute variables in an expression out of order, e.g. $M \equiv \lambda x.\lambda y.xxyy$ and I want to pass $y = a$, while keeping outer $\lambda x$ unevaluated/unexpanded, i.e. $M[y:=a] = \lambda x.xxaa$. Is this something permissible in Lambda Calculus ? I guess what I want to know is what would be the equivalent of $M[y:=a]$ in $(\lambda x.\lambda y.xxyy)ab$ form of expansion/evaluation.
Thanks!
You seem to be conflating substitution with $\beta$-reduction.
Substitution $[y:=a]$ is a purely syntactic operation that replaces every free occurrences of $y$ in $M$ by $a$. There is nothing "evaluated" in a substitution.
$\beta$-reduction $(\lambda x.M)N \triangleright_\beta ...$ is an evaluation procedure that uses substitution, and in which every occurrence of $x$ that is bound by the outermost abstraction ($\lambda x$) is replaced by the term that the $\lambda$-term is applied to ($N$).
W.r.t. your example where $M :\equiv (\lambda x.\lambda y.xxyy)$, $M[y:=a] \equiv M$: Since all occurrences of $y$ in $M$ are bound by $\lambda y$, there are no free occurrences of $y$ to be replaced, so when applying the substitution, nothing is changed. The result is not $\lambda x.xxaa$. But this is not a question of order, and even less one of semantic "evaluation": You would get the same result for $M' :\equiv (\lambda y.\lambda x. xxyy)$; the only thing that matters for the syntactic operation of a substitution is whether $y$ is bound by some $\lambda y$ or not. In both $(\lambda x.\lambda y.xxyy)$ and $(\lambda y.\lambda x. xxyy)$, all occurrences of $y$ are bound, so in both terms, substituting $[y:=a]$ will yield the same term again.
You would get a different result if you did, for example, $(\lambda x.xxyy)[y:=a]$: Here, the two occurrences of $y$ are free, so the result of the substitution is $\lambda x.xxaa$. That there is also a $\lambda x$ in front of $xxyy$ does not matter at all. All you do by $(\lambda x.xxyy)[y:=a]$ is replace every free occurrence of $y$ in the term $\lambda x.xxyy$ by $a$.
The point where evaluation and order enters is beta reduction, and beta reduction is defined in terms of substitution - not the other way round. Substitution itself lives independently of beta reduction.
By definition $(\lambda x.M)N \triangleright_\beta M[x:=N]$. So you replace every free occurrence of $x$ in $M$ - $M$ is the term with the $\lambda x$ stripped away, so the occurrences of $x$ which are bound by $\lambda x$ are free in $M$ (if they are not bound by some other abstraction), and these are the ones to be substituted by the term $N$. Beta reduction works because the substitution happens on $M$, without the $\lambda x$. If you did $(\lambda x. M)[x:=N]$, nothing would happen because no $x$ is $(\lambda x.M)$ is free, so nothing would get substituted - just like in $(\lambda x.\lambda y.xxyy)[y:=a]$.
In terms of order, all the definitions tell you is that when applying beta reduction, you strip the leftmost $\lambda$ away, so in the case of beta reduction, substitution happens for the outmost $\lambda$-abstracted variable - but, again, in the subterm without the $\lambda$-abstraction.
But substitution itself is independent of beta reduction, you can do $(\lambda xy.xxyy)[x:a]$ or $(\lambda xy.xxyy)[y:a]$ or whatever you like. There is no evaluation involved, it's just a syntactic definition that has some outcome. But note that when doing $(\lambda xy.xxyy)[y:=a]$, you substitute in a term which still includes an abstraction $\lambda y$, substitution doesn't make the $\lambda y$ go away, so since all the $y$'s are bound, there is nothing to be replaced and the result will be the same term again.
It looks like what you tried to do is not a substitution where you "skipped" $\lambda x$ (there is no such thing as skipping in substitutions, you just go through the term and look which occurrences of the variable are free, and those which are get replaced), but what you apparently try to do is a beta reduction where you skip the outermost abstraction, $\lambda x$, i.e. a reduction of some application $(\lambda xy.xxyy)a$ but where $a$ goes in for $\lambda y$. This not possible: By definition, beta reduction operates on the outermost abstraction, so
$(\underbrace{\lambda x.}_{(\lambda x.} \underbrace{\lambda y.xxyy}_{M)})\underbrace{a}_{N}$
$\triangleright_\beta \underbrace{(\lambda y.xxyy)}_{M}[\underbrace{x}_{x}:=\underbrace{a}_{N}]$ (def. beta reduction)
$\equiv \underbrace{(\lambda y.aayy)}_{M[x:=N]}$ (def. substitution).