Lambda calculus is equipped with a primitive relation $=$ defined by the following rules:
(1) For any variable $x$ and lambda terms $M, N$, $\left(\lambda x. M\right) N = M\left[x := N\right]$;
(2) For any term $M$, $M = M$;
(3) For any terms $M, N$, $M = N$ implies $N = M$;
(4) For any terms $M, N, L$, $M = N$ and $N = L$ implies $M = L$.
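To make rule (1) concrete, here is a minimal sketch in Python, assuming a hypothetical encoding of terms as nested tuples; the substitution is naive (it handles shadowing but not variable capture, which is enough for simple examples):

```python
# Lambda terms as nested tuples:
#   ("var", x)        a variable
#   ("lam", x, body)  an abstraction λx.body
#   ("app", f, a)     an application f a

def subst(term, x, n):
    """M[x := n], naive: assumes bound variables of `term` do not
    capture free variables of `n`."""
    tag = term[0]
    if tag == "var":
        return n if term[1] == x else term
    if tag == "lam":
        y, body = term[1], term[2]
        if y == x:                 # x is shadowed under λy; stop substituting
            return term
        return ("lam", y, subst(body, x, n))
    return ("app", subst(term[1], x, n), subst(term[2], x, n))

def rule1(term):
    """Rule (1): contract a top-level redex (λx.M)N to M[x := N]."""
    if term[0] == "app" and term[1][0] == "lam":
        _, (_, x, m), n = term
        return subst(m, x, n)
    raise ValueError("not a top-level redex")
```

For instance, with `I = ("lam", "y", ("var", "y"))` encoding $\lambda y.y$, applying `rule1` to the encoding of $(\lambda x.xx)(\lambda y.y)$ yields the encoding of $(\lambda y.y)(\lambda y.y)$.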
On the other hand, we have the following definition for $=_{\beta}$ ($\beta$-conversion): For any terms $M, N$, $M =_{\beta} N$ if and only if for some $n \geq 0$, there exists some sequence $s$ of lambda terms of length $n + 1$ such that $s_{0} \equiv M$ and $s_{n} \equiv N$ and for any $0 \leq i < n$, $s_{i} \rightarrow_{\beta} s_{i + 1}$ or $s_{i + 1} \rightarrow_{\beta} s_{i}$, where $\rightarrow_{\beta}$ is one-step beta reduction.
I would like to prove that $=$ and $=_{\beta}$ are equivalent. To prove that $M =_{\beta} N$ implies $M = N$, we can first prove the following lemma: for any $n \geq 0$, if there exists some sequence $s$ of lambda terms of length $n + 1$ such that $s_{0} \equiv M$ and $s_{n} \equiv N$ and for any $0 \leq i < n$, $s_{i} \rightarrow_{\beta} s_{i + 1}$ or $s_{i + 1} \rightarrow_{\beta} s_{i}$, then $M = N$. This is not hard to prove by mathematical induction on $n$, using properties of $\rightarrow_{\beta}$.
However, I haven't found a good way to prove the converse direction: $M = N$ implies that for some $n \geq 0$, there exists some sequence $s$ of lambda terms of length $n + 1$ such that $s_{0} \equiv M$ and $s_{n} \equiv N$ and for any $0 \leq i < n$, $s_{i} \rightarrow_{\beta} s_{i + 1}$ or $s_{i + 1} \rightarrow_{\beta} s_{i}$. It seems to me that from $M = N$ alone, I cannot infer anything that helps me proceed. I tried proving the contrapositive instead: if there is no $n \geq 0$ such that there exists some sequence $s$ of lambda terms of length $n + 1$ such that $s_{0} \equiv M$ and $s_{n} \equiv N$ and for any $0 \leq i < n$, $s_{i} \rightarrow_{\beta} s_{i + 1}$ or $s_{i + 1} \rightarrow_{\beta} s_{i}$, then $M \neq N$. But then the problem is that our definition defines $M = N$ and says nothing about $M \neq N$.
Does anyone have any clue about this proof?
When $M = N$ holds, you can actually derive it by applying the inference rules (1)$-$(4) you listed a finite number of times. For instance, you can derive $(\lambda x.xx)(\lambda y.y) = \lambda y.y$ via the following derivation, which uses rule (1) twice and rule (4) once, where the conclusions of the two instances of rule (1) are the two premises of rule (4).
$$ \dfrac{\dfrac{}{(\lambda x.xx)\lambda y.y = (\lambda y.y)\lambda y.y}(1) \qquad \dfrac{}{(\lambda y.y)\lambda y.y = \lambda y.y}(1)}{(\lambda x.xx)\lambda y.y = \lambda y.y}(4) $$
Therefore, you can prove that $M = N$ implies $M =_\beta N$ by induction on the length of the derivation of $M = N$, where the length of a derivation is the number of inference rules used in it. (Strictly speaking, this is complete induction: in case (4) below, each premise's derivation may be shorter by more than one rule.) Let us do it, by considering the last inference rule used to derive $M = N$. According to the definition of $M = N$, there are four cases:
If the last inference rule is (1), then $M \equiv (\lambda x. M_1)M_2$ and $N \equiv M_1[x := M_2]$. According to the definition of $\to_\beta$, we have $M \to_\beta N$ and hence $M =_\beta N$.
If the last inference rule is (2), then $M \equiv N$ (that is, they are the same term). By applying the definition of $=_\beta$ with $n = 0$, we have $M \equiv s_0 \equiv N$ and hence $M =_\beta N$.
If the last inference rule is (3), then its premise is $N = M$. By induction hypothesis (since the derivation of $N = M$ is shorter than the derivation of $M = N$), we have $N =_\beta M$. This means that there is a sequence of $\lambda$-terms $s_0, \dots, s_m$ (for some $m \geq 0$) such that $s_0 \equiv N$ and $s_m \equiv M$ and, for any $0 \leq i < m$, either $s_i \to_\beta s_{i+1}$ or $s_{i+1} \to_\beta s_i$. Thus, the reversed sequence $s_m, \dots, s_0$ allows us to conclude that $M =_\beta N$.
If the last inference rule is (4), then its premises are $M = L$ and $L = N$. By induction hypothesis (since the derivations of $M = L$ and of $L = N$ are each shorter than the derivation of $M = N$), we have that $M =_\beta L$ and $L =_\beta N$. This means that there are two sequences of $\lambda$-terms $s_0, \dots, s_\ell$ and $t_0, \dots, t_n$ (for some $\ell,n \geq 0$) such that:
$s_0 \equiv M$ and $s_\ell \equiv L$ and, for any $0 \leq i < \ell$, either $s_i \to_\beta s_{i+1}$ or $s_{i+1} \to_\beta s_i$;
$t_0 \equiv L$ and $t_n \equiv N$ and, for any $0 \leq i < n$, either $t_i \to_\beta t_{i+1}$ or $t_{i+1} \to_\beta t_i$.
Thus, since $s_\ell \equiv L \equiv t_0$, the concatenated sequence $s_0, \dots, s_\ell, t_1, \dots, t_n$ allows us to conclude that $M =_\beta N$.
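The induction above is constructive, so it can be sketched as a short program that translates a derivation of $M = N$ into a witnessing zig-zag sequence. This is only an illustration under an assumed tuple encoding of derivations; in particular, the side condition of rule (1) (that the second term really is the contractum of the first) is recorded but not checked:

```python
# Derivations as nested tuples mirroring rules (1)-(4):
#   ("beta", M, N)     rule (1): N is M's contractum (side condition not checked)
#   ("refl", M)        rule (2)
#   ("sym", d)         rule (3), with premise derivation d
#   ("trans", d1, d2)  rule (4), with premise derivations d1, d2

def zigzag(d):
    """Turn a derivation of M = N into a sequence s_0 ≡ M, ..., s_k ≡ N
    in which adjacent terms are related by one-step beta reduction in
    one direction or the other, mirroring the four cases of the proof."""
    tag = d[0]
    if tag == "beta":                  # case (1): the one-step chain M, N
        return [d[1], d[2]]
    if tag == "refl":                  # case (2): the length-0 chain M
        return [d[1]]
    if tag == "sym":                   # case (3): reverse the premise's chain
        return list(reversed(zigzag(d[1])))
    if tag == "trans":                 # case (4): concatenate, dropping the
        s, t = zigzag(d[1]), zigzag(d[2])   # duplicated middle term L
        assert s[-1] == t[0]
        return s + t[1:]
    raise ValueError(tag)
```

On the example derivation above (with terms encoded as tuples), `zigzag` produces the three-term chain $(\lambda x.xx)(\lambda y.y)$, $(\lambda y.y)(\lambda y.y)$, $\lambda y.y$.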