We know that the Y-combinator is defined as: $$\text{Y}:=\lambda f.(\lambda x.f(xx))(\lambda x.f(xx))$$
Wikipedia gives the following SKI form: $$\text{Y}:=\text{S(K(SII))(S(S(KS)K)(K(SII)))}$$
Now the question is: What logical steps can we take to convert the first definition to the second?
While it is easy to show the equivalence between the two definitions, finding how the first definition can motivate and lead to the second definition is, in my opinion, a tricky task. I have added my proof as an answer, but all other ideas and suggestions are welcome.
Let's define $$\text{E}=\lambda\text{x. f (x x)}$$ which leads to:
$$ \begin{align*} \text{E x}&=\text{f (x x)}\\ &=\text{f (I x (I x))}\\ &=\text{f (S I I x)}\\ &=\text{(K f x) (S I I x)}\\ &=\text{S (K f) (S I I) x}\\ &=\text{(K S f) (K f) (S I I) x}\\ &=\text{S (K S) K f (S I I) x}\\ &=\text{S (K S) K f (K (S I I) f) x}\\ &=\text{S (S (K S) K)(K (S I I)) f x}\\ \therefore \text{ E}&=\text{S (S (K S) K)(K (S I I)) f}\\ &=\text{T f},\quad\text{where T}:=\text{S (S (K S) K)(K (S I I))} \end{align*} $$
Now $\text{Y}=\lambda\text{f. E E}$, so: $$\begin{align*} \text{Y f}&=\text{E E}\\ &=\text{T f (T f)}\\ &=\text{S T T f}\\ \therefore\text{ Y}&=\text{S T T}\\ &=\text{S S I T}\\ &=\text{S S I (S (S (K S) K)(K (S I I)))}\\ &=\text{S (K (S I I)) (S (S (K S) K)(K (S I I)))} \end{align*}$$
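Note: $\text{SSI}$ and $\text{S(K(SII))}$ are interchangeable here because they agree on all arguments: $\text{S S I T f}=\text{S T (I T) f}=\text{S T T f}=\text{T f (T f)}$, and $\text{S (K (S I I)) T f}=\text{K (S I I) f (T f)}=\text{S I I (T f)}=\text{T f (T f)}$.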
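For anyone who wants to check the steps mechanically, here is a minimal sketch of an SKI reducer in Python. It is only an illustration, not part of the proof: the encoding (atoms as strings, applications as pairs) and the helper names `app`, `step`, and `normalize` are my own assumptions. It uses leftmost-outermost reduction with a step limit, since $\text{Y f}$ itself has no normal form.

```python
# Terms: an atom is a string; an application is a pair (function, argument).
# The encoding and all helper names here are illustrative assumptions.

def app(*terms):
    """Left-associated application: app(a, b, c) means (a b) c."""
    result = terms[0]
    for t in terms[1:]:
        result = (result, t)
    return result

def step(term):
    """One leftmost-outermost reduction step, or None if term is in normal form."""
    # Unwind the application spine: term = head arg1 arg2 ...
    spine = []
    head = term
    while isinstance(head, tuple):
        spine.append(head[1])
        head = head[0]
    spine.reverse()
    if head == "I" and len(spine) >= 1:      # I x = x
        return app(*spine)
    if head == "K" and len(spine) >= 2:      # K x y = x
        return app(spine[0], *spine[2:])
    if head == "S" and len(spine) >= 3:      # S x y z = x z (y z)
        x, y, z = spine[:3]
        return app(x, z, (y, z), *spine[3:])
    # The head is not a redex, so reduce the first reducible argument.
    for i, arg in enumerate(spine):
        reduced = step(arg)
        if reduced is not None:
            return app(head, *spine[:i], reduced, *spine[i + 1:])
    return None

def normalize(term, max_steps=1000):
    """Reduce until no redex remains; give up after max_steps steps."""
    for _ in range(max_steps):
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt
    raise RuntimeError("no normal form found within the step limit")

SII = app("S", "I", "I")
T = app("S", app("S", app("K", "S"), "K"), app("K", SII))
Y_SSI = app("S", "S", "I", T)          # Y = S S I T
Y_WIKI = app("S", app("K", SII), T)    # Y = S (K (S I I)) T

if __name__ == "__main__":
    # T f x should reduce to f (x x), i.e. E = T f.
    print(normalize(app(T, "f", "x")) == ("f", ("x", "x")))
    # S S I and S (K (S I I)) agree on atoms t and f.
    print(normalize(app("S", "S", "I", "t", "f"))
          == normalize(app("S", app("K", SII), "t", "f")))
    # Both forms of Y find the fixed point of K a, which is a.
    print(normalize(app(Y_SSI, app("K", "a"))), normalize(app(Y_WIKI, app("K", "a"))))
```

The last check applies each form of $\text{Y}$ to $\text{K a}$ rather than to a bare $\text{f}$, because $\text{Y (K a)}$ terminates under leftmost-outermost reduction while $\text{Y f}$ would unfold forever.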