I have been reading Introduction to Algorithms by Cormen. Before explaining Strassen algorithm the book says this:
Strassen’s algorithm is not at all obvious. (This might be the biggest understatement in this book.)
The book just states the algorithm but don't explain why it works. So if we take the case of multiplication of two $2 \times 2$ matrices, it reduces the number of multiplications from $8$ to $7$ thereby reducing the complexity from $n^{3}$ to $n^{\lg 7}$ which is roughly $n^{2.8}$.
My question is why $7$ and say not $5$ or even lesser number? I mean they could have made the recursion tree even less "bushier".
Allow me to shamelessly copy an answer of mine to this question first.
So in short, $7$ is optimal. It's not obvious that it's optimal. It was a very clever idea, even, to find the $7$.