Maybe an answer to this question is that we want the successor to have "one more" element than the predecessor.
Is this explanation correct?
An objection I see is that this explanation is not sufficient: some other singleton, joined to $n$ by union, could just as well have provided this "one more" element.
So why precisely $\{n\}$ as the second term of the union? Why not another singleton?
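We want to define the natural numbers so that the order is given naturally by $\in$, i.e. we want $n<m$ (as numbers) if and only if $n \in m$ (as sets). With the successor $n \cup \{n\}$, the elements of the successor are exactly the elements of $n$ together with $n$ itself, so by induction each $m$ is the set $\{0, 1, \dots, m-1\}$ of all smaller numbers. That is the motivation for adding precisely $\{n\}$ and not any other singleton.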
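To see this concretely, here is a small sketch in Python that models sets as `frozenset`s. The helper name `von_neumann` is my own, introduced only for illustration; the check is only run for the first few numbers, so it is a sanity check rather than a proof.

```python
def von_neumann(n):
    """Build the set encoding n, starting from 0 = {} and applying S(k) = k ∪ {k}."""
    s = frozenset()                  # 0 is the empty set
    for _ in range(n):
        s = s | frozenset([s])       # successor: k ∪ {k}
    return s

nums = [von_neumann(k) for k in range(6)]

# n < m as numbers exactly when n ∈ m as sets (checked for 0..5 only)
order_matches = all(
    (a < b) == (nums[a] in nums[b])
    for a in range(6)
    for b in range(6)
)

# each successor has exactly one more element than its predecessor
sizes = [len(s) for s in nums]
```

Here `order_matches` compares the usual order on the integers with set membership, and `sizes` records how many elements each encoded number has.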
The encoding described is known as the von Neumann ordinals. An alternative encoding that is also valid but does not satisfy the above property is the Zermelo ordinals, where the successor of $n$ is simply $\{n\}$.
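For contrast, here is a similar sketch of the Zermelo encoding, again with sets modelled as Python `frozenset`s and a helper name (`zermelo`) that I made up for illustration. It shows that $\in$ only relates a number to its immediate successor, so it no longer coincides with $<$.

```python
def zermelo(n):
    """Build the Zermelo encoding of n, starting from 0 = {} and applying S(k) = {k}."""
    s = frozenset()              # 0 is the empty set
    for _ in range(n):
        s = frozenset([s])       # successor: {k}
    return s

z = [zermelo(k) for k in range(4)]

# every number is still an element of its immediate successor
pred_in_succ = all(z[k] in z[k + 1] for k in range(3))

# but 0 < 2 while 0 is not an element of 2 = {{∅}}
zero_in_two = z[0] in z[2]

# every nonzero Zermelo number is a singleton
sizes = [len(s) for s in z]
```

In this encoding the only element of $2 = \{\{\emptyset\}\}$ is $1$, so `zero_in_two` comes out false even though $0 < 2$.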