I have this confusion. Why do we express a nim sum as powers of 2 and why do nim sums cancel in pairs of 2 only?
For instance, let's take the nim game(6,10,15)
Now clearly
*6 = * $2^2$ + * $2^1$
*10 = * $2^3$ + * $2^1$
*15 = * $2^3$ + * $2^2$ + * $2^1$ + * $2^0$
Then the nim sum will be given by *6 + *10 + *15 = *3 but why is it that only powers of 2 cancel in pairs? Any player can remove one single nim from the first heap of 6 or may be even 5 but it looks as if the player is going to remove only 4 ( $2^2$) or 2 ($2^1$) nims only? Why is that so?
And also we don't we write *6 as sum of 5 and 1 or 3 and 3.. Now if express all 3 in this way and add any three pairs then the sum is no longer *3
As Brian said, a rigorous proof that this powers-of-two strategy works would be quite long. I'll try to provide a bit of intuition, though with a piece of a proof.
When you have a game of nim with two equal piles, the next player to move won't win (if the opponent plays well) because the other player could always mirror moves. Now, we can say a bunch of piles is like a single pile when the next player would lose if you added the single pile to the bunch. For example, (1,5) is like 4 since the second player can win (1,5,4).
The key fact then is that $(m,2^n)$ is like $m+2^n$ if $2^n>m$. You can prove that the second player can win $(m,2^n,m+2^n)$ by induction. It's clear for m=0, certainly.
Otherwise, the first move might be in the biggest pile, leaving it at least as large as $2^n$, and then the second player can make the smallest pile equal to the difference or the other two, and they then have a winning strategy by inductive hypothesis (m is now lower).
In the other case, if the first move leaves only one pile that's at least as big as $2^n$, then you can always move in that pile to the binary xor (the binary sum without carrying) of the other two piles. By using the inductive hypothesis repeatedly (lower values of n), you can pull out and cancel the highest powers of two that appear to see that the second player has a mirroring-like strategy.
But why wouldn't something like this work for, say, $3^n$? Basically because numbers aren't sums of distinct powers of 3, if the leading term of a base three expansion is $2*3^n$, then since two piles that are the same power of three would cancel, we can't hope to get as clean a decomposition that way.
What about something with no 2s, like the fibonacci base? Then either representations aren't unique or aren't fully arbitrary, so you can't move to an arbitrary xor.