Let $R$ be a commutative ring with identity and let $M,N$ be $R$-modules. Let $M\otimes N$ be the tensor product of $M$ and $N$. Given a pure tensor $x\otimes y \in M\otimes N$ with $x\in M$, $y\in N$, and a scalar $a\in R$, why can $a$ act on $x$ as if we were working in $M$? That is, why can we replace $ax\otimes y$ with $z\otimes y$, where $z = ax \in M$?
A more concrete example is taking $2 \otimes r \in \mathbb{Z} \otimes \mathbb{Z}/2\mathbb{Z}$, where $r\in \mathbb{Z}/2\mathbb{Z}$ is an arbitrary element. We have $2\otimes r = 1 \otimes 2r$ (by the properties of the tensor product), but $1\otimes 2r = 1\otimes 0$ since $2r = 0 \in \mathbb{Z}/2\mathbb{Z}$. Here we can see that we swap $2r$ with $0$ since $2$ acts as zero on $\mathbb{Z}/2\mathbb{Z}$.
I think it has something to do with the fact that $\otimes$ is a bilinear map $M\times N \rightarrow M \otimes N$, but I am not able to get the full picture.
I hope my question is clear.
You're exactly right that this has to do with $M \otimes N$ being the "universal" recipient of a bilinear map from $M \times N$.
Indeed, the key fact is this. If $f : M \times N \to X$ is bilinear, then notice:
$$ f(r \cdot m,n) \overset{(1)}{=} r \cdot f(m,n) \overset{(2)}{=} f(m, r \cdot n) $$
where in $(1)$ we've used linearity in the first argument (the $M$ slot) and in $(2)$ we've used linearity in the second argument (the $N$ slot). In particular, the equation $f(rm,n) = f(m,rn)$ holds in the codomain of every bilinear map.
Now the universal map to the tensor product, $\pi(m,n) = m \otimes n$, is bilinear, so that
$$ (rm) \otimes n = \pi(rm,n) = \pi(m,rn) = m \otimes (rn) $$
This is where the relation in question comes from. In fact we can say slightly more -- since the tensor product is the "universal" target of a bilinear map, we learn that the relations in the tensor product are exactly the relations that hold in every target of a bilinear map. That's not useful here, but can be useful in other settings.
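For a hands-on view of the relation $(rm) \otimes n = m \otimes (rn)$, here is a sketch assuming the standard construction of $M \otimes N$ as a quotient $F/K$. The names $F$, $K$ and $e_{(m,n)}$ are notation introduced for this sketch, not taken from the question: $F$ is the free $R$-module with one basis element $e_{(m,n)}$ for each pair $(m,n) \in M \times N$, and $K$ is the submodule generated by exactly the elements that force bilinearity:
$$ e_{(m+m',n)} - e_{(m,n)} - e_{(m',n)}, \quad e_{(m,n+n')} - e_{(m,n)} - e_{(m,n')}, \quad e_{(rm,n)} - r\,e_{(m,n)}, \quad e_{(m,rn)} - r\,e_{(m,n)} $$
Writing $m \otimes n$ for the class of $e_{(m,n)}$ in $F/K$, the last two generators become zero in the quotient, which says precisely
$$ (rm) \otimes n = r\,(m \otimes n) = m \otimes (rn) $$
So in $rm \otimes n$ the scalar $r$ really is acting on $m$ inside $M$ before the tensor is formed, and the quotient is what lets it move across.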
In the case of your concrete example, notice that if $f : \mathbb{Z} \times \mathbb{Z}/2 \to X$ is any bilinear map, then we have
$$ f(2,r) = 2f(1,r) = f(1,2r) = f(1,0) $$
Since (equational) truth in the tensor product is truth in the target of every bilinear map, we learn $2 \otimes r = 1 \otimes 0$. Moreover $1 \otimes 0 = 0$, because $\otimes$ is additive in the second argument.
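As a sanity check, it may help to test this against one specific bilinear map. The map below is chosen only for illustration, and the name $\mu$ is not from the question: take multiplication $\mu : \mathbb{Z} \times \mathbb{Z}/2 \to \mathbb{Z}/2$, $\mu(a,r) = ar$. Then
$$ \mu(2,r) = 2r = 0 = \mu(1,0), \qquad \mu(1,1) = 1 \neq 0 $$
The first chain is the relation above seen in one particular target. The second shows that $1 \otimes 1 \neq 0$, since the linear map $\mathbb{Z} \otimes \mathbb{Z}/2 \to \mathbb{Z}/2$ induced by $\mu$ sends $1 \otimes 1$ to $1$. So $2 \otimes r$ vanishes while the tensor product itself does not.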
I hope this helps ^_^