I am reading a machine learning paper, and this paragraph below doesn't quite make sense. How is x+y (mod p−1) and x*y (mod p) equivalent? Suppose p = 5 and x = 3 and y = 4, then clearly 7 mod 4 =3 isn't equal to 12 mod 5 = 2. I'm also not sure how the line "every nonzero residue modulo a prime can be represented as a power of a primitive root." relates to this fact.
Since the operands are presented to the neural network as unrelated abstract symbols, the operations x+y (mod p−1) and x*y (mod p) with a prime number p and non-zero x, y are indistinguishable from the neural network’s perspective (and similarly x − y (mod p − 1) and x/y (mod p)). This is because every nonzero residue modulo a prime can be represented as a power of a primitive root.
Working with your example with $p=5$, observe that $2$ is a primitive root. That means its powers yield all the nonzero congruence classes: $$ 2^0 \equiv 2^4 \equiv 1,\ 2^1 \equiv 2,\ 2^2 \equiv 4,\ 2^3 \equiv 3 \pmod{5}. $$ Since $2^4 \equiv 1 \pmod{5}$ the exponents work modulo $4$. $$ $$ So to multiply modulo $5$ we add exponents modulo $5-1 = 4$: $$ 3 \times 4 \equiv 2^3 \times 2^2 = 2^5 \equiv 2^1 = 2 \pmod{5}. $$