My textbook, Introduction to Probability by Blitzstein and Hwang, states LOTUS (the law of the unconscious statistician) as follows:
Theorem 4.5.1 (LOTUS). If $X$ is a discrete r.v. and $g$ is a function from $\mathbb{R}$ to $\mathbb{R}$, then
$$E(g(X)) = \sum_x g(x) P(X = x),$$
where the sum is taken over all possible values of $X$.
This means that we can get the expected value of $g(X)$ knowing only $P(X = x)$, the PMF of $X$; we don't need to know the PMF of $g(X)$. The name comes from the fact that in going from $E(X)$ to $E(g(X))$ it is tempting to just change $x$ to $g(x)$ in the definition, which can be done very easily and mechanically, perhaps in a state of unconsciousness. On second thought, it may sound too good to be true that finding the distribution of $g(X)$ is not needed for this calculation, but LOTUS says it is true.
Before proving LOTUS in general, let's see why it is true in some special cases. Let $X$ have support $0, 1, 2, \dots$ with probabilities $p_0, p_1, p_2, \dots$, so the PMF is $P(X = n) = p_n$. Then $X^3$ has support $0^3, 1^3, 2^3, \dots$ with probabilities $p_0, p_1, p_2, \dots$, so
$$\begin{aligned} E(X) &= \sum_{n = 0}^\infty n p_n, \\ E(X^3) &= \sum_{n = 0}^\infty n^3 p_n. \end{aligned}$$
As claimed by LOTUS, to edit the expression for $E(X)$ into an expression for $E(X^3)$, we can just change the $n$ in front of the $p_n$ to an $n^3$; the $p_n$ is unchanged, and we can still use the PMF of $X$. This was an easy example since the function $g(x) = x^3$ is one-to-one. But LOTUS holds much more generally.
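As a sanity check on the one-to-one case above, here is a minimal Python sketch that computes $E(X^3)$ two ways: once with the LOTUS sum over the PMF of $X$, and once by first building the PMF of $X^3$ and applying the definition of expectation. The PMF `pmf_x` (a Binomial$(3, 1/2)$ on $\{0, 1, 2, 3\}$) and the names `g`, `pmf_y`, `lotus`, and `direct` are my own choices for illustration and do not come from the book.

```python
from fractions import Fraction

# Hypothetical PMF of X on {0, 1, 2, 3} (Binomial(3, 1/2)), chosen only for illustration.
pmf_x = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}


def g(x):
    """The one-to-one function from the quoted example."""
    return x ** 3


# LOTUS: sum g(x) * P(X = x) over the support of X.
lotus = sum(g(x) * p for x, p in pmf_x.items())

# Definition of expectation: first build the PMF of Y = g(X),
# adding up the probabilities of all x that map to the same y,
# then sum y * P(Y = y) over the support of Y.
pmf_y = {}
for x, p in pmf_x.items():
    y = g(x)
    pmf_y[y] = pmf_y.get(y, Fraction(0)) + p
direct = sum(y * p for y, p in pmf_y.items())

print(lotus, direct)
```

Exact fractions are used so the two results can be compared without floating-point rounding. Because $g(x) = x^3$ is one-to-one here, `pmf_y` has exactly the same probabilities as `pmf_x`, just attached to cubed values.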
This raises a question that the authors do not seem to address in this passage: what happens when the function $g$ is not one-to-one? How does one deal with this case, and how does our application of LOTUS change?
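To make my concern concrete, here is a small example of my own (not from the book). Suppose $X$ has support $-1, 0, 1$ with probabilities $p_{-1}, p_0, p_1$, and let $g(x) = x^2$. Then $g(X) = X^2$ has support $0, 1$, and its PMF is

$$P(X^2 = 0) = p_0, \qquad P(X^2 = 1) = p_{-1} + p_1.$$

So, unlike in the $X^3$ example, the probabilities of $X$ do not carry over one-for-one to $g(X)$, and the argument "same probabilities, transformed values" no longer seems to apply directly.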
I would greatly appreciate it if people could please take the time to explain this.