I am having trouble understanding why convolution is well-defined.
Let's take a simple example:
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X_{1}, X_{2}$ two real random variables with $P(X_{1}=3)=\frac{1}{2},P(X_{2}=2)=\frac{1}{4}$
and $P(X_{1}=1)=\frac{1}{5},P(X_{2}=4)=\frac{1}{3}$.
Then my understanding of convolution is
$P_{X_{1}+X_{2}}\circ A^{-1}$ where $A: X_{1} \times X_{2}\to \mathbb R,A(x_{1},x_{2})=x_{1}+x_{2}$
So surely if, for instance, $X_{1}+X_{2}=5$, I get more than one preimage, so how can convolution be well-defined?
In the above case, I would get:
$P_{X_{1}+X_{2}}\circ A^{-1}(5)=P_{X_{1}}(3)P_{X_{2}}(2)=\frac{1}{2}\times\frac{1}{4}=\frac{1}{8}$ while
$P_{X_{1}+X_{2}}\circ A^{-1}(5)=P_{X_{1}}(1)P_{X_{2}}(4)=\frac{1}{5}\times\frac{1}{3}=\frac{1}{15}$
I do not know where I am going wrong in my understanding of convolution. Any help is greatly appreciated.
If $P$ denotes the probability measure on $(\Omega,\mathcal F)$ then it induces for every random variable $Z$ a probability measure $P_Z$ on $(\mathbb R,\mathcal B)$ that is prescribed by:$$B\mapsto P(\{\omega\in\Omega\mid Z(\omega)\in B\})=P(\{Z\in B\})=P(Z\in B)$$
Here $\{Z\in B\}$ abbreviates $\{\omega\in\Omega\mid Z(\omega)\in B\}$ and $P(Z\in B)$ abbreviates $P(\{Z\in B\})$.
So we have $P_Z(B)=P(\{Z\in B\})$ for Borel subsets $B$ of $\mathbb R$.
Another notation for this measure is $PZ^{-1}$, prescribed by:$$B\mapsto P(Z^{-1}(B))=P(\{\omega\in\Omega\mid Z(\omega)\in B\})=P(\{Z\in B\})=P(Z\in B)$$
Observe that $P_Z$ and $PZ^{-1}$ are notations for the same measure.
The random vector $(X_1,X_2)$ also induces a probability measure,
this time denoted by $P_{(X_1,X_2)}$ and defined on $(\mathbb R^2,\mathcal B^2)$.
If $A:\mathbb R^2\to\mathbb R$ is prescribed by $(x,y)\mapsto x+y$ then $A$ is a Borel-measurable function.
That means that it can be regarded as a random variable on the probability space $(\mathbb R^2,\mathcal B^2,P_{(X_1,X_2)})$.
Applying the principle mentioned above to the space $(\mathbb R^2,\mathcal B^2,P_{(X_1,X_2)})$, we obtain the measure $P_{(X_1,X_2)}A^{-1}$ on $(\mathbb R,\mathcal B)$, and it is not difficult to deduce that:$$P_{X_1+X_2}=P_{A\circ(X_1,X_2)}=P_{(X_1,X_2)}A^{-1}$$
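One way to spell out that deduction, using only the definitions above, is to evaluate both sides on an arbitrary Borel set $B\subseteq\mathbb R$:$$P_{(X_1,X_2)}A^{-1}(B)=P_{(X_1,X_2)}(A^{-1}(B))=P((X_1,X_2)\in A^{-1}(B))=P(A(X_1,X_2)\in B)=P(X_1+X_2\in B)=P_{X_1+X_2}(B)$$
Note that $A^{-1}(B)$ is a set of pairs in $\mathbb R^2$, and the measure assigns one number to that whole set.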
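In your question you mix up the two notations: you write $P_{X_1+X_2}\circ A^{-1}$ where $P_{(X_1,X_2)}A^{-1}$ is meant, and you take the domain of $A$ to be $X_1\times X_2$ instead of $\mathbb R^2$. This can be a source of confusion on its own.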
If $X_1,X_2$ are random variables then so is $X_1+X_2$.
For example:$$P_{X_1+X_2}(\{5\})=P(X_1+X_2\in\{5\})=P(X_1+X_2=5)$$
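If, moreover, $X_1,X_2$ take only integer values, then this can be expanded to:$$\cdots=\sum_{n,m\in\mathbb Z\wedge n+m=5}P(X_1=n\wedge X_2=m)$$
Every pair $(n,m)$ with $n+m=5$ contributes one term to this single sum, so having more than one preimage of $5$ does not produce conflicting values: the contributions are added.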
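As a rough numerical illustration of that sum, here is a minimal Python sketch. It rests on assumptions that are not in the question: $X_1$ and $X_2$ are taken to be independent (which the products in the question implicitly use, so that $P(X_1=n\wedge X_2=m)=P(X_1=n)P(X_2=m)$), and the probability mass the question leaves unspecified is placed, purely hypothetically, on the values $0$ for $X_1$ and $10$ for $X_2$, so that no further pair sums to $5$. The function name `convolve` is only illustrative.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(pmf_x, pmf_y):
    """Return the pmf of X + Y, assuming X and Y are independent and discrete."""
    pmf_sum = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            # Each pair (x, y) adds its probability to the value x + y.
            pmf_sum[x + y] += px * py
    return dict(pmf_sum)


# Values 3, 1 and 2, 4 come from the question; the mass at 0 and at 10
# is a hypothetical completion so that each pmf sums to 1.
pmf_x1 = {3: Fraction(1, 2), 1: Fraction(1, 5), 0: Fraction(3, 10)}
pmf_x2 = {2: Fraction(1, 4), 4: Fraction(1, 3), 10: Fraction(5, 12)}

pmf_sum = convolve(pmf_x1, pmf_x2)
print(pmf_sum[5])  # 1/8 + 1/15 = 23/120
```

Under these assumptions the two preimages $(3,2)$ and $(1,4)$ of $5$ contribute $\frac18$ and $\frac1{15}$ respectively, and the probability of the sum being $5$ is their total $\frac{23}{120}$, not either value alone.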