Let two $\mathbf{N}$-valued random variables $X$ and $Y$ be given, and let $\phi_X(s) = \sum_k \mathbf{P}(X = k) s^k$ and $\phi_Y(s) = \sum_k \mathbf{P}(Y = k) s^k$ be their respective probability generating functions.
It is stated in *Probability and Statistics by Example* by Suhov and Kelbert, p. 59, that the following two conditions are equivalent.
(i) $X$ and $Y$ are independent.
(ii) $\phi_{X + Y}(s) = \phi_X(s) \phi_Y(s)$.
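For concreteness, here is the easy direction in a familiar case: if $X$ and $Y$ are independent $\mathrm{Bernoulli}(p)$, then $\phi_X(s) = \phi_Y(s) = 1 - p + ps$, and $X + Y \sim \mathrm{Binomial}(2, p)$, so
$$\phi_{X+Y}(s) = \sum_{k=0}^{2} \binom{2}{k} p^k (1-p)^{2-k} s^k = (1 - p + ps)^2 = \phi_X(s)\,\phi_Y(s).$$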
I don't have a problem with the fact that (i) implies (ii). However, I don't understand why the converse is true.
The only justification offered in the book is that it follows from the uniqueness of the coefficients of a power series. But I don't understand why the fact that $X + Y$ has the same distribution as if $X$ and $Y$ were independent ought to imply that they are in fact independent.
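To spell out my worry: expanding both sides and matching the coefficients of $s^k$ only seems to give, for each $k$,
$$\sum_{x+y=k} \mathsf P(X=x,\ Y=y) = \sum_{x+y=k} \mathsf P(X=x)\,\mathsf P(Y=y),$$
which is an equality of sums along each diagonal, not the termwise identity $\mathsf P(X=x,\ Y=y) = \mathsf P(X=x)\,\mathsf P(Y=y)$ that independence requires.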
Can anybody fill in the details of the argument, or else refer me to an easily accessible source? Thanks.
Given that
$$\begin{align} \phi_{X+Y}(s) & = \sum_{k\in\mathcal D(X+Y)} \mathsf P(X+Y=k)\; s^k \\ & = \sum_{x\in\mathcal D(X)} \;\sum_{y\in \mathcal D(Y)} \mathsf P(X=x,\ Y=y)\; s^{x+y}, \end{align}$$
and that
$$\begin{align} \phi_X(s)\,\phi_Y(s) & = \left(\sum_{x\in\mathcal D(X)} \mathsf P(X=x)\; s^x \right)\cdot\left( \sum_{y\in\mathcal D(Y)} \mathsf P(Y=y)\; s^y\right) \\ & = \sum_{x\in\mathcal D(X)}\sum_{y\in\mathcal D(Y)} \mathsf P(X=x)\,\mathsf P(Y=y)\; s^{x+y}, \end{align}$$
and that independence means
$$X\perp Y \iff \mathsf P(X=x,\ Y=y) = \mathsf P(X=x)\,\mathsf P(Y=y) \quad \text{for all } x\in\mathcal D(X),\ y\in\mathcal D(Y),$$
it follows that, for all such $X$ and $Y$, independence is a necessary and sufficient condition for the pgf of the sum $X+Y$ to equal the product of the pgfs of $X$ and $Y$:
$$X\perp Y \iff \phi_{X+Y}(s) = \phi_X(s)\,\phi_Y(s).$$
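As a sanity check on the computation above, here is a minimal numerical sketch in Python (the marginal pmfs `px` and `py` are hypothetical choices, not from the book): under independence, the pmf of $X+Y$ is the convolution of the marginals, which is exactly the coefficient sequence of $\phi_X(s)\,\phi_Y(s)$.

```python
import numpy as np

# Hypothetical marginal pmfs on {0, 1, 2} -- any finite pmfs would do.
px = np.array([0.2, 0.5, 0.3])   # P(X = 0), P(X = 1), P(X = 2)
py = np.array([0.6, 0.1, 0.3])   # P(Y = 0), P(Y = 1), P(Y = 2)

# Under independence, the joint pmf is the outer product of the marginals,
# and the pmf of X + Y collects the joint mass along each diagonal x + y = k.
joint = np.outer(px, py)
pmf_sum = np.zeros(len(px) + len(py) - 1)
for x in range(len(px)):
    for y in range(len(py)):
        pmf_sum[x + y] += joint[x, y]

# The coefficients of phi_X(s) * phi_Y(s) come from polynomial multiplication,
# which is the same convolution of the two coefficient sequences.
prod_coeffs = np.convolve(px, py)

print(np.allclose(pmf_sum, prod_coeffs))  # True: phi_{X+Y} = phi_X * phi_Y
```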
**Edit Summary**

Updated to clarify the domains of the sum, and why we can switch from summing over all $k$ in the domain of $X+Y$ to a double sum over all $x$ in the domain of $X$ and all $y$ in the domain of $Y$. It's basically about expectations.
$$\begin{align} \sum_{z\in \mathcal D(X+Y)} s^z\;\mathsf P(X+Y=z) & = \sum_{z\in \mathcal D(X+Y)} s^z \sum_{x\in\mathcal D(X)} \mathsf P(X=x,\ X+Y=z) & \text{law of total probability} \\ & = \sum_{z\in \mathcal D(X+Y)} s^z \sum_{x\in\mathcal D(X)} \mathsf P(X=x,\ Y=z-x) & \text{equivalence of events} \\ & = \sum_{x\in\mathcal D(X)} \sum_{z\in \mathcal D(X+Y\mid X=x)} s^{z}\, \mathsf P(X=x,\ Y=z-x) & \text{reordering the summations} \\ & = \sum_{x\in\mathcal D(X)} \sum_{y\in \mathcal D(Y)} s^{x+y}\, \mathsf P(X=x,\ Y=y) & \text{change of index } y=z-x \\[3ex] \text{Alternatively:} \\ \mathsf E[s^{X+Y}] & = \mathsf E[\,\mathsf E[s^{X+Y}\mid X]\,] & \text{tower property} \\ & = \mathsf E[\,s^X\, \mathsf E[s^{Y}\mid X]\,] & \text{taking out what is known} \\[3ex] \therefore \mathsf E[s^{X+Y}] & = \mathsf E[\,s^X\, \mathsf E[s^Y]\,] & \text{by independence} \\ & = \mathsf E[s^X]\times \mathsf E[s^Y] & \text{by linearity of expectation} \end{align}$$
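And here is a companion sketch of the reindexing step itself; the joint pmf below is a hypothetical one, deliberately not a product of marginals, since the summation swap needs no independence. Both orders of summation give the same value of $\mathsf E[s^{X+Y}]$.

```python
import numpy as np

s = 0.7  # any fixed s with |s| <= 1

# A hypothetical joint pmf on {0, 1, 2} x {0, 1, 2}, deliberately NOT a
# product of marginals -- the summation swap does not need independence.
joint = np.array([[0.10, 0.05, 0.15],
                  [0.20, 0.10, 0.05],
                  [0.05, 0.20, 0.10]])
assert np.isclose(joint.sum(), 1.0)

# Double sum over (x, y):  E[s^(X+Y)] = sum_x sum_y s^(x+y) P(X=x, Y=y)
double_sum = sum(s ** (x + y) * joint[x, y]
                 for x in range(3) for y in range(3))

# Single sum over z: first collect P(X+Y = z) along the diagonals,
# then compute sum_z s^z P(X+Y = z).
pmf_sum = np.zeros(5)
for x in range(3):
    for y in range(3):
        pmf_sum[x + y] += joint[x, y]
single_sum = sum(s ** z * pmf_sum[z] for z in range(5))

print(np.isclose(double_sum, single_sum))  # True: the reindexing is exact
```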