What is the difference between an $n$-fold product distribution and a joint distribution of $n$ random variables? Is a product distribution only defined for independent random variables? I am confused about the definition of a product distribution.
Context: I am reading class notes by John Duchi that say the KL divergence between product distributions $P = P_1 \times P_2 \times \cdots \times P_n$ and $Q = Q_1 \times Q_2 \times \cdots \times Q_n$ satisfies the decoupling equality $KL(P \| Q) = \sum_{i = 1}^{n}KL(P_i \| Q_i)$.
If $X_1,\ldots,X_n$ are real-valued random variables on some $(\Omega,\mathcal F, P)$, then their joint distribution $Q_{joint}$ is the push-forward of $P$ under the $n$-dimensional random vector $(X_1,\ldots,X_n)$, i.e. $$ Q_{joint} \colon \mathcal B (\Bbb R^n)\to [0,1], \quad B \mapsto P\big( (X_1,\ldots,X_n) \in B\big). $$ The product measure $Q_{prod}$ (the product of the marginal distributions of the $X_i$) is the unique probability measure $$ Q_{prod} \colon \mathcal B (\Bbb R^n)\to [0,1] $$ with $$ Q_{prod}(B_1\times\cdots\times B_n)= P( X_1\in B_1)\cdots P(X_n\in B_n)$$ for all $B_1,\ldots,B_n\in\mathcal B (\Bbb R)$.
Both $Q_{joint}$ and $Q_{prod}$ are defined regardless of independence. However, the random variables $X_1,\ldots,X_n$ are independent under $P$ if and only if $Q_{joint}=Q_{prod}$.
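A small finite example may make the distinction concrete. The sketch below assumes two random variables taking values in $\{0,1\}$, so that a joint distribution is just a $2\times 2$ table of probabilities; the function names `product_of_marginals` and `is_independent` are made up for this illustration. It builds $Q_{prod}$ from the marginals of a given $Q_{joint}$ and checks whether the two coincide.

```python
import numpy as np

def product_of_marginals(joint):
    """Build Q_prod from a 2-D joint pmf: the outer product of its marginals."""
    p1 = joint.sum(axis=1)  # marginal distribution of X_1 (sum over X_2)
    p2 = joint.sum(axis=0)  # marginal distribution of X_2 (sum over X_1)
    return np.outer(p1, p2)

def is_independent(joint):
    """X_1, X_2 are independent iff Q_joint equals Q_prod."""
    return np.allclose(joint, product_of_marginals(joint))

# Dependent case: X_2 = X_1, each value with probability 1/2.
dependent_joint = np.array([[0.5, 0.0],
                            [0.0, 0.5]])

# Independent case: the joint is constructed as a product to begin with.
independent_joint = np.outer([0.3, 0.7], [0.6, 0.4])

print(product_of_marginals(dependent_joint))  # every entry is 0.25
print(is_independent(dependent_joint))        # False
print(is_independent(independent_joint))      # True
```

In the dependent case both measures exist and have the same marginals, but $Q_{joint}$ puts all its mass on the diagonal while $Q_{prod}$ spreads it uniformly over the four points.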
The same extends to random variables taking values in spaces other than $\Bbb R$.
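To connect this with the decoupling equality from the question, here is a sketch of the standard argument. It uses the question's notation (so $P$ and $Q$ now denote the two product distributions, not the underlying probability measure above) and assumes that each $P_i$ is absolutely continuous with respect to $Q_i$, so that the densities $dP_i/dQ_i$ exist. For product measures the density factorizes, $$ \frac{dP}{dQ}(x_1,\ldots,x_n)=\prod_{i=1}^{n}\frac{dP_i}{dQ_i}(x_i), $$ and therefore $$ KL(P\|Q)=\int \log\frac{dP}{dQ}\,dP=\sum_{i=1}^{n}\int \log\frac{dP_i}{dQ_i}(x_i)\,dP(x_1,\ldots,x_n)=\sum_{i=1}^{n}KL(P_i\|Q_i), $$ where the last step uses that the $i$-th marginal of $P$ is $P_i$. The factorization of the density is exactly where the product structure enters; for a general joint distribution it need not hold.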