On the proof of convergence of Pòlya urns

81 Views Asked by At

I'm reading the proof of Proposition $2$ of the Appendix here. The proposition states

Let $d\ge 2$ and $S\ge 1$ be integers. Let also $(\alpha_1,\dots,\alpha_d)\in\mathbb{N}^d \setminus\{ 0 \}$. Let $(P_n)_{n\ge0}$ be the $d$-color Pòlya urn random process having $S\cdot Id$ as replacement matrix and $(\alpha_1,\dots,\alpha_d)$ as initial composition. Then, almost surely and in any $L^t$, $t\ge 1$, $$\frac{P_n}{nS}\to V\ \ \text{as}\ n\to\infty$$ where $V$ is a $d$-dimensional Dirichlet-distributed random vector, with parameters $(\frac{\alpha_1}S,\dots,\frac{\alpha_d}S)$.

What I know about the distribution of $P_n$ is that $\forall n\ge0$, let $X_{n+1}$ be a random variable in $\{ 1,\dots,d\}$ with distribution $\mathbb{P}(X_{n+1}=k|P_n)=\frac{(P_n)_k}{\|P_n\|_1}\ \forall k=1,\dots,d$, where $(P_n)_k$ denotes the number of balls of colour $k$ at time $n$ and $\|P_n\|_1=\sum^d_{k=1}(P_n)_k$. Notice that in this script it holds $(P_0)_k=\alpha_k \forall k=1,\dots,d$ and so $||P_0||_1=\sum^d_{k=1}\alpha_k=:\alpha$. We can also write $P_n=\sum^d_{k=1}(P_n)_k e_k$, where $e_k$ is the $k$-th vector in $\mathbb{R}^d$ canonical basis.

The proof starts with: Conditional expectation at time $n+1$ writes $$\mathbb{E}[P_{n+1}|\mathcal{F}_n]=\frac{\alpha+(n+1)S}{\alpha+nS}P_n\tag{1}$$ so that $\big(\frac{P_n}{\alpha+nS}\big)_{n\ge0}$ is a $[0,1]^d$-valued convergent martingale with mean $(\alpha_1/\alpha,\dots,\alpha_d/\alpha)$; let $V$ be its limit. All ok until now, now the proof proceeds with:

If $f$ is any function defined on $\mathbb{R}^d$, $$\mathbb{E}[f(P_{n+1})|\mathcal{F}_n]=\big(I+\frac{\Phi}{\alpha+ nS}\big)(f)(P_n) \tag{2}$$ where $$\Phi(f)(v) =\sum^d_{k=1}v_k[f(v+Se_k)−f(v)]$$ ($e_k$ is the $k$-th vector in $\mathbb{R}^d$ canonical basis and $v=\sum^d_{k=1}v_ke_k$).

My problem is how to get to equation $(2)$ from equation $(1)$. Explicitely, applying $f$ to $(1)$ we have: $$\mathbb{E}[f(P_{n+1})|\mathcal{F}_n]=f\big(\big(1+\frac{S}{\alpha+nS}\big)P_n\big)\tag{3}$$ For the latter term, I don't know how to proceed for any $f$, if it was linear, then $(3)=f(P_n)+\frac1{\alpha+nS}f(S\cdot P_n)$ and so there are some similarities with:$$\big(I+\frac{\Phi}{\alpha+nS}\big)(f)(P_n)=f(P_n)+\frac1{\alpha+nS}\Phi(f)(P_n)=f(P_n)+\frac1{\alpha+nS}\big(\sum^d_{k=1}(P_n)_k[f(P_n+Se_k)-f(P_n)]\big)\tag{4}$$

Am I missing something or did I do something wrong?

1

There are 1 best solutions below

2
On BEST ANSWER

You can't get from (1) to (2). Rather, both follow from knowledge of the conditional distribution of $P_{n+1}$ given $\mathcal F_n$. With your notation: $$ P_{n+1}=P_n+S\sum_{k=1}^d 1_{\{X_{n+1}=k\}}e_k. $$