Asymptotic behaviour of a multiple integral on the unit hypercube


A few days ago I found an interesting limit on the "problems blackboard" of my University: $$\lim_{n\to +\infty}\int_{(0,1)^n}\frac{\sum_{j=1}^n x_j^2}{\sum_{j=1}^n x_j}d\mu = 1.$$ The correct claim, however, is: $$\lim_{n\to +\infty}\int_{(0,1)^n}\frac{\sum_{j=1}^n x_j^2}{\sum_{j=1}^n x_j}d\mu = \frac{2}{3}.$$ In fact, following @Tetis' approach we have: $$I_n = \int_{(0,1)^n}\frac{\sum_{j=1}^n y_j^2}{\sum_{j=1}^{n}y_j}d\mu = \int_{(-1/2,1/2)^n}\frac{\frac{1}{2}+\frac{2}{n}\sum_{j=1}^n x_j^2+\frac{2}{n}\sum_{j=1}^n x_j}{1+\frac{2}{n}\sum_{j=1}^n x_j}d\mu,$$ now setting $x_j=-z_j$ and averaging the two integrals $$I_n = \int_{(-1/2,1/2)^n}\frac{\frac{1}{2}+\frac{2}{n}\sum_{j=1}^n x_j^2-\frac{4}{n^2}\left(\sum_{j=1}^n x_j\right)^2}{1-\frac{4}{n^2}\left(\sum_{j=1}^n x_j\right)^2}d\mu$$ follows, so: $$ I_n-\frac{2}{3}=\int_{(-1/2,1/2)^n} \left(-\frac{1}{2}+\frac{2}{n}\sum_{j=1}^n x_j^2\right)\frac{\frac{4}{n^2}\left(\sum_{j=1}^n x_j\right)^2}{1-\frac{4}{n^2}\left(\sum_{j=1}^n x_j\right)^2}d\mu < 0,$$ $$ \left| I_n-\frac{2}{3}\right|\leq\int_{\sum x_i^2\leq\frac{n}{4}}\left(\frac{1}{2}-\frac{2}{n}\sum_{j=1}^n x_j^2\right)\frac{\frac{4}{n^2}\left(\sum_{j=1}^n x_j\right)^2}{1-\frac{4}{n^2}\left(\sum_{j=1}^n x_j\right)^2}d\mu,$$ $$ \frac{2}{3}-I_n\leq \frac{n^{n/2}}{2^{n+1}} \int_{\sum x_i^2\leq 1}\left(1-\sum_{j=1}^{n}x_j^2\right)\frac{x_1^2}{1-x_1^2}d\mu.$$ The last bound, however, is too crude, since the RHS is $$ \Theta\left(\left(\sqrt{\frac{e\pi}{2}}\right)^n\frac{\log n}{n^{3/2}}\right).$$

My question now is: what is the asymptotic behaviour of $I_n$?

A second one is: can we prove $I_n\geq\frac{2}{3}-\frac{C}{n}$, for a suitable positive constant $C$, without the Central Limit Theorem?


UPDATE: After a while I came up with a proof of my own. The challenge is now to give the first three terms of the asymptotics, and possibly a continued fraction expansion for $I_n$.
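As a quick numerical sanity check (a sketch of mine, not part of the original problem, assuming numpy is available), a Monte Carlo estimate of $I_n$ suggests both the limit $2/3$ and a deficit of order $1/n$:

```python
# Monte Carlo estimate of I_n = E[sum(x_j^2)/sum(x_j)] with x_j ~ U(0,1).
import numpy as np

rng = np.random.default_rng(0)

def estimate_In(n, samples=100_000):
    x = rng.random((samples, n))
    return float(((x**2).sum(axis=1) / x.sum(axis=1)).mean())

for n in (2, 10, 50, 100):
    est = estimate_In(n)
    print(f"n={n:3d}  I_n ~ {est:.5f}  deficit 2/3 - I_n ~ {2/3 - est:.5f}")
```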

There are 6 answers below.

Accepted answer:

My question now is: what is the asymptotic behaviour of $I_n$?

Being unsure about what @Tetis' three (so far) answers achieve exactly, let me post the asymptotic behaviour @Byron's approach yields, pushing things one step further.

Using the notation in @Byron's post, one sees that $I_n$ is $n$ times the expectation of $X_1^2/S_n$, where $S_n=\sum\limits_{k=1}^nX_k$. Define $Z_n$ by the identity $$S_n=nE[X]+\sqrt{n}\sigma(X)Z_n,$$ where $\sigma^2(X)$ denotes the variance of every $X_k$. Then $Z_n$ converges in distribution to a centered normal random variable. Using the expansion $1/(1+t)=1-t+t^2+o(t^2)$ when $t\to0$, one gets $$ n\frac{X_1^2}{S_n}=\frac{X_1^2}{E[X]}\left(1-\frac{\sigma(X)Z_n}{\sqrt{n}E[X]}+\frac{\sigma^2(X)Z_n^2}{nE[X]^2}+o\left(\frac1n\right)\right), $$ hence $$ I_n=nE\left[\frac{X_1^2}{S_n}\right]=\frac1{E[X]}\left(E[X^2]-\frac{\sigma(X)E[X_1^2Z_n]}{\sqrt{n}E[X]}+\frac{\sigma^2(X)E[X_1^2Z_n^2]}{nE[X]^2}+o\left(\frac1n\right)\right). $$ Expanding $Z_n$, one sees that $$ \sigma(X)\sqrt{n}E[X_1^2Z_n]=E[X^3]-E[X^2]E[X], \qquad E[X_1^2Z_n^2]=E[X^2]+o(1), $$ hence, $$ \lim_{n\to\infty}n\cdot\left(I_n-\ell_X\right)=\kappa_X, $$ where $$ \ell_X=\frac{E[X^2]}{E[X]},\qquad\kappa_X=\frac{E[X^2]^2-E[X^3]E[X]}{E[X]^3}. $$ In the case at hand, $E[X^k]=\frac1{k+1}$ for every positive integer $k$, hence $$ \lim_{n\to\infty}n\cdot\left(I_n-\frac23\right)=-\frac19. $$
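As a sanity check (a sketch of mine, not from the original answer, assuming numpy), one can recompute $\ell_X$ and $\kappa_X$ from the exact uniform moments and test the $-\frac{1}{9n}$ correction by Monte Carlo:

```python
# Exact l_X and kappa_X from E[X^k] = 1/(k+1), then a Monte Carlo test of
# n*(I_n - 2/3) -> -1/9 at a moderate n.
from fractions import Fraction
import numpy as np

E1, E2, E3 = Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)
ell = E2 / E1                       # l_X = 2/3
kappa = (E2**2 - E3 * E1) / E1**3   # kappa_X = -1/9
print(ell, kappa)

rng = np.random.default_rng(1)
n, samples = 100, 200_000
x = rng.random((samples, n))
I_n = ((x**2).sum(axis=1) / x.sum(axis=1)).mean()
print(n * (I_n - 2/3))              # close to -1/9 = -0.111...
```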

Answer:

Let $X_j$ be a sequence of i.i.d. uniform(0,1) random variables, and set $Y_n=\sum_{j=1}^n X^2_j/\sum_{j=1}^n X_j$. The integral in the problem can be expressed as $I_n=\mathbb{E}(Y_n)$.

Note that $0\leq Y_n\leq 1$ almost surely, and that by the strong law of large numbers $$Y_n={\sum_{j=1}^n X^2_j \over \sum_{j=1}^n X_j}= {\sum_{j=1}^n X^2_j \over n}\times {n\over \sum_{j=1}^n X_j}\to {1\over 3}\times{2\over 1},$$ almost surely. By the dominated convergence theorem, we conclude that $I_n\to 2/3$.
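A one-path illustration of this argument (a sketch of mine, assuming numpy):

```python
# Along a single sample path, Y_n -> (1/3)*(2/1) = 2/3 by the strong law.
import numpy as np

rng = np.random.default_rng(2)
x = rng.random(1_000_000)
Y = np.cumsum(x**2) / np.cumsum(x)   # Y_n for n = 1, ..., 10^6
for n in (100, 10_000, 1_000_000):
    print(n, Y[n - 1])
```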

Answer:

In fact I fully agree with Byron Schmuland. You can easily see that:

$$ \frac{2}{3}=\frac{\int_{(0,1)^n} \sum_{i=1}^{n}x_i^2\,d\mu}{\int_{(0,1)^n}\sum_{i=1}^{n}x_i\,d\mu}\leq\int_{(0,1)^n} \frac{\sum_{i=1}^n x_i^2}{\sum_{i=1}^n x_i} d\mu $$

[Warning: after a hint from @Byron, I have edited this. From Chebyshev we can only obtain a lower bound of $\frac{1}{2}$.]

$$ \int_{(0,1)^n} \frac{\sum_{i=1}^n x_i^2}{\sum_{i=1}^n x_i} d\mu \leq\int_{(0,1)^n} \sum_{i=1}^n x_i^2\,d\mu \int_{(0,1)^n}\frac{1}{\sum_{i=1}^n x_i}d\mu = \frac{1}{3} \int_{(0,1)^n}\frac{n \,d\mu}{\sum_{i=1}^n x_i} $$

These are consequences of Chebyshev's integral inequality. In the last integral you can use the new variable $$s=\frac{1}{n}\sum_{i=1}^{n}x_i$$ and express the measure as $$d \mu^{(n)} = \frac{d \mu^{(n)}}{ds}ds=p^{(n)}(s)ds,$$ where $p^{(n)}(s)$ is (up to rescaling) the $n$-fold convolution of $$p^{(1)}(x)=\chi_{(0,1)}(x);$$ using Tauberian arguments and the Laplace transform you can show that it converges to $$p^{(n)}(s)=\frac{\sqrt{n}}{\sqrt{2\pi}\,\sigma}e^{-\frac{n(s-\bar{x})^2}{2\sigma^2}},$$ where $$\sigma^2=\mathbb{E}[(x-\bar{x})^2]=\frac{1}{12}.$$ Using Chebyshev's inequality you can now easily obtain a good upper bound for the integral:

$$\int_{(0,1)^n}\frac{n \,d\mu}{\sum_{i=1}^n x_i} \leq 2 + O\left(\frac{1}{n}\right)$$

so that the integral converges to a limit not greater than: $$\frac{2}{3}.$$

[After I wrote this I obtained a proof that the limit is $2/3$, following a different approach.]

In order to gain a finer knowledge of the asymptotic behavior of $I_n$ you can write down: $$K^{(n)}=\int_{(0,1)^n}\frac{n \,d\mu}{\sum_{i=1}^n x_i}=\int_0^1 \frac{p^{(n)}(s)}{s}ds=2\int_{-1/2}^{1/2}\frac{\bar{p}^{(n)}(x)}{1-2x}dx, $$ where $s=1/2+x$ and $$\bar{p}^{(n)}(x)=p^{(n)}(1/2+x),$$ so that: $$K^{(n)} = 2 \sum_{i=0}^{\infty}\mathbb{E}_{\bar{p}^{(n)}}[(2x)^{2i}]\simeq 2 + 2\sum_{i=1}^{\infty} \left(\frac{1}{3n}\right)^i(2i-1)!!$$

where the geometric series has been used and, because the distribution is even, the odd moments vanish. In the last expression you can see the Gaussian estimate of the moments; however, this expression does not converge. In fact, as the exponent of $x^{2i}$ increases, the peaks of the function $x^{2i}\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{x^2}{2\sigma^2}}$ move to $\pm\sigma\sqrt{2i}$, so the Gaussian estimate is accurate only if the peaks lie in the range of variation of our variable $x$: $\frac{\sqrt{2i}}{\sqrt{12n}} \ll \frac{1}{2}$, or, in other terms, $2i \ll 3n$. In order to obtain a non-diverging estimate of the moments for larger values of $i$, one way is to consider the partial moments of the Gaussian approximation in the restricted range $\left(-\frac{1}{2}+\frac{1}{2(n-1)},\frac{1}{2}-\frac{1}{2(n-1)}\right)$ and to account for the remaining volume in the range $\left(-\frac{1}{2},-\frac{1}{2}+\frac{1}{2(n-1)}\right)$, where the density amounts exactly to $\frac{(x+1/2)^{n-1}}{(n-1)!}dx$. The asymptotic behavior of the partial moments is then bounded above; indeed, if we assume the measure concentrated at the endpoints of the interval, we obtain: $$\mathbb{E}_{\bar{p}^{(n)}}[x^{2i}] \leq 2\left(\frac{1}{2}-\frac{1}{2n}\right)^{2i}$$ and the contribution of the two symmetric simplex-shaped ranges in the hypercube is bounded by: $$ \frac{1}{n!(2n)^n}+\frac{1}{2(n-1)^{n-1}(n-1)(n-1)!} $$
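To see how far the truncated Gaussian-moment series can be trusted in practice, here is a small numerical sketch (mine, not from the original answer, assuming numpy); the truncation index must stay well below the threshold discussed above:

```python
# Monte Carlo estimate of K_n = E[n/sum(x_i)] versus the truncated series
# 2 + 2*sum_{i>=1} (1/(3n))^i (2i-1)!!.
import numpy as np

rng = np.random.default_rng(3)

def K_mc(n, samples=200_000):
    x = rng.random((samples, n))
    return float((n / x.sum(axis=1)).mean())

def K_series(n, terms=8):
    total, dfact = 2.0, 1.0
    for i in range(1, terms + 1):
        dfact *= 2 * i - 1            # builds (2i-1)!!
        total += 2.0 * dfact / (3.0 * n) ** i
    return total

for n in (10, 40, 160):
    print(n, K_mc(n), K_series(n))
```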

Answer:

In order to rectify some statements I made in my last answer, and to sketch a way to make progress on the estimation of the asymptotic behaviour of $I_n$, I am writing this second answer.

1) Because I thought the integral was bounded below by $2/3$, I supposed it was eventually decreasing, since it is bounded above in terms of $\int_{(0,1)^n} \frac{n \,d \mu}{\sum_{i=1}^n x_i}$;

anyway, a comment by Byron Schmuland showed me I was wrong: $2/3$ seems in fact to be an upper bound. A way to show this, and to look for an estimate of the asymptotic behavior of the sequence, is to rewrite the integral as follows:

$$\int_{(0,1)^n} \frac{\sum_{i=1}^n x_i^2}{\sum_{i=1}^n x_i} d\mu^{(n)}=\int_{(0,1)^n} \frac{n x_n^2}{\sum_{i=1}^n x_i} d\mu^{(n)} =\int_0^1 dx \left(\int_{(0,1)^{n-1}} \frac{n x^2}{x+\sum_{i=1}^{n-1} x_i} d\mu^{(n-1)} \right)$$

I used the symmetry of the integral w.r.t. permutations of the index $i$. The inner integral in the last expression becomes:

$$\int_{(0,1)^{n-1}} \frac{n x^2}{x+\sum_{i=1}^{n-1} x_i} d\mu^{(n-1)} =\int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{n x^2 \bar{p}_{n-1}(y)}{x+(n-1)(1/2+y)} dy $$

with the substitution $ \sum_{i=1}^{n-1} x_i = (n-1) \left(y + \frac{1}{2}\right) $ and the definition $\bar{p}_{n-1}(y)=d \mu^{(n-1)}/dy $, the density of the measure w.r.t. the variable $y$. Now we will use the geometric series. In order to restrict ourselves to a closed range properly contained in the interior of the interval of convergence of the geometric series, we can split the integration interval into three parts as follows:

$$ \left(\int_{-\frac{1}{2}}^{-\frac{1}{2}+\frac{1}{2(n-1)}} + \int_{-\frac{1}{2}+\frac{1}{2(n-1)}}^{\frac{1}{2}-\frac{1}{2(n-1)}} +\int_{\frac{1}{2}-\frac{1}{2(n-1)}}^{\frac{1}{2}}\right) \frac{n x^2 \bar{p}_{n-1}(y)}{x+(n-1)(1/2+y)} dy $$

We now take advantage of the knowledge of the density $$\bar{p}_{n-1}(y)=\frac{(y+1/2)^{n-2}}{(n-2)!} \quad \forall y \in\left(-\frac{1}{2},-\frac{1}{2}+\frac{1}{2(n-1)}\right)$$ and of its parity (the function is even) in order to estimate the first and last integrals. It is easily seen that these integrals form two sequences tending to zero. The second integral can be rewritten as follows:

$$\left(\frac{nx^2}{x+\frac{n-1}{2}}\right) \sum_{k=0}^{\infty} \left(\frac{n-1}{x+\frac{n-1}{2}}\right)^{2k} \mathbb{E}_{\bar{p}_{n-1}\cdot \chi_{\left(-\frac{1}{2}+\frac{1}{2(n-1)},\frac{1}{2}-\frac{1}{2(n-1)}\right)}}[y^{2k}]$$

The integral w.r.t. $x$ of the leading term of the expansion is less than $2/3$ and converges to $2/3$, increasing with $n$. In the interval of interest, the function of $y$ appearing in the series converges uniformly to zero for each $n$, because this interval is interior to the disc of convergence of the series.
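The statement about the leading term can be checked directly: up to the restricted mass (which tends to $1$), the $k=0$ term contributes $\int_0^1 \frac{nx^2}{x+\frac{n-1}{2}}\,dx$. A short sketch (mine, assuming sympy):

```python
# The k = 0 term: int_0^1 n x^2/(x + (n-1)/2) dx stays below 2/3 and
# increases towards 2/3 as n grows.
import sympy as sp

x = sp.Symbol('x')
for n in (2, 5, 20, 100, 1000):
    val = sp.integrate(n * x**2 / (x + sp.Rational(n - 1, 2)), (x, 0, 1))
    print(n, float(val))
```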

Answer:

$$ \int_{(0,1)^n} \frac{\sum_i y_i^2}{\sum_i y_i} d \mu = \int_{\left(-\frac{1}{2},\frac{1}{2}\right)^n} \frac{\sum_i\left(\frac{1}{2}+x_i \right)^2}{\sum_i \left(\frac{1}{2}+x_i \right)} d\mu$$

developing the squares and collecting the terms, then rearranging the denominator in order to bring out a geometric series:

$$=\int_{\left(-\frac{1}{2},\frac{1}{2}\right)^n}\left(\frac{1}{2}+2\frac{\sum_i x_i^2}{n}+ 2\frac{\sum_i x_i}{n} \right)\sum_{k=0}^\infty\left(-\frac{2\sum_ix_i}{n}\right)^kd\mu=$$ $$=\frac{2}{3}+\int_{\left(-\frac{1}{2},\frac{1}{2}\right)^n}\left(\frac{1}{2}+2\frac{\sum_i x_i^2}{n}+ 2\frac{\sum_i x_i}{n} \right)\sum_{k=1}^\infty\left(-\frac{2\sum_ix_i}{n}\right)^k d\mu$$ Now we are able to construct a series for $I_n$. First we need to address the calculation of the following type of "expectation values":

$$\mathbb{E}[(x+y+z)^2]=3 \langle x^2\rangle = 3/12 $$

$$\mathbb{E}[(x+y+z)^6] = 3 \langle x^6\rangle + 6 \cdot 15 \langle x^4y^2\rangle + 90 \langle x^2 y^2 z^2\rangle$$

These are integrals over the volume $\left(-\frac{1}{2},\frac{1}{2}\right)^n$; we can reduce them to single integrals in a very simple way: observe that for an odd exponent the expectation value of a single variable is $0$; moreover, if several variables appear in a product, we can factorize the integral using Fubini's theorem to obtain a product of expectation values. So we have:

$$\mathbb{E}[(x+y+z)^6] = 3 \langle x^6\rangle + 6 \cdot 15 \langle x^4\rangle \langle y^2\rangle + 90 \langle x^2\rangle \langle y^2\rangle \langle z^2\rangle = \cdots$$

The two combinatorial factors account for the choice of which factors in the product supply each power, and for the choice of which variables play each role among the equivalent ones.

$$\mathbb{E}\left[ \left(\sum_ix_i\right)^{2k} \right]=\sum_{\bar{q}\,\vdash\, k}\binom{n}{|\bar{q}|}\frac{|\bar{q}|!}{\prod_j m_j(\bar{q})!}\binom{2k}{2q_1,\dots,2q_{|\bar{q}|}}\frac{1}{2^{2k}\prod_{i=1}^{|\bar{q}|}(2q_i+1)}$$

Here $\bar{q}$ runs over the partitions of $k$ into positive parts $q_1\geq q_2\geq\dots\geq q_{|\bar{q}|}\geq 1$, so that $|\bar{q}|\leq k$ and $\sum_{i=1}^{|\bar{q}|}q_i = k$; $m_j(\bar{q})$ is the multiplicity of the part $j$ in $\bar{q}$, the factor $\binom{n}{|\bar{q}|}\frac{|\bar{q}|!}{\prod_j m_j(\bar{q})!}$ counts the choices of the variables carrying the even powers, and the multinomial coefficient counts the ways the $2k$ factors distribute among them (I used $\mathbb{E}[x^{2q}]=\frac{1}{2^{2q}(2q+1)}$ for a uniform variable on $\left(-\frac{1}{2},\frac{1}{2}\right)$).

We can notice that the highest power of $n$ appearing in the moment of order $2k$ is $n^k$, so that our series inside the integral, if it converges (and it does converge, because the integral at the beginning of the whole calculation converges, just as the geometric series inside the integral does, so that Lebesgue's dominated convergence theorem guarantees that the series of the integrals converges to the value of the integral of the series), is a Laurent-type series in the inverse powers of $n$.

We can write the structure of the general moment as follows:

$$\mathbb{E}\left[ \left(\sum_ix_i\right)^{2k} \right]=\sum_{i=1}^k {n \choose i}\frac{c_i}{2^{2k}}=\frac{Q_{k}(n)}{2^{2k}},$$ where $Q_k(n)$ is a positive polynomial in $n$ of degree $k$. The coefficient of the leading term of $Q_k(n)$ is $\frac{(2k)!}{k!\,6^k}$. Analogous reasoning applies to the other moment we need:

$$\mathbb{E}\left[ \sum_j x_j^2 \left(\sum_ix_i\right)^{2k} \right]= n \mathbb{E}\left[x_1^2 \left(\sum_i x_i\right)^{2k}\right]=\frac{nP_k(n)}{2^{2(k+1)}}$$

and so:

$$I_n = \frac{2}{3}-\frac{1}{2}\left(\sum_{k=1}^\infty\frac{Q_k(n)}{n^{2k}}-\sum_{k=1}^{\infty}\frac{P_k(n)}{n^{2k}}\right)$$

Here is a list of the first few polynomials:

  1. $Q_1(n) = \frac{n}{3};\quad Q_2(n) =\frac{n}{5}+\frac{n(n-1)}{3}$
  2. $P_1(n) = \frac{1}{5}+\frac{n-1}{9};\quad P_2(n) =\frac{1}{7}+\frac{7(n-1)}{15}+\frac{(n-1)(n-2)}{9}$

Now my questions are: "Are these polynomials known?" "Is there a way to simplify the series?" "How fast does the series converge?" In order to address the last question we need an estimate of the coefficients of the inverse powers of $n$.
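A brute-force symbolic check of the polynomials listed above, for the concrete value $n=3$ (a sketch of mine, not from the original answer, assuming sympy):

```python
# Verify Q_1, Q_2, P_1, P_2 at n = 3 via E[(sum x)^{2k}] = Q_k(n)/2^{2k}
# and E[sum_j x_j^2 (sum x)^{2k}] = n P_k(n)/2^{2(k+1)}.
import sympy as sp

N = 3
xs = sp.symbols(f'x0:{N}')
S = sum(xs)
half = sp.Rational(1, 2)

def E(expr):
    """Expectation over the cube (-1/2, 1/2)^N with uniform measure."""
    for v in xs:
        expr = sp.integrate(expr, (v, -half, half))
    return expr

n = sp.Symbol('n')
Q1, Q2 = n / 3, n / 5 + n * (n - 1) / 3
P1 = sp.Rational(1, 5) + (n - 1) / 9
P2 = sp.Rational(1, 7) + 7 * (n - 1) / 15 + (n - 1) * (n - 2) / 9

print(E(S**2) == (Q1 / 4).subs(n, N))                                # True
print(E(S**4) == (Q2 / 16).subs(n, N))                               # True
print(E(sum(v**2 for v in xs) * S**2) == (N * P1 / 16).subs(n, N))   # True
print(E(sum(v**2 for v in xs) * S**4) == (N * P2 / 64).subs(n, N))   # True
```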

I note, by the way, that to obtain the moments we are looking for, it is usual to write down the characteristic function of our single-variable distribution, which is:

$$ \int_{-1/2}^{1/2}e^{its}dt = \frac{2\sin(s/2)}{s},$$

so that the characteristic function of the sum is:

$$ \hat{g}_n(s)=\left( \frac{\sin(s/2)}{s/2} \right)^n$$

and the moments are obtained by differentiating at $s=0$:

$$\mathbb{E}\left[\left(\sum_i x_i\right)^k\right] = \frac{1}{i^k}\,\frac{d^k}{ds^k}\,\hat{g}_n(s)\bigg|_{s=0};$$

in any case, it is not a simple matter to obtain the series expansion of a large power of a known series.
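Still, for small concrete $n$ the expansion is easy to carry out with a computer algebra system; a sketch (mine, assuming sympy) that recovers the even moments of the sum from $\hat{g}_n$:

```python
# E[(sum x_i)^{2k}] = (-1)^k (2k)! * [s^{2k}] g_n(s), with
# g_n(s) = (sin(s/2)/(s/2))^n; here n = 3, so E[S^2] = 1/4, E[S^4] = 13/80.
import sympy as sp

s = sp.Symbol('s')
N = 3
g = (sp.sin(s / 2) / (s / 2))**N
poly = sp.series(g, s, 0, 8).removeO()

for k in (1, 2, 3):
    moment = (-1)**k * sp.factorial(2 * k) * poly.coeff(s, 2 * k)
    print(k, moment)
```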

Answer:

Another probabilistic approach, with the $x_i$ being i.i.d. uniform(0,1) random variables. To evaluate $\displaystyle E\left[\frac{\sum x_i^2}{\sum x_i} \right]$, we can use a Taylor expansion in two variables.

Suppose $s$ and $t$ are two random variables with joint density $f(s,t)$ and means $s_0$, $t_0$. Then, for a function $g(s,t)$ well behaved around $(s_0,t_0)$ we can write:

$$ \begin{align} E[g(s,t)]&=\int g(s,t) f(s,t) ds dt \\&= g(s_0,t_0)+ \frac{1}{2}Var(s) g_{ss}(s_0,t_0)+ \frac{1}{2}Var(t) g_{tt}(s_0,t_0)+ Cov(s,t)g_{st}(s_0,t_0) + \cdots \end{align} $$

because the linear terms vanish.

Then, let $$t=\frac{\sum_{i=1}^n x_i}{n} \\s=\frac{\sum_{i=1}^n x_i^2}{n}$$

And let $$g(s,t)=\frac{s}{t}$$

Then it's straightforward: $s_0 = 1/3$, $t_0 = 1/2$, $Var(t)=1/(12 n)$, $E(s t) =1/6+1/(12 n)$, $Cov(s,t) = 1/(12 n)$.

Further, $g_{ss}=0$, $g_{tt}=2 s/ t^{3}$, $g_{st}=-1/t^2$. Hence $$ E\left[\frac{\sum x_i^2}{\sum x_i} \right]=E\left[\frac{s}{t} \right] \approx \frac{s_0}{t_0} + \frac{1}{2}\frac{2 s_0}{t_0^3}Var(t) - \frac{1}{t_0^2}Cov(s,t)=\frac{2}{3} - \frac{1}{9 n}$$

What remains is to show that the next terms of the Taylor expansion are $o(n^{-1})$. One could also compute them, of course.
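For completeness, the second-order computation above can be reproduced symbolically (a sketch of mine, assuming sympy):

```python
# E[g(s,t)] ~ g(s0,t0) + Var(s)/2 * g_ss + Var(t)/2 * g_tt + Cov(s,t) * g_st
import sympy as sp

s, t, n = sp.symbols('s t n', positive=True)
g = s / t

s0, t0 = sp.Rational(1, 3), sp.Rational(1, 2)
var_s = sp.Rational(4, 45) / n   # Var(X^2)/n = (1/5 - 1/9)/n; multiplies g_ss = 0
var_t = 1 / (12 * n)             # Var(X)/n
cov_st = 1 / (12 * n)            # Cov(X^2, X)/n = (1/4 - 1/6)/n

at = {s: s0, t: t0}
approx = (g.subs(at)
          + var_s / 2 * sp.diff(g, s, 2).subs(at)
          + var_t / 2 * sp.diff(g, t, 2).subs(at)
          + cov_st * sp.diff(g, s, t).subs(at))

print(sp.simplify(approx))       # 2/3 - 1/(9*n)
```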