How can I prove this proposition in probability without mentioning discrete distributions?

The average number of children in the family to which the average child belongs is more than the average number of children in the average family.

That proposition is often presented in elementary courses on probability or statistics. Now here's a stronger proposition that, somewhat strangely, I never thought about until recently. Let $X$ be the number of children in a uniformly distributed random family. Let $Y$ be the number of children in the family of a uniformly distributed random child. (The difference is that with $X$ the families are equally probable and with $Y$ the children are equally probable.) Then $$ \operatorname E(Y) = \frac{\operatorname E(X^2)}{\operatorname E(X)}. $$ The same identity holds in the more general setting in which $X$ is not a discrete number of children but some nonnegative continuously distributed random variable.
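
As a quick sanity check of the identity in the discrete case (a sketch only, not a proof), the following computes both sides exactly. The distribution `pmf_x` is a made-up example, and the code assumes that a uniformly chosen child falls in a family of size $k$ with probability proportional to $k\Pr(X=k)$.

```python
from fractions import Fraction

# Hypothetical family-size distribution: P(X = k) for k children.
pmf_x = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

mean_x = sum(k * p for k, p in pmf_x.items())        # E(X)
mean_x2 = sum(k * k * p for k, p in pmf_x.items())   # E(X^2)

# A uniformly chosen child lands in a size-k family with probability
# proportional to k * P(X = k), so P(Y = k) = k * P(X = k) / E(X).
pmf_y = {k: k * p / mean_x for k, p in pmf_x.items()}
mean_y = sum(k * q for k, q in pmf_y.items())        # E(Y)

print(mean_x, mean_y, mean_x2 / mean_x)  # prints 7/4 15/7 15/7
```

Exact fractions are used so that the two sides can be compared with `==` rather than up to rounding.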

The occasion for this posting is that I am not as satisfied with the way I proved this as I would like to be. In particular, I would like to avoid mentioning any discrete distributions except that of $X$ when that is discrete, the proposition being this:

Let $X_i,\,\,i=1,2,3,\ldots$ be i.i.d. nonnegative random variables. Let $X_i$ be the length of the $i$th in a sequence of $n$ line segments a few light-years long, so that $X_n$ is the length of the last one, and choose a uniformly distributed point in that long concatenated segment. Let $I$ be the index of the segment in which that point falls, and let $Y$ be the length of that segment. This makes $I\in\{1,\ldots,n\}$ a random variable and $Y= X_I.$ And $I$ is not uniformly distributed, because a longer segment is more probable than a shorter one: conditionally on the lengths, $$ \Pr(I = i\mid X_1,\ldots,X_n) = \frac{X_i}{X_1+\cdots+X_n}. $$ Therefore \begin{align} & \operatorname E(X_I\mid X_1,\ldots,X_n) \\[8pt] = {} & \sum_{i=1}^n X_i \Pr(I=i\mid X_1,\ldots,X_n) = \sum_{i=1}^n X_i\cdot\frac{X_i}{X_1+\cdots +X_n} \\[8pt] = {} & \frac{X_1^2+\cdots+X_n^2}{X_1+\cdots+X_n} \to \frac{\operatorname E(X^2)}{\operatorname E(X)} \text{ as } n\to\infty, \end{align} where the limit holds almost surely by the law of large numbers applied to the numerator and the denominator after dividing each by $n$ (assuming $\operatorname E(X)>0$ and $\operatorname E(X^2)<\infty$).

I would rather see a proof that does not need to approximate a continuous distribution with a discrete distribution and then take a limit as $n\to\infty.$ I'm guessing there's a way to write this that will make me wonder why I didn't immediately think of it. What is it?

There is 1 best solution below

I do not quite understand your uniformly distributed families or children or light-years. So let's try to rewrite your question in a way which removes randomness and uses averages rather than expectations:

  • Suppose you have a sequence of $n$ non-negative values $x_1, x_2,\ldots, x_n$ with partial sums $s_j= \sum\limits_{i=1}^j x_i$ (so that $s_0=0$)

  • and then you consider a function on $(0,s_n]$ where $f(y) = x_j$ when $s_{j-1}< y \le s_j$

  • You would then have

$$\int\limits_0^{s_n} f(y) \, dy = \int\limits_0^{s_1} x_1 \, dy +\int\limits_{s_1}^{s_2} x_2 \, dy + \cdots + \int\limits_{s_{n-1}}^{s_n} x_n \, dy = \sum\limits_1^n x_j^2.$$

  • so, provided $s_n>0$, you might say the average value of $f(y)$ over $(0,s_n]$ is $\dfrac{\int\limits_0^{s_n} f(y)\,dy}{s_n} = \dfrac{\sum\limits_{j=1}^n x_j^2}{\sum\limits_{i=1}^n x_i}$, which is apparently the result you are looking for in the case of your line segments.
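
The construction above can also be checked numerically. The following is a minimal sketch, assuming made-up segment lengths `xs` and a hypothetical helper `average_segment_length`: it draws uniform points $y$ in $(0,s_n]$, looks up $f(y)$ through the partial sums, and compares the sample average with $\sum x_j^2/\sum x_i$. It is a Monte Carlo illustration, not part of the argument.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def average_segment_length(xs, draws, seed=0):
    """Estimate the average of f(y) over (0, s_n] by uniform sampling."""
    rng = random.Random(seed)          # seeded for reproducibility
    partial = list(accumulate(xs))     # s_1, ..., s_n
    s_n = partial[-1]
    total = 0.0
    for _ in range(draws):
        y = rng.uniform(0.0, s_n)
        # f(y) = x_j where s_{j-1} < y <= s_j: first index with s_j >= y.
        j = bisect_left(partial, y)
        total += xs[j]
    return total / draws

xs = [1.0, 2.0, 3.0, 4.0]              # hypothetical segment lengths
exact = sum(x * x for x in xs) / sum(xs)
estimate = average_segment_length(xs, draws=200_000)
print(exact, estimate)
```

With these lengths the exact value is $30/10 = 3$, and the estimate should agree with it up to sampling error.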

For integers rather than real values:

  • you can have almost exactly the same result for $n$ non-negative integers $x_1, x_2,\ldots, x_n$ except that now $f(y)$ is defined on the integers $1,2,\ldots,s_n$, producing the same expression for the average of $f(y)$ over possible values of $y$ since you have $$\sum\limits_{y=1}^{s_n} f(y) = \sum\limits_{y=1}^{s_1} x_1 +\sum\limits_{y=s_1+1}^{s_2} x_2 + \cdots +\sum\limits_{y=s_{n-1}+1}^{s_n} x_n = \sum\limits_{j=1}^n x_j^2.$$

  • As an aside, this integer result can be translated as follows: if the number of children per family, averaged over the families, has mean $\mu =\frac1n \sum\limits_{i=1}^n x_i > 0$ and variance $\sigma^2 = \frac1n \sum\limits_{j=1}^n \left(x_j-\mu\right)^2= \frac1n \sum\limits_{j=1}^n x_j^2 - \mu^2$, then the number of children per family, averaged over the children, is $\frac{\sum\limits_{y=1}^{s_n} f(y)}{s_n} = \frac{\sum\limits_{j=1}^n x_j^2}{\sum\limits_{i=1}^n x_i} = \frac{\mu^2+\sigma^2}{\mu} =\mu+\frac{\sigma^2}{\mu}$, quantifying the difference between the two averages.
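
The integer version can be verified by brute force. This is a small sketch, assuming a made-up list of family sizes `xs` and a hypothetical helper `child_average`: it enumerates every child $y=1,\ldots,s_n$, looks up the size $f(y)$ of that child's family, and compares the average with $\mu+\sigma^2/\mu$ in exact arithmetic.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

def child_average(xs):
    """Average of f(y) over the children y = 1, ..., s_n."""
    partial = list(accumulate(xs))     # s_1, ..., s_n
    s_n = partial[-1]
    # Child y belongs to family j where s_{j-1} < y <= s_j;
    # families with no children are never selected.
    total = sum(xs[bisect_left(partial, y)] for y in range(1, s_n + 1))
    return Fraction(total, s_n)

xs = [1, 2, 3, 0, 4]                   # hypothetical family sizes
n = len(xs)
mu = Fraction(sum(xs), n)                              # mean over families
sigma2 = Fraction(sum(x * x for x in xs), n) - mu**2   # variance over families

print(child_average(xs), mu + sigma2 / mu)  # prints 3 3
```

Here $\mu=2$ and $\sigma^2=2$, so the child-weighted average exceeds the family-weighted one by $\sigma^2/\mu=1$.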