The "turning-point fraction" of a random sample from a discrete distribution must have expectation less than 2/3?

136 Views Asked by At

A sequence of reals $x_1,...,x_n$ is said to have a turning point at index-value $i$ ($1\lt i\lt n$) iff $x_{i-1}\lt x_{i}\gt x_{i+1}$ or $x_{i-1}\gt x_{i}\lt x_{i+1}$. The number of turning points in the sequence is denoted $T(x_1,...,x_n)$, and we define the turning-point fraction as $$R(x_1,...,x_n)={\text{number of turning points}\over\text{number of potential turning points}}={T(x_1,...,x_n)\over n-2}$$ so $0\le R(x_1,...,x_n)\le 1.$ If $X_1,...,X_n$ are random variables, we define the corresponding r.v.s $T_n=T(X_1,...,X_n)$ and $R_n=R(X_1,...,X_n).$

Conjecture: If $X_1,...,X_n$ are i.i.d. r.v.s with any discrete distribution, then $E[R_n]\lt{2\over 3}$. (It's easy to show that $E[R_n]={2\over 3}$ when the $X_i$ are i.i.d. with any continuous distribution.)

Supposing the $X_i$ are i.i.d. with a discrete distribution having p.m.f. $p()$ and c.d.f. $F()$, we have the following: $$\begin{align*}E[R_n] &={1\over n-2}E\left[ \sum_{i=2}^{n-1}\mathbb{1}_{(X_{i-1}<X_i>X_{i+1}) \text{ or } (X_{i-1}>X_i<X_{i+1})}\right]\\ &=P[(X_1<X_2>X_3) \text{ or } (X_1>X_2<X_3)]\\ &=P[X_1<X_2>X_3] + P[X_1>X_2<X_3]\\ &=\sum_{x: p(x)>0} p(x)\left(P[X_1<x>X_3\mid X_2=x] + P[X_1>x<X_3\mid X_2=x]\right)\\ &=\sum_{x: p(x)>0} p(x)\left(P[X_1<x]^2 + P[X_1>x]^2\right)\\ &=\sum_{x: p(x)>0} p(x)\left(\left(F(x)-p(x)\right)^2 + \left(1-F(x)\right)^2\right)\\ \end{align*}$$

Question: How to show that this must be less than 2/3, if such is the case? Or find a counterexample if not?

NB: Simulations suggest that for a given finite support, a uniform distribution might maximize $E[R_n]$. When the $X_i$ are i.i.d. DiscreteUniform$\{1,...,m\}$, we have $p(x)={1\over m}$ and $F(x)={x\over m}$ for $x=1,...,m$, and the above summation reduces to $$E[R_n]={(2m-1)(m-1)\over 3m^2}\lt{2\over 3}$$ with $E[R_n]$ approaching ${2\over 3}$ from below as $m\to\infty.$

2

There are 2 best solutions below

2
On BEST ANSWER

We can use the linearity of expectation argument in the discrete case as well, we just have to be a little bit more careful. For each $i$ with $1<i<n$, there are three cases:

  1. The values of $X_{i-1}, X_i, X_{i+1}$ are all distinct. Conditioned on this case, there is a $\frac23$ probability that $X_i$ is a turning point: by symmetry, all six orderings of $X_{i-1}, X_i, X_{i+1}$ are equally likely, and four of them lead to a turning point at $X_i$.
  2. Two of the values $X_{i-1}, X_i, X_{i+1}$ are equal, but different from the third. Conditioned on this case, there is a $\frac13$ probability that $X_i$ is a turning point: it is a turning point if and only if it is the "odd one out" of the three, and by symmetry, each of $X_{i-1}, X_i, X_{i+1}$ is equally likely to be the odd one out.
  3. All three of the values $X_{i-1}, X_i, X_{i+1}$ are equal. In this case, it is impossible for $X_i$ to be a turning point.

If the probabilities of the three cases are $p_1, p_2, p_3$ with $p_1 + p_2 + p_3 = 1$, then the total probability that $X_i$ is a turning point is $\frac23p_1 + \frac13p_2$, and the expected number of turning points is $\mathbb E[T_n] = (\frac23 p_1 + \frac13p_2) (n-2)$.

It is clear that $\frac23 p_1 + \frac13 p_2 \le \frac23 p_1 + \frac23 p_2 + \frac23p_3 = \frac23$, which means that $\mathbb E[T_n] \le \frac23(n-2)$.

Moreover, if there is any value $x$ such that $\Pr[X_i = x] > 0$, then $p_3 \ge \Pr[X_i = x]^3 > 0$, and therefore $\frac23 p_1 + \frac13 p_2 < \frac23$; which means that $\mathbb E[T_n] < \frac23(n-2)$. Such a value $x$ will exist, in particular, for all discrete random variables.

2
On

Requested from comments and building on Misha Lavrov's answer:

This uses linearity of expectation on each of the $n-2$ triplets $X_{i-1}, X_i, X_{i+1}$ with $1<i<n$.

  • If $q_3= \sum\limits_x \Pr[X_i=x]^3$
    • then there is probability $q_3$ that all three are the same and there is no turning point at $X_i$
  • If $q_2 = \sum\limits_x \Pr[X_i=x]^2$
    • then there is probability $q_2 - q_3$ that $X_{i-1}= X_i$ and $X_{i+1}$ is distinct and there is no turning point at $X_i$
    • and there is probability $q_2 - q_3$ that $X_{i}= X_{i+1}$ and $X_{i-1}$ is distinct and there is no turning point at $X_i$
    • but there is probability $q_2 - q_3$ that $X_{i-1}= X_{i+1}$ and $X_{i}$ is distinct and there is a turning point at $X_i$
  • Otherwise $X_{i-1}, X_i, X_{i+1}$ are distinct, with probability $1-3q_2+2q_3$
    • and with probability $\frac13(1-3q_2+2q_3)$ that $X_{i-1}< X_i< X_{i+1}$ or $X_{i-1}> X_i> X_{i+1}$ and there is no turning point at $X_i$
    • but with probability $\frac23(1-3q_2+2q_3)$ that $X_{i-1}< X_i> X_{i+1}$ or $X_{i-1}> X_i< X_{i+1}$ and there is a turning point at $X_i$

So the probability of a turning point at $X_i$ is $(q_2 - q_3) + \frac23(1-3q_2+2q_3) = \frac23 - q_2+\frac13 q_3$

and so $\mathbb E[T_n] = \left(\frac23 - q_2+\frac13 q_3\right)(n-2)$ by linearity of expectation, and $\mathbb E[R_n]=\frac23 - q_2+\frac13 q_3$.

Since $q_3 \le q_2$, you have $\mathbb E[R_n]\le \frac23(1 - q_2)$, which is less than $\frac23$ if there are any points of positive probability in the distribution.