Sometimes, optimization for $X+Y$ gives no expected increase in $Y$. When does this happen?


Suppose we have two independent mean-zero random variables $X,Y$. Consider the distribution of $Y$ conditional on the outcome $X+Y\ge t$, for some $t$. As $t$ approaches infinity, we can ask about the limit of these conditional distributions; in particular, I'm curious about the cases where

$$\lim_{t\to\infty}\mathbb{E}[Y|X+Y\ge t] = 0$$

If $X$ and $Y$ are both normal, for instance, this will not happen; the expected value of $Y$ will increase linearly with $t$. However, if $X$ has tails like $e^{-x^{0.5}}$ (for large $x$) and $Y$ like $e^{-x^{0.9}}$, then the conditional distribution for large $t$ will look very close to that of $Y$ itself, and the effect on $Y$'s expected value will go to nothing.
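(Here is a rough Monte Carlo sketch of the contrast; Weibull distributions with shapes $0.5$ and $0.9$ have exactly the two tails mentioned above, and the centering, sample size, and thresholds are arbitrary choices of mine.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000

def cond_mean_y(x, y, t):
    """Monte Carlo estimate of E[Y | X + Y >= t]."""
    mask = x + y >= t
    return y[mask].mean()

# Light tails: X, Y standard normal.
xn = rng.standard_normal(n)
yn = rng.standard_normal(n)

# Heavy tails: a Weibull with shape k has tail exp(-x^k); shift each to mean zero.
xw = rng.weibull(0.5, n); xw -= xw.mean()   # tail like e^{-x^0.5}
yw = rng.weibull(0.9, n); yw -= yw.mean()   # tail like e^{-x^0.9}

for t in (1.0, 3.0, 5.0):
    # Normal pair: conditional mean of Y grows with t.
    # Weibull pair: X absorbs most of the conditioning, so Y's mean lags far behind.
    print(t, cond_mean_y(xn, yn, t), cond_mean_y(xw, yw, t))
```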

I think there's a sort of threshold effect here around tails above or below $e^{-x}$; in particular, I believe the expected $Y$ value goes to zero when $X$ is heavy-tailed (in the sense of having subexponential right tails) and $Y$ is lighter-tailed than $X$, but the exact formal conditions necessary are eluding me and I could be mistaken.

I'd be interested in seeing a proof of this behavior under relatively relaxed conditions on $X$ and $Y$. For instance, I have a somewhat messy proof for the case where $X$ is fat-tailed (bounded below by a power law) and $Y$ has tails lighter than $X$'s by a superlinear factor (along with some niceness conditions on the distributions), but I believe the result should extend to a larger class of distributions where $X$ and $Y$ are closer together. I'm specifically interested in how much lighter-tailed $Y$ has to be: for instance, I don't think it suffices for the PDF of $Y$ to be merely a constant factor less than that of $X$ in the limit, but perhaps not much more than that is needed?

Pointers to related literature on problems like this would also be welcome!


There are 2 best solutions below

BEST ANSWER

I ended up proving a statement like this with a friend. We showed the following two theorems:

No expected increase: If $X$ is subexponential and the tails of $Y$ are lighter than those of $X$ by at least a factor of $x^{1+\epsilon}$, then $\lim_{t\to\infty}\mathbb{E}[Y|X+Y\ge t] = 0$.

Infinite expected value: When $X$ has an arbitrarily light tail in the sense that $\lim_{x\to\infty}\frac{\bar F_X(x+1)}{\bar F_X(x)}=0$, then $\lim_{t\to\infty}\mathbb{E}[Y|X+Y\ge t] = \infty$ for any unbounded $Y$.

Proofs are at this post; the proof of the first statement is a fairly lengthy analysis that places appropriate bounds on the relevant integrals within each of four regions.
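A quick numerical illustration of the first statement, using one convenient pair satisfying its hypotheses (a Pareto-type $X$, which is subexponential, and a normal $Y$, which is far lighter-tailed; the specific index and thresholds are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000_000

# X: Pareto-type (Lomax) with tail index 1.5, centered -- subexponential right tail.
x = rng.pareto(1.5, n)
x -= x.mean()
# Y: standard normal -- much lighter-tailed than X.
y = rng.standard_normal(n)

for t in (10.0, 30.0, 100.0):
    mask = x + y >= t
    # X absorbs essentially the entire exceedance, so E[Y | X+Y >= t] stays near 0.
    print(t, y[mask].mean())
```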


The math is annoying, but the concept is simple, so let's try to simplify it until we get a reasonable hypothesis.

First, the idea is that for large $X+Y$, the weight will shift toward either $X$ or $Y$. We can condition exactly on $X+Y=t$. This gives a distribution (up to a constant) on $X$ of $p_X(x)p_Y(t-x)$. This is still hard to analyze, but there are two main possibilities: either this density is constant, or it has a mode toward which most of the probability shifts. We can try to estimate this mode, and see how $t$ affects it, rather than the mean.

Note that if it were constant, then its log would be constant, so taking the derivative of the log and setting it to zero gives: $p_X'(x)/p_X(x)=p_Y'(t-x)/p_Y(t-x)$

For this to be true, the left-hand side (a function of $x$) and the right-hand side (a function of $y=t-x$) must agree for all $x$, so both must be constant. This gives $p_X'(x)/p_X(x)=-C$, so $p_X(x)=De^{-Cx}$: $X$ is an exponential variable, and $Y$ must have the same distribution. In that case the conditional distribution of $X$ given $X+Y=t$ is uniform on $[0,t]$, so the mean value of $X$ is $t/2$, increasing linearly.
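(A quick simulation of this exponential case, not part of the original argument: I condition on $X+Y$ landing in a thin window around $t$ and check that $X$ looks uniform on $[0,t]$ with mean $t/2$.)

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000_000
x = rng.exponential(1.0, n)  # both variables Exp(1)
y = rng.exponential(1.0, n)

t, eps = 4.0, 0.05
mask = np.abs(x + y - t) < eps   # condition on X + Y ~= t
print(x[mask].mean())            # close to t/2 = 2.0, as X | X+Y=t is Uniform(0, t)
```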

In other reasonable cases, we'd expect the distribution to be non-constant, so it's reasonable to look at its mode to approximate the mean.

Taking the derivative, we want to find $x$ such that $p_X'(x)/p_X(x)=p_Y'(t-x)/p_Y(t-x)$. If $X$ and $Y$ are normal distributions (and ignoring constants, which will be unimportant as $t$ goes to infinity), you'll get something like:

$ x/\sigma_X^2= (t-x)/\sigma_Y^2$, so $x=(\sigma_X^2/(\sigma_X^2+\sigma_Y^2))t$, a linear increase.
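A quick numerical check of this mode formula (my own sketch; the variances and $t$ are arbitrary choices):

```python
import numpy as np

# Mode of x -> p_X(x) p_Y(t - x) for mean-zero normals, found by grid search,
# compared against the predicted x* = sigma_X^2 / (sigma_X^2 + sigma_Y^2) * t.
sx, sy, t = 1.0, 2.0, 10.0
xs = np.linspace(0.0, t, 100_001)
logp = -xs**2 / (2 * sx**2) - (t - xs) ** 2 / (2 * sy**2)  # log density up to a constant
mode = xs[np.argmax(logp)]
print(mode, sx**2 / (sx**2 + sy**2) * t)  # both 2.0
```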

I’m the more general case, if $p_X(x)=e^{-f(x)}$ and $p_Y(y)=e^{-g(x)}$, then we’ll need to solve: $$f’(x)=g’(t-x)$$ If we take $x$ here as a function of $t$, then taking the derivative w.r.t. t gives: $$x’f’’(x)=(1-x’)g’’(t-x)$$

Letting $y=t-x$, and noting that as $t$ gets large the growth rates should track the values themselves (heuristically, $x'/(1-x')\approx x/y$, which is exact when $x$ is a constant fraction of $t$), we could get something like:

$$xf''(x)=yg''(y)$$

Let's define the above quantity as $z=xf''(x)=yg''(y)$.

From this we can see that $x$ and $y$ in a sense act independently, each as a function of $z$. Therefore, it suffices to compare the asymptotic behavior of the inverse of $x\mapsto xf''(x)$ with that of $y\mapsto yg''(y)$ to get the resulting behavior.

If $X$ is normal with standard deviation $\sigma_X$, then $f(x)=x^2/(2\sigma_X^2)$ and $(xf'')^{-1}(z)=\sigma_X^2 z$ is linear.

If $X$ has a fatter tail than the exponential, then there's a bit of an odd behavior. As you noticed, as $x$ becomes bigger, the fat-tailed distribution actually becomes marginally better at taking larger values, so while the fat-tailed variable's share is increasing, the normal variable's would actually be decreasing. For example, if $X$ follows $e^{-x^{0.9}}$, then $f(x)=x^{0.9}$, $f''(x)=-0.09x^{-1.1}$, and $z=xf''(x)=-0.09x^{-0.1}$, which is slightly negative and tends to $0$. So the normal variable ends up really near $0$ while the fat-tailed one ends up really near $t$, with $z$ a small negative number.
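(A tiny numerical check of that computation, at sample points of my own choosing:)

```python
# For a tail like e^{-x^0.9}, i.e. f(x) = x**0.9:
#   f''(x) = 0.9 * (-0.1) * x**(-1.1) = -0.09 * x**(-1.1)
#   z = x * f''(x) = -0.09 * x**(-0.1)
def z(x):
    """z = x * f''(x) for f(x) = x**0.9."""
    return -0.09 * x ** -0.1

for x in (10.0, 1e4, 1e8):
    print(x, z(x))  # slightly negative, shrinking toward 0 as x grows
```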

When both distributions have fat tails like this, they are both competing for values, and we can go back to this method to figure out the asymptotic relationship between the variables. A constant ratio would give a linear relationship, and anything else would give something like a power law, depending on the specifics.