Let $y:=\frac1n\sum_{i=1}^n x_i$, where $\{x_i\}_{i=1}^n$ is a set of i.i.d. random variables, and every $x_i$ has a lognormal distribution $x_i \sim\text{Lognormal}(\mu,\sigma^2)$. Let $\text{Med}[y]$ be the median of $y$. Is the following inequality true $\forall (n,\mu,\sigma)$? $$\text{Med}[y]<\mathbf E[y]$$
Motivation: I am computing the sample mean of lognormal random variables via Monte Carlo. The sample mean seems to concentrate below the true mean for large $\sigma$, and I am wondering whether this holds in all cases. It is true for a single sample. However, there is no explicit formula for the distribution of the mean of a finite number -- not even two -- of i.i.d. lognormal variables, and I have no idea how to prove it.
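For concreteness, the observation can be reproduced with a short Monte Carlo sketch; the parameters `mu`, `sigma`, `n`, and the number of replications below are illustrative assumptions, not part of the question:

```python
import numpy as np

# Illustrative parameters (assumptions): a fairly large sigma makes the
# effect easy to see.
mu, sigma, n, reps = 0.0, 2.0, 10, 100_000
rng = np.random.default_rng(0)

# True mean of a single Lognormal(mu, sigma^2) variable; the sample mean
# y has the same expectation.
true_mean = np.exp(mu + sigma**2 / 2)

# Draw `reps` independent sample means, each over n lognormal draws.
y = rng.lognormal(mu, sigma, size=(reps, n)).mean(axis=1)

# Fraction of sample means falling below the true mean; a value above 1/2
# is the empirical signature of Med[y] < E[y].
frac_below = np.mean(y < true_mean)
print(frac_below)
```

With a large $\sigma$ this fraction is well above $1/2$, which is exactly the concentration below the mean described above.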
Yes, it seems the result holds.
Case: $n = 1$
Here, $Y = X_1$. First note that,
$$Med(X_i) = e^\mu$$ $$E(X_i) = e^\mu e^{\sigma^2/2}$$
Since $\sigma^2 > 0$, we know that $e^{\sigma^2/2} > 1$ which implies that $E(Y) > Med(Y)$.
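The $n = 1$ formulas can be sanity-checked by simulation for one arbitrary parameter choice ($\mu = 0$, $\sigma = 1$):

```python
import numpy as np

# n = 1 sanity check with mu = 0, sigma = 1 (arbitrary choice): the
# empirical median should sit near e^0 = 1, below the empirical mean,
# which should sit near e^{1/2} ~ 1.65.
rng = np.random.default_rng(1)
x = rng.lognormal(0.0, 1.0, size=200_000)
print(np.median(x), x.mean())
```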
Case: $n = \infty$
In the limit, the central limit theorem gives $$\sqrt{n}\left(Y_n - \mu^*\right) \stackrel{d}{\rightarrow} N(0, \sigma^{*2})$$
where $\mu^*$ and $\sigma^*$ are the mean and standard deviation of the $X_i$, respectively. Since the limiting distribution is normal, and hence symmetric, in the limit we have $E(Y) = Med(Y)$.
Case: $1 < n < \infty$
This is not at all rigorous, but when $n=1$ we have a positive skew distribution and when $n\rightarrow\infty$ we converge to a symmetric (normal) distribution for $Y$. In between, the sampling distribution of $Y$ will remain positively skewed and hence $E(Y)$ will be greater than $Med(Y)$ for finite $n$.
This plot, which was constructed via simulation, illustrates this point informally. The vertical lines represent the median (green) and mean (blue).
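The numbers behind a plot like this can be generated along the following lines (a hedged sketch; $\mu = 0$, $\sigma = 1$ and the sample sizes are arbitrary choices):

```python
import numpy as np

# For several sample sizes, estimate the skewness of Y_n and the gap
# E(Y_n) - Med(Y_n); both should stay positive and shrink toward 0
# as n grows.
rng = np.random.default_rng(2)
mu, sigma, reps = 0.0, 1.0, 50_000

skews, gaps = [], []
for n in (1, 5, 25, 100):
    y = rng.lognormal(mu, sigma, size=(reps, n)).mean(axis=1)
    z = (y - y.mean()) / y.std()
    skews.append(np.mean(z**3))           # empirical skewness of Y_n
    gaps.append(y.mean() - np.median(y))  # empirical E(Y_n) - Med(Y_n)
    print(n, skews[-1], gaps[-1])
```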
A (Partial) More Rigorous Approach
Define $D_n = E(Y_n) - Med(Y_n)$; we wish to show that $\forall \ n, \ \mu, \ \sigma$, it is the case that $D_n \geq 0$. From the extreme cases above, we have: $$D_1 > 0$$ $$\lim_{n\rightarrow\infty}D_n = 0$$
Thus, if we can show that $D_{n} > D_{n+1}$, then by the monotone convergence theorem we can argue that for any finite $n$ it must be the case that $D_n \geq 0$.
In an attempt to show that $D_n - D_{n+1} > 0$, we can write:
$$D_n - D_{n+1} = \left[E(Y_n) - E(Y_{n+1})\right] + \left[Med(Y_{n+1}) - Med(Y_n)\right]$$
Since $Y_n$ is the sample mean, $E(Y_n) = E(X_1)$ regardless of $n$ (as seen in the simulation above), so the first bracketed term is zero. Hence we need only show that:
$$Med(Y_{n+1}) > Med(Y_n)$$
One way to accomplish this is by showing that
$$P(Y_{n+1} < M_n) < \frac{1}{2}$$
where $M_n = Med(Y_n)$. We can write this as:
$$\begin{align*} P(Y_{n+1} < M_n) &= P\left(\frac{n}{n+1}Y_n + \frac{X_{n+1}}{n+1} < M_n\right) \\ &= P\left(Y_n + \frac{X_{n+1} - Y_n}{n+1} < M_n\right) \end{align*}$$
Informally, it seems that if $\frac{X_{n+1} - Y_n}{n+1}$ is positive more often than it is negative when $Y_n$ is near $M_n$, the above probability should indeed be less than $1/2$.
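While I cannot prove it, the inequality $P(Y_{n+1} < M_n) < 1/2$ can at least be checked empirically for a given parameter choice (here $\mu = 0$, $\sigma = 1.5$, $n = 5$, all arbitrary):

```python
import numpy as np

# Estimate Med(Y_n) from one batch of simulations, then check how often
# an independent Y_{n+1} falls below it. All parameters are arbitrary
# illustrative choices.
rng = np.random.default_rng(3)
mu, sigma, n, reps = 0.0, 1.5, 5, 200_000

y_n = rng.lognormal(mu, sigma, size=(reps, n)).mean(axis=1)
m_n = np.median(y_n)  # Monte Carlo estimate of M_n = Med(Y_n)

y_np1 = rng.lognormal(mu, sigma, size=(reps, n + 1)).mean(axis=1)
prob = np.mean(y_np1 < m_n)  # estimate of P(Y_{n+1} < M_n)
print(prob)
```

This is evidence for the parameter values tried, of course, not a proof for all $(n, \mu, \sigma)$.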
Perhaps somebody smarter than me can figure out where to go from here. (: