Positiveness of some functions, connection with the central limit theorem and stable distributions


Final update on 11/28/2019: I have worked on this a bit more, and wrote an article summarizing all the main findings. You can read it here.

Let us consider the following function:

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \cos(xt)\cdot\exp\Big(-a^2(b-|\sin(ct)|^d)\cdot t^2\Big)dt.$$

Here $-\infty < x < \infty$, $a\geq 1$, $b=4$, $c=1$, and $d=1$ or $d=2$. The function $f(x)$ is a symmetric density centered at zero: it integrates to one, all its odd moments are zero, and all its even moments exist and are positive. Indeed, this is the density of a random variable $X$ with the following characteristic function: $$\psi_X(t) = \exp\Big(-a^2(b-|\sin(ct)|^d)\cdot t^2\Big).$$
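For readers who want to reproduce the numbers, here is a minimal sketch of one way to evaluate $f(x)$ in plain Python. It is not necessarily the program used for the results in this post: the name `density`, the truncation point `t_max` and the step count `steps` are illustrative choices. It uses the evenness of the integrand, so that $f(x) = \frac{1}{\pi}\int_0^\infty \cos(xt)\,\psi_X(t)\,dt$, and assumes $a \geq 1$ and $b \geq 2$, so that the integrand is negligible beyond `t_max`.

```python
import math

def density(x, a=1.0, b=4.0, c=1.0, d=2.0, t_max=8.0, steps=16000):
    """Approximate f(x) = (1/pi) * integral_0^inf cos(x t) psi(t) dt (trapezoid rule)."""
    h = t_max / steps

    def integrand(t):
        # psi(t) = exp(-a^2 (b - |sin(c t)|^d) t^2) is the characteristic function.
        psi = math.exp(-a * a * (b - abs(math.sin(c * t)) ** d) * t * t)
        return math.cos(x * t) * psi

    # Trapezoid rule: half weight at the two end points, full weight inside.
    inner = math.fsum(integrand(k * h) for k in range(1, steps))
    total = 0.5 * (integrand(0.0) + integrand(t_max)) + inner
    return h * total / math.pi

# Sanity check in the Gaussian case c = 0, where f(0) = 1 / (2 a sqrt(pi b)).
print(density(0.0, c=0.0), 1.0 / (4.0 * math.sqrt(math.pi)))
```

The Gaussian case $c=0$ has a closed form, which makes it a convenient check on the quadrature before trusting any value for $c=1$.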

Most importantly, it is NOT the density of a Gaussian distribution (unless $c=0$ or $d=0$), and its variance is finite. The big question is this: is it really a density, that is, is the characteristic function a valid one? The one thing that needs to be confirmed is whether $f(x)\geq 0$ everywhere. In the cases that I investigated, the answer seems to be yes, but the minimum value of $f(x)$ on any finite interval is so close to zero that no conclusion can be drawn. It certainly looks like $f(x) > -10^{-16}$ everywhere, but this is too close to zero to be settled by numerical computation, as the precision of my algorithms is about 15 digits. WolframAlpha is also unable to answer this question.

Below is the chart for $f(x)$, with $a=1, b=4, c=1, d=2$.

[Figure: chart of $f(x)$ for $a=1, b=4, c=1, d=2$]

My computations tell me that $f(-39.71) \approx -2.94 \times 10^{-17}$ is the absolute minimum, while $f(39.71) \approx -1.38 \times 10^{-17}$. Both values are below the precision offered by the programming language, and in any case $f(-39.71) = f(39.71)$ by symmetry, so the discrepancy between them is itself numerical noise. WolframAlpha returns $f(-39.71) = f(39.71) = 0$ (an absolute $0$), see the computation here.

By contrast, if $a=1, b=2, c=1, d=2$, then the minimum is $-0.000003388$: it is clearly negative, it is confirmed by WolframAlpha, and it is attained at $x\approx \pm 13.56$. The case $a=1, b=4, c=1, d=1$ is even more challenging, with $f(x)$ appearing strictly positive everywhere. See also my related question posted on CrossValidated, here.

Connection with CLT and Stable Distributions

If any of these functions is positive (say if $a\geq 1, b=4, c =1, d=2$) then we are dealing with a stable family of true densities governed (in this example) by one parameter: $a\geq 1$. There are two consequences to this, unless something is wrong in my reasoning:

  • It invalidates the classical theory of stable distributions, stating that the only stable family with a finite variance is the Gaussian family (see the book Limit Distributions for Sums of Independent Random Variables, by Gnedenko and Kolmogorov, published in 1954; the whole purpose of this book is proving this very fact.)
  • It also potentially invalidates the central limit theorem (CLT): If $X_1, X_2$ are i.i.d. with a distribution from that family, the same is true for $X_1 + X_2$, and indeed for $\lim_{n \rightarrow \infty} (X_1+\cdots +X_n)/\sqrt{n}$. Note that $E(X_i)=0$. Thus, the convergence in distribution is towards a distribution from that same family, which does NOT include the Gaussian law. The only $X_i$'s known to violate the CLT have an infinite variance, for instance the Cauchy distribution which also constitutes a stable family. Yet in this case the variance is finite.

Question

Thus my question is this: is it true that $f(x) \geq 0$ everywhere, at least for some values of the parameters, excluding the Gaussian case? And what about the stability of the family of distributions introduced here: is it fully stable under addition and multiplication by a constant?

Update 2

I just computed the density in question in the case $c=0$. This corresponds to a Gaussian distribution, thus $f(x)$ is definitely strictly positive in this case. Yet my program returns the global minimum as being below zero, about $-4 \times 10^{-17}$. This suggests that negative values (of similar magnitude) obtained in the case $a=1, b=4, c=1, d=2$ are just an artifact of machine precision. This boosts my confidence in the fact that we are also dealing with a proper density in this latter case. But it is not a proof of course, and I am still a little skeptical.
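The effect described in this update can be reproduced with a short sketch. It is not the program used above: the function names, the truncation point `t_max` and the step count `steps` are assumptions made for illustration. In the Gaussian case $c=0$ with $a=1, b=4$, the exact density is known, so the gap between a double-precision quadrature and the true value far in the tail is pure rounding noise.

```python
import math

def gaussian_case_numeric(x, a=1.0, b=4.0, t_max=8.0, steps=16000):
    """Trapezoid approximation of (1/pi) * integral_0^inf cos(x t) exp(-a^2 b t^2) dt."""
    h = t_max / steps
    # End points get half weight; the integrand equals 1 at t = 0.
    total = 0.5 * (1.0 + math.cos(x * t_max) * math.exp(-a * a * b * t_max ** 2))
    for k in range(1, steps):
        t = k * h
        total += math.cos(x * t) * math.exp(-a * a * b * t * t)
    return h * total / math.pi

def gaussian_case_exact(x, a=1.0, b=4.0):
    """Closed form for c = 0: a normal density with variance 2 a^2 b."""
    return math.exp(-x * x / (4.0 * a * a * b)) / (2.0 * a * math.sqrt(math.pi * b))

x = 39.71
print("numeric:", gaussian_case_numeric(x))  # rounding noise, sign not reliable
print("exact:  ", gaussian_case_exact(x))    # strictly positive, far below 1e-16
```

The exact value at $x = 39.71$ is strictly positive but many orders of magnitude smaller than anything double-precision summation can resolve, so a slightly negative numerical result at that point carries no information about the sign of $f$.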

For those interested, I am now looking at some nasty distribution, something defined by a CF like

$$\psi_X(t) = \exp\Big(-a^2 |t|^{2+\sin(1/|bt|)}\Big).$$

This density looks very smooth yet is really nasty in some sense. Let's call it $H(a, b)$ as it is governed by two parameters $a, b$. It integrates to 1, but... it is not a density! The minimum is very slightly below zero, around $-0.02$. I am somewhat confident that I will find one within the next 10 days, with the same nastiness, that is a proper density.

Here is a proposed generalization. The $H(a, b)$ distribution (if it were actually a distribution) is semi-stable in the following sense: it is stable under addition and under multiplication by a scalar separately, but not jointly. What it means is this:

  • If $X,Y$ are independent and are $H(a_1, b), H(a_2,b)$ respectively, then $X+Y$ and $X-Y$ are both $H(\sqrt{a_1^2+a_2^2},b)$.
  • If $X$ is $H(a,b)$ and $r>0$, then $rX$ is $H(ar, br)$.

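As a result, $Z=(X_1 + \cdots + X_n)/\sqrt{n}$ is $H(a, b/\sqrt{n})$: by the first rule the sum is $H(a\sqrt{n}, b)$, and by the second rule with $r = 1/\sqrt{n}$ the rescaled sum is $H(a, b/\sqrt{n})$. A general class of 2-parameter semi-stable, symmetric distributions centered at zero (much larger than the class of symmetric stable distributions centered at zero) is defined by the following characteristic function: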

$$\psi_X(t) =\exp\Big[-a^2\Big(p(b\cdot|t|)+q(b\cdot|t|)\Big) \Big] .$$

Here $p,q$ are two real-valued functions chosen so that $\psi_X$ is a proper characteristic function, and $b>0$. For instance $p(t) = t$ and $q(t) = t^2$. If you use the product $p(b\cdot|t|)\times q(b\cdot|t|)$ rather than the sum $p(b\cdot|t|) + q(b\cdot|t|)$, it also works.


There are 3 best solutions below

Answer 1 (best answer)

The tone of this post is alarmist, and I would recommend changing it. Even if we assume that this is a density function, the CLT will still hold.

Let $X$ be of the form you are describing, and set $S_N = N^{-1/2}\sum_{j = 1}^N X_j$. As you note, the characteristic function of $X$ is $$\psi_X(t) = \exp\left(-a^2 \left( b - |\sin(ct)|^d\right)t^2\right)\,.$$

Thus, the characteristic function of $S_N$ is $$\psi_{S_N}(t) = \exp\left(-N a^2 \left( b - |\sin(ct/\sqrt{N})|^d\right)(t/\sqrt{N})^2\right) \to \exp \left(-a^2 b t^2 \right)\,.$$
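As a quick numerical illustration of this limit, here is a small sketch. The function name `exponent_S_N` and the sample values of $t$ and $N$ are arbitrary choices, not part of the argument. It evaluates $-\log \psi_{S_N}(t)$ for increasing $N$ and compares it with the Gaussian exponent $a^2 b t^2$.

```python
import math

def exponent_S_N(t, N, a=1.0, b=4.0, c=1.0, d=2.0):
    """Return -log psi_{S_N}(t) = N a^2 (b - |sin(c t / sqrt(N))|^d) (t / sqrt(N))^2."""
    s = t / math.sqrt(N)
    return N * a * a * (b - abs(math.sin(c * s)) ** d) * s * s

t = 2.0
limit = 1.0 * 4.0 * t * t  # a^2 b t^2 with a = 1, b = 4
for N in (1, 100, 10_000, 1_000_000):
    print(N, exponent_S_N(t, N), "limit:", limit)
```

The gap to the limit is $a^2 |\sin(ct/\sqrt{N})|^d t^2 \leq a^2 (c|t|/\sqrt{N})^d t^2$, which tends to zero for every fixed $t$ when $d > 0$.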

Answer 2

So you conjecture that $\psi_X(t)$ is the characteristic function of a symmetric stable distribution, so that for each $r_1,r_2\in\mathbb R$ there exists $r\in\mathbb R$ so that $\psi_X(r_1t)\psi_X(r_2t)=\psi_X(rt)$ for all real $t$?

In which case we would have, for all real $t$, $$ a^2 (b-|\sin(cr_1t)|^d)(r_1t)^2 + a^2 (b-|\sin(cr_2t)|^d)(r_2t)^2 = a^2 (b-|\sin(crt)|^d)(rt)^2 $$ or (after cancelling $a^2t^2$) $$ (b-|\sin(cr_1t)|^d)r_1^2 + (b-|\sin(cr_2t)|^d)r_2^2 = (b-|\sin(crt)|^d)r^2,$$ for all real $t\ne0$. Letting $t\to0$ gives $r_1^2+r_2^2=r^2$, so the terms in $b$ cancel. This, in turn, implies $$ |\sin(cr_1t)|^dr_1^2 + |\sin(cr_2t)|^d r_2^2= |\sin(crt)|^dr^2,$$ which cannot hold for all $t\ne0$.
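The failure of this identity is easy to see numerically. The sketch below is only an illustration (the helper names `neg_log_psi` and `stability_gap` are made up here): it takes $r^2 = r_1^2 + r_2^2$, the only choice compatible with the behaviour as $t \to 0$, and prints the difference between the two sides of the first equation above.

```python
import math

def neg_log_psi(t, a=1.0, b=4.0, c=1.0, d=2.0):
    """Return -log psi_X(t) = a^2 (b - |sin(c t)|^d) t^2."""
    return a * a * (b - abs(math.sin(c * t)) ** d) * t * t

def stability_gap(t, r1, r2, **params):
    """Left side minus right side of the stability equation, with r^2 = r1^2 + r2^2."""
    r = math.hypot(r1, r2)
    return (neg_log_psi(r1 * t, **params) + neg_log_psi(r2 * t, **params)
            - neg_log_psi(r * t, **params))

for t in (0.5, 1.0, math.pi / 2):
    print(t, stability_gap(t, 1.0, 1.0))          # nonzero for c = 1
    print(t, stability_gap(t, 1.0, 1.0, c=0.0))   # zero up to rounding (Gaussian)
```

A gap that is nonzero at even one value of $t$ rules out $\psi_X(r_1t)\psi_X(r_2t)=\psi_X(rt)$ for that pair $r_1, r_2$, while the Gaussian case $c=0$ passes the same check.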

So even if your function $f(x)$ is a proper probability density function, I don't think it is a counterexample to the theory of stable distributions and to the central limit theorem.

Answer 3

The following 2-parameter family of symmetric distributions centered at zero, defined by its characteristic function below, is semi-stable, that is, stable under addition and under multiplication by a constant separately, assuming the densities are proper.

$$\psi_X(t) = \exp\Big(-a^2(4-|\sin (ct)|^\alpha)\, t^2\Big).$$

The two parameters are $a \geq 1$ and $c$, while $\alpha > 0$ is fixed. A distribution from this family is denoted as $G(a,c)$. If $X$ and $Y$ are independent with distribution $G(a_1, c)$ and $G(a_2, c)$ respectively, then both $X+Y$ and $X-Y$ are $G(\sqrt{a_1^2+a_2^2}, c)$. If $X$ is $G(a, c)$, then the distribution of $rX$, where $r$ is a constant, is $G(ra, rc)$. Thus if the $X_i$'s are i.i.d. $G(a, c)$, it follows that the distribution of $(X_1 + \cdots + X_n)$ is $G(a\sqrt{n},c)$, and the distribution of $Z=(X_1 + \cdots + X_n)/\sqrt{n}$ is $G(a,c/\sqrt{n})$.
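These closure rules are identities between characteristic functions, so they can be checked numerically whether or not the densities are proper. The sketch below is illustrative only: `psi_G`, `check_closure`, the value $\alpha = 2$ and the sample parameters are assumptions made for the example.

```python
import math

ALPHA = 2.0  # the fixed exponent alpha > 0; the value 2 is only an example

def psi_G(t, a, c):
    """Characteristic function of G(a, c): exp(-a^2 (4 - |sin(c t)|^alpha) t^2)."""
    return math.exp(-a * a * (4.0 - abs(math.sin(c * t)) ** ALPHA) * t * t)

def check_closure(t, a1=1.0, a2=1.5, c=0.7, r=2.0, n=5):
    """Return True if the three closure identities hold at the point t."""
    # Sum of independent G(a1, c) and G(a2, c) versus G(sqrt(a1^2 + a2^2), c).
    sum_ok = math.isclose(psi_G(t, a1, c) * psi_G(t, a2, c),
                          psi_G(t, math.hypot(a1, a2), c), rel_tol=1e-12)
    # r X with X ~ G(a1, c) versus G(r a1, r c).
    scale_ok = math.isclose(psi_G(r * t, a1, c),
                            psi_G(t, r * a1, r * c), rel_tol=1e-12)
    # (X_1 + ... + X_n) / sqrt(n) versus G(a1, c / sqrt(n)).
    s = math.sqrt(n)
    clt_ok = math.isclose(psi_G(t / s, a1, c) ** n,
                          psi_G(t, a1, c / s), rel_tol=1e-12)
    return sum_ok and scale_ok and clt_ok

print(all(check_closure(t) for t in (0.1, 0.4, 1.0, 2.5)))
```

The third check is the one that matters for the CLT: the normalized sum stays in the family, but with $c$ replaced by $c/\sqrt{n}$.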

Thus the distribution of $Z$ belongs to that same family. Note that $G(a, 0)$ is a normal distribution. As $c/\sqrt{n} \rightarrow 0$, the limit distribution associated with the CLT is also normal, and belongs to the same family. There is no violation of the CLT in this case.

A related question

If $X_1, X_2, X_3$ and so on are i.i.d. $G(a,c)$, then do the successive differences (defined below) also belong to that same family?

  • $Y_1 = X_1 - X_2$
  • $Y_2 = X_1 - 2X_2 + X_3$
  • $Y_3 = X_1 - 3X_2 + 3 X_3 - X_4$
  • $Y_4 = X_1 - 4X_2 + 6 X_3 - 4X_4 + X_5$

Let's define $$Z = \lim_{n\rightarrow \infty} \frac{Y_n}{\sqrt{\mbox{Var}(Y_n)}}=\lim_{n\rightarrow \infty} \frac{n! Y_n}{\sqrt{(2n)!\mbox{Var}(X_1)}}.$$
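The second equality uses $\mbox{Var}(Y_n) = \mbox{Var}(X_1)\sum_{k=0}^n \binom{n}{k}^2 = \binom{2n}{n}\mbox{Var}(X_1) = \frac{(2n)!}{(n!)^2}\mbox{Var}(X_1)$, since the coefficients of $Y_n$ are signed binomial coefficients and the $X_i$'s are independent. A short sketch to confirm the combinatorial identity for small $n$ (the function name `variance_factor` is just a label for this example):

```python
from math import comb, factorial

def variance_factor(n):
    """Var(Y_n) / Var(X_1): sum of the squared binomial coefficients of Y_n."""
    return sum(comb(n, k) ** 2 for k in range(n + 1))

for n in range(1, 6):
    # The two printed values agree: sum_k C(n, k)^2 = (2n)! / (n!)^2.
    print(n, variance_factor(n), factorial(2 * n) // factorial(n) ** 2)
```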

Does $Z$ belong to the same family again? This is true if the family $G(a,c)$ is stable, but maybe not if the family is semi-stable.