TL;DR
Let $e, v$ be real numbers with $v > 0$.
- The constant distribution $f \equiv e$ is universal in the sense that it has the most entropy among those with mean $e$. We define the variance using this fact.
- The normal distribution $N(e,v)$ is universal in the sense that it has the most entropy among those with mean $e$ and variance $v$. We define the $2$-variance using this fact.
- What comes next? What is the name of this line of study?
Original post
The study of a random variable $X$ is the study of its underlying distribution $P_X$. However, that is too much to take on at once, so we wish to start simple. To make life even simpler, let us assume every random variable is a function from $\mathbb{R}$ to $\mathbb{R}$.
The simplest yet most important piece of information about $X$ is its mean $E(X)$. Once we know its mean, we can ask further questions, such as: how much does $X$ differ from the universal distribution among those with mean $E(X)$? Here by universal, I mean the distribution(s) possessing the maximal entropy among those with the same mean. Of course, it is the constant distribution $f \equiv E(X)$, which takes the single value $E(X)$.
Moving forward, we wish to see how much $X$ differs from $E(X)$. We naturally take their difference $|X - E(X)|$, but decide to work with its square $(X-E(X))^2$ instead, because it is easier to analyze yet carries equivalent information. Just as $X$ contains too much information, so does $(X-E(X))^2$. Hence, we again take its mean $E((X-E(X))^2)$, and call it $Var(X)$.
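As a quick numerical check of this definition, here is a minimal Monte Carlo sketch using NumPy; the choice of $X \sim \mathrm{Uniform}(0, 1)$ is an arbitrary example, whose variance is known to be $1/12$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=1_000_000)  # example X ~ Uniform(0, 1)

# Var(X) = E[(X - E(X))^2], estimated from samples;
# for Uniform(0, 1) the exact value is 1/12 ≈ 0.0833.
var = np.mean((x - x.mean()) ** 2)
print(var)
```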
We wish to move forward and derive higher versions of the variance. I am aware of the usual higher moments $E((X - E(X))^n)$, but that is not what I want. What I want is to keep following the same line of thought, and consider the difference between $X$ and the universal distribution having the known attributes.
Hence, we consider the difference $|X - N(E(X),Var(X))|$, where $N(E(X),Var(X))$ denotes the normal distribution with mean $E(X)$ and variance $Var(X)$, because the normal distribution has the most entropy among those with mean $E(X)$ and variance $Var(X)$. Let us again take the square, for the same reason as above. And so we define the $2$-variance
$$Var_2(X) := E\big( (X - N(E(X), Var(X)))^2 \big),$$
and so on.
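The definition above leaves open how $X$ and the normal $N(E(X), Var(X))$ are coupled; one natural reading is to sample them independently. Here is a minimal Monte Carlo sketch under that (assumed) independent-sampling interpretation, with an exponential distribution as an arbitrary choice of $X$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary example distribution for X: Exponential with scale 2,
# so E(X) = 2 and Var(X) = 4.
x = rng.exponential(scale=2.0, size=1_000_000)
mean, var = x.mean(), x.var()

# Var_2(X) := E[(X - N(E(X), Var(X)))^2], estimated by drawing an
# independent normal sample with matching mean and variance.
n = rng.normal(loc=mean, scale=np.sqrt(var), size=x.size)
var2 = np.mean((x - n) ** 2)
print(var, var2)
```

Note a design consequence of the independence assumption: $E[(X - N)^2] = Var(X) + Var(N) = 2\,Var(X)$, so the estimate lands near twice the variance regardless of the shape of $X$; a different coupling between $X$ and the normal would give a different value, which the definition as stated does not pin down.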
Questions
What is the universal distribution with mean $E(X)$, variance $Var(X)$, and $2$-variance $Var_2(X)$? Again, by universal I mean the distribution with the most entropy among those sharing the same known attributes.
What is the standard name for $Var_2, Var_3, \dots$?