Question about definition of statistical Model in information geometry


Consider a family $S$ of probability density functions on $X$, where each density is a function $p:X\to\mathbb{R}$ such that $p(x)\geq 0$ for all $x\in X$ and $\int_{X}p(x)\,dx=1$. Suppose each element of $S$ may be parameterized using $n$ real-valued variables $[\xi^1,\ldots,\xi^n]$, that is,
$S=\{p(x;\xi):\xi=[\xi^1,\ldots,\xi^n]\in E\subseteq \mathbb{R}^n\}$, where $x\in X$ and the mapping $\xi\mapsto p(x;\xi)$ is injective.

My question is: why do we need $\xi\mapsto p(x;\xi)$ to be injective?


There are 2 best solutions below

Best answer

I agree with the other answer, but I want to be clearer about why this assumption is needed and why it is so commonly made in statistics. The usual definitions are:

  1. A statistical model $\mathcal{P}$ is a collection of probability measures on the sample space $(\mathcal{X}, \mathcal{B})$. The collection of all probability measures is termed the full model, or the full nonparametric model.

  2. A model $\mathcal{P}$ is parameterized with parameter space $\Theta$ if there exists a surjective map $\Theta \to \mathcal{P}: \theta \mapsto P_\theta$, called the parameterization of $\mathcal{P}$.

  3. A parameterization of a statistical model $\mathcal{P}$ is identifiable if the parameterization is injective.

Injectivity means that no two different parameter values give rise to the same distribution.

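So the point is that injectivity of the parameterization is exactly identifiability: without it, two different parameter values would describe the same distribution, and no amount of data could tell them apart. This is why it is a standard assumption for parametric statistical models.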
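To see what goes wrong without injectivity, here is a minimal sketch in Python. The model is a hypothetical one chosen for illustration (it does not come from the question): $N(0,\xi^2)$ with $\xi \neq 0$ allowed to take either sign, so $\xi$ and $-\xi$ give the same density. The function names and the evaluation grid are assumptions of this sketch, and comparing on a finite grid is only a numerical check, not a proof.

```python
import math

def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def p_signed(x, xi):
    """Hypothetical non-identifiable model: N(0, xi^2) with any xi != 0."""
    return normal_pdf(x, 0.0, abs(xi))

def p_positive(x, xi):
    """The same family, but with the parameter restricted to xi > 0."""
    if xi <= 0:
        raise ValueError("xi must be positive")
    return normal_pdf(x, 0.0, xi)

# Finite grid of sample points on [-5, 5]; an illustrative choice.
grid = [i / 10.0 for i in range(-50, 51)]

def same_density(p, xi_a, xi_b, tol=1e-12):
    """Check numerically whether two parameter values give the same density on the grid."""
    return all(abs(p(x, xi_a) - p(x, xi_b)) <= tol for x in grid)

# xi = 2 and xi = -2 are different parameters but the same distribution.
collides = same_density(p_signed, 2.0, -2.0)

# With xi restricted to (0, inf), different parameters give different densities.
distinct = not same_density(p_positive, 2.0, 3.0)
```

In the signed version, data drawn from the model could never distinguish $\xi = 2$ from $\xi = -2$; restricting the parameter space to $\xi > 0$ removes the ambiguity.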

The wiki page is a good place to read more about this concept.

Other answer

Short answer: My guess is that properties of the manifold will be inferred from properties of $E$ and of the map $\xi \mapsto p(\cdot;\xi)$. For many topological properties, an injective map is either required or substantially simplifies the arguments.

Long answer

Based on the definition you give of the set $S$, it appears that $p(x;\xi)$ is being treated like the function $p(\cdot; \xi)$. This is a notational choice that some authors make.

I mention this because it means that $\xi \mapsto p(x;\xi)$ is really a map that sends the parameter $\xi$ to a function. This can be confusing because $p(x;\xi)$ also denotes a scalar (the value of the function $p(\cdot; \xi)$ at the point $x$).
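The two readings of the notation can be sketched in Python with a closure. The family $N(\xi, 1)$ and the names `density`, `p_zero`, and `value` are hypothetical choices made only for this illustration.

```python
import math

def density(xi):
    """Return the function p(. ; xi), here the density of N(xi, 1) (illustrative family)."""
    def p(x):
        return math.exp(-0.5 * (x - xi) ** 2) / math.sqrt(2.0 * math.pi)
    return p

# density(0.0) is an element of S: a whole function of x.
p_zero = density(0.0)

# p_zero(0.0) is a scalar: that function evaluated at one point x.
value = p_zero(0.0)
```

Here `density` plays the role of $\xi \mapsto p(\cdot;\xi)$, while `density(xi)(x)` is the number $p(x;\xi)$.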

With this convention, the injectivity assumption means that distinct values of $\xi$ map to distinct functions $p(\cdot; \xi)$, so each element of $S$ corresponds to exactly one $\xi$. This is important because when you're working on a parameterized manifold, having a unique representation can make life much simpler.

For example, you might want to understand the manifold in terms of the set $E$ and the map $\xi \mapsto p(\cdot;\xi)$. If $E$ is a nice set (for example compact, convex, or finite-dimensional) and $\xi \mapsto p(\cdot;\xi)$ is well behaved (for example injective, continuous, or smooth), then it will be possible to say a lot about the manifold just from the properties of $E$ and $\xi \mapsto p(\cdot;\xi)$. It turns out that being injective is often very important if you want to understand the topology of the manifold in terms of the topology of the set $E$.
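To make that last point concrete, here is a small worked example. It is an illustrative choice of mine, not something taken from the question. Take $E=\mathbb{R}$ and let the parameter enter only through its square:

$$p(x;\xi)=\frac{1}{\sqrt{2\pi}}\exp\!\left(-\tfrac{1}{2}\,(x-\xi^2)^2\right),\qquad \xi\in E=\mathbb{R},$$

so that $p(\cdot;\xi)$ is the density of $N(\xi^2,1)$. This map is not injective, since $p(\cdot;\xi)=p(\cdot;-\xi)$. Its image is

$$S=\{\,N(m,1) : m\in[0,\infty)\,\},$$

which, labelled by the mean $m=\xi^2$, looks like a closed half-line with an endpoint at $m=0$, even though $E=\mathbb{R}$ has no endpoint. Near $\xi=0$ the parameter $\xi$ cannot serve as a coordinate on $S$, because every nearby distribution is hit twice. If we instead restrict to $E'=(0,\infty)$, the map becomes injective, the image $S'=\{N(m,1): m\in(0,\infty)\}$ is in bijection with $E'$, and $\xi$ is a global coordinate on $S'$.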