Formal definition of a statistical model


I can't find a rigorous definition of a statistical model. Many authors use the definition below.

Suppose we have a probability space

$$(\Omega,\mathcal{F},\mathbb{P})$$

A statistical model is

$$( \{X_i \}_{i \in I_n}, F, \Theta)$$

where $\{X_i \}_{i \in I_n}$ are random variables, $F$ is the set of densities, and $\Theta$ is the parameter space.

Unfortunately, nobody reports the domain of the random variables. In my opinion all the $X_i$ are defined on the same domain $\Omega$, because this would clarify the statement below when the $X_i$ are iid with densities $f_i(x_i)$:

$$\mathbb{P}(\bigcap_{k=1}^{n}X_{k}^{\leftarrow}(A_k))= \int_{A_1 \times...\times A_n}\prod_{k=1}^{n}f_k(x_k)dx_1...dx_n$$
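As a sanity check of this factorization (my own illustration, not part of any definition: I take $n=2$, standard normal densities, $A_1=(0,\infty)$, $A_2=(-\infty,1)$, and estimate both sides by Monte Carlo):

```python
# Monte Carlo check that, for iid draws, the probability of the joint
# event factors into the product of the marginal probabilities, matching
# the density factorization above. Toy choices: X1, X2 ~ N(0,1) iid,
# A1 = (0, inf), A2 = (-inf, 1).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x1 = rng.standard_normal(n)  # X1 ~ N(0,1)
x2 = rng.standard_normal(n)  # X2 ~ N(0,1), independent of X1

in_A1 = x1 > 0   # event {X1 in A1}
in_A2 = x2 < 1   # event {X2 in A2}

joint = np.mean(in_A1 & in_A2)             # P(X1 in A1, X2 in A2)
product = np.mean(in_A1) * np.mean(in_A2)  # P(X1 in A1) * P(X2 in A2)

print(joint, product)  # both ~ 0.5 * Phi(1) ~ 0.42
```

Both estimates agree up to Monte Carlo error, as the displayed identity predicts.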

Another point of confusion is the definition of the sample random vector $(X_1,...,X_n)$: is it

$$X(\omega_1,...,\omega_n)=(X_1(\omega_1),...,X_n(\omega_n))$$ or something else?

Could someone explain how to formalize all of this?


@angryavian

My problem is about using measure theory rigorously in statistics. Here is an example, to see whether I understand: I would like to build a statistical model starting from a probability space $(\Omega,\mathcal{F},\mathbb{P})$ and a single random variable $X: \Omega\to\mathbb{R}$ with density $f(x\mid\theta)$, and extend it to a sequence of coin flips (for simplicity take $n=2$).

I can define on the product space $(\Omega^2,\mathcal{F}\otimes\mathcal{F},\mathbb{P}\times\mathbb{P})$ the random variables $X_k(\omega_1,\omega_2)=X(\omega_k)$, $k=1,2$.

So, using $\mathbb{P}\times\mathbb{P}(A)= \int_{\Omega} \int_{\Omega}\mathbb{1}_{A}(\omega_1,\omega_2)\,d\mathbb{P}\,d\mathbb{P}$,

I can say that $$\mathbb{P}\times\mathbb{P}((X_1,X_2)^{\leftarrow}(A_1\times A_2))= \int_{\Omega} \int_{\Omega}\mathbb{1}_{A_1\times A_2}(X(\omega_1),X(\omega_2))\,d\mathbb{P}\,d\mathbb{P}=\int_{\Omega}\mathbb{1}_{A_1}(X(\omega_1))\,d\mathbb{P}\cdot\int_{\Omega}\mathbb{1}_{A_2}(X(\omega_2))\,d\mathbb{P}$$

Here I use $\mathbb{1}_{A \times B}(x,y) = \mathbb{1}_A(x) \,\mathbb{1}_B(y)$.

Now $\int_{\Omega}\mathbb{1}_{A_1}(X(\omega_1))\,d\mathbb{P}=\int_{\Omega} \int_{\Omega}\mathbb{1}_{A_1}(X(\omega_1))\,d\mathbb{P}\, d\mathbb{P}= \mathbb{P}\times\mathbb{P}(X_1^{\leftarrow}(A_1))$, and the same procedure applied to $X_2$ gives independence.

In conclusion, I can say that my statistical model could be the family of joint laws, indexed by $\theta \in \Theta$, $$\mathbb{P}^{\theta}_{(X_1,X_2)}(S) = \int_{S}f(x_1\mid\theta)f(x_2\mid\theta)\,dx_1\, dx_2$$ for measurable $S \subseteq \mathbb{R}^2$.

In fact the distributions of $X_1,X_2$ are the same as that of $X$, because $$\mathbb{P}\times\mathbb{P}(X_1^{\leftarrow}(A))= \int_{\Omega}\mathbb{1}_A \circ X \,d\mathbb{P}=\int_A f(x\mid\theta)\,dx.$$ I hope I have expressed everything correctly.
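To convince myself numerically, here is a minimal sketch of the construction above (a toy instantiation of my own: $\Omega = [0,1)$, $\mathbb{P}$ = Lebesgue measure, $X(\omega)=\omega$, so $f(x\mid\theta)$ is the uniform density on $[0,1)$), checking that the marginal law of $X_1$ under $\mathbb{P}\times\mathbb{P}$ equals the law of $X$:

```python
# Toy instantiation: Omega = [0,1), P = Lebesgue, X(omega) = omega.
# On the product space [0,1)^2 define X_k(omega_1, omega_2) = X(omega_k).
# We check (P x P)(X_1^{<-}((-inf, t])) = integral of the uniform density
# over [0, t] = t, i.e. the marginal of X_1 has the same law as X.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
omega1 = rng.random(n)  # first coordinate of a sample from P x P
omega2 = rng.random(n)  # second coordinate (X_1 ignores it)

X1 = omega1  # X_1((omega_1, omega_2)) = X(omega_1) = omega_1

t = 0.3
marginal = np.mean(X1 <= t)  # (P x P)(X_1 <= t), Monte Carlo estimate
law_of_X = t                 # P(X <= t) for the uniform density

print(marginal, law_of_X)  # should agree up to Monte Carlo error
```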


From our discussion in the comments, your question seems to be more about how to formulate the notion of independent random variables in measure-theoretic terms, and not about statistical models.


In general, any joint distribution of random variables $(X_1, \ldots, X_n)$ can be expressed as maps $X_i : \Omega \to \mathbb{R}$ on a common probability space $(\Omega, \mathcal{F}, P)$. The nature of the distribution will depend not only on the maps $X_i$, but also the probability space $(\Omega, \mathcal{F}, P)$. At this generality you can always write $$P(X_1 \in A_1, \ldots, X_n \in A_n) = P(\{\omega \in \Omega : X_i(\omega) \in A_i\, \forall i\}).$$
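For instance (a toy example of mine, assuming NumPy), on a common space the joint probability need not factor when the variables are dependent:

```python
# On a common space Omega = [0,1) with P = Lebesgue, define
# X_1(omega) = omega and X_2(omega) = 1 - omega. These are NOT
# independent: the joint probability of {X_1 <= 1/2, X_2 <= 1/2}
# (the single point omega = 1/2, probability 0) differs from the
# product of the marginals (1/2 * 1/2 = 1/4).
import numpy as np

rng = np.random.default_rng(3)
omega = rng.random(1_000_000)  # samples from (Omega, P)

X1 = omega
X2 = 1 - omega

A1 = X1 <= 0.5
A2 = X2 <= 0.5

joint = np.mean(A1 & A2)              # P(X1 <= 1/2, X2 <= 1/2) ~ 0
factored = np.mean(A1) * np.mean(A2)  # ~ 1/4

print(joint, factored)
```

This shows why the probability space itself, not just the maps $X_i$, determines the joint distribution.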


In cases where the $X_1, \ldots, X_n$ are independent, the probability space can be taken to have a special structure, decomposing in the following form.

$$\Omega = \Omega_1 \times \cdots \times \Omega_n$$ $$\mathcal{F} = \mathcal{F}_1 \otimes \cdots \otimes \mathcal{F}_n$$ $$P = P_1 \times \cdots \times P_n \tag{product measure}$$ $$X_i((\omega_1, \ldots, \omega_n)) \text{ is a function that depends only on $\omega_i$}$$

That is, the space $\Omega$ can be written as the Cartesian product of spaces $\Omega_i$, and is equipped with the product sigma-algebra and the product measure built up from the smaller spaces' respective sigma-algebras and measures (see Wikipedia for some discussion about how these are defined, which requires some technical machinery).

Using this decomposition, and overloading the notation $X_i$ to mean both a map $\Omega \to \mathbb{R}$ as well as a map $\Omega_i \to \mathbb{R}$ (since $X_i((\omega_1, \ldots, \omega_n))$ only depends on $\omega_i$), you can then write

$$P(X_1 \in A_1, \ldots, X_n \in A_n) = \prod_{i=1}^n P(\{\omega_i \in \Omega_i : X_i(\omega_i) \in A_i\})$$
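A quick numerical illustration of this factorization (my own sketch, assuming NumPy; the $X_i$ need not be identically distributed, only coordinate-dependent):

```python
# Product space with Omega_1 = Omega_2 = [0,1), P_i = Lebesgue, and
# coordinate-dependent maps built via inverse CDFs:
#   X_1(omega_1) = -log(1 - omega_1)   (Exp(1))
#   X_2(omega_2) = 2 * omega_2         (Uniform(0,2))
# The display above predicts the joint probability equals the product
# of the per-coordinate probabilities.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
w1 = rng.random(n)  # coordinate omega_1
w2 = rng.random(n)  # coordinate omega_2

X1 = -np.log(1 - w1)  # depends only on omega_1
X2 = 2 * w2           # depends only on omega_2

A1 = X1 <= 1.0   # event {X_1 <= 1}, probability 1 - e^{-1}
A2 = X2 <= 0.5   # event {X_2 <= 1/2}, probability 1/4

joint = np.mean(A1 & A2)
factored = np.mean(A1) * np.mean(A2)

print(joint, factored)  # both ~ (1 - e^{-1}) / 4 ~ 0.158
```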


Thinking about these things is purely a measure-theoretic issue, and as far as I can tell will not give you any deeper understanding of statistical models, where specifying the joint distribution of $(X_1, \ldots, X_n)$ is enough.