What is the minimal formal structure required to define a stochastic matrix


A stochastic matrix is often defined as a square matrix $[S_{ij}]$ whose rows (or columns, depending on the convention) sum to 1. The rationale behind this definition is to think of each row as a probability vector, in which case the unit sum is a very natural condition. This definition can be generalized to the basis-free language of vector spaces and linear maps as follows: let $V$ be an $\mathbb{R}$-vector space with $\dim(V)=n$, and choose $\Omega\in V$; a linear map $S:V\to V$ is stochastic iff $S(\Omega)=\Omega$. This essentially singles out a family of bases for $V$ in which $\Omega$ is represented as the vector $(1,1,\dots,1)^T$ and $S$ as a stochastic matrix. The structure needed for this formalism is then a vector space $V$ together with a choice of $\Omega\in V$.
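As a concrete sketch (using NumPy, with an illustrative $2\times 2$ example): for a row-stochastic matrix, applying $S$ to the all-ones vector computes the row sums, so the row-sum condition is exactly the fixed-vector condition $S(\Omega)=\Omega$.

```python
import numpy as np

# A row-stochastic matrix: each row sums to 1.
S = np.array([[0.2, 0.8],
              [0.5, 0.5]])

# The basis-free condition: S fixes the distinguished vector Omega,
# which in this basis is represented by the all-ones vector (1, 1)^T.
omega = np.ones(2)

# Each entry of S @ omega is a row sum of S, so S @ omega == omega.
print(S @ omega)
```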

There is often a further condition required of a stochastic matrix, however: each entry must be non-negative (i.e. $s_{ij}\geq 0$ for all $i,j\in\{1,\dots,n\}$). This condition is much stronger than the above and requires more structure to formalize. My question is the following: What is the minimal additional structure needed to add this condition to the definition above?
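The two conditions together can be checked coordinate-wise; a minimal sketch in NumPy (the function name and tolerance are illustrative choices):

```python
import numpy as np

def is_row_stochastic(S, tol=1e-12):
    """Check both conditions: entrywise non-negativity and unit row sums."""
    S = np.asarray(S, dtype=float)
    return bool(np.all(S >= -tol) and np.allclose(S.sum(axis=1), 1.0))

print(is_row_stochastic([[0.2, 0.8], [0.5, 0.5]]))   # both conditions hold
print(is_row_stochastic([[1.5, -0.5], [0.5, 0.5]]))  # rows sum to 1, but an entry is negative
```

Note that the second matrix satisfies the basis-free condition $S(\Omega)=\Omega$ but fails non-negativity, which is exactly why the second condition is strictly stronger.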

Clearly a choice of a specific basis would do it, but this seems like too much structure: if entrywise non-negativity is satisfied in one basis, then it is satisfied in a whole class of other bases, namely those related by transition maps that are themselves entrywise non-negative (along with their inverses). Is this pointing at some weaker constraint, or is the question ill-posed?
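To illustrate the class of transition maps in question (an illustrative sketch, not part of the original post): a monomial matrix, i.e. a permutation composed with a positive diagonal scaling, is entrywise non-negative and has an entrywise non-negative inverse, so conjugating by it preserves entrywise non-negativity.

```python
import numpy as np

# A monomial transition matrix: permutation part times positive diagonal part.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
D = np.diag([2.0, 3.0])
T = P @ D
T_inv = np.linalg.inv(T)
print(np.all(T_inv >= 0))          # the inverse is also entrywise non-negative

# Conjugating an entrywise non-negative matrix by T keeps all entries non-negative.
S = np.array([[0.2, 0.8],
              [0.5, 0.5]])
print(np.all(T_inv @ S @ T >= 0))
```

(The conjugated matrix stays entrywise non-negative; the unit-sum condition transforms along with the representation of $\Omega$ in the new basis.)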

Best Answer

Stochasticity just isn't really a linear-algebra concept; it's a probability concept. The point of the two conditions is that together they are equivalent to requiring that the matrix preserve the probability simplex

$$\{ (p_1, \dots p_n) \in \mathbb{R}^n : p_i \ge 0, \sum_{i=1}^n p_i = 1 \}$$

which is to say that the matrix sends probability distributions to probability distributions. "Probability distributions" are a specific subset of $\mathbb{R}^n$ (really $\mathbb{R}^n$ on the nose, not a generic real vector space) that we care about because we want to compute and work with probabilities (that is, specific coordinates of a vector in $\mathbb{R}^n$ on the nose, again not a generic real vector space).
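A quick numerical sketch of this (acting on distributions as row vectors, so a row-stochastic matrix acts from the right; the specific numbers are illustrative):

```python
import numpy as np

# A row-stochastic matrix sends probability distributions to probability distributions.
S = np.array([[0.2, 0.8],
              [0.5, 0.5]])

p = np.array([0.3, 0.7])   # a point of the simplex: non-negative, sums to 1
q = p @ S                  # the image distribution

# q is again non-negative and sums to 1, so it lies in the simplex.
print(q, q.sum())
```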

We can ask: what is the set of linear automorphisms of the probability simplex? And the answer is the permutation matrices (e.g. because any linear automorphism must preserve the extremal points). So we don't get any more symmetry than permuting the coordinates; we aren't dealing with a bare vector space, we are really dealing with the vector space of functions on an $n$-element set.
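This can be checked empirically (an illustrative sketch; the helper function and sample count are my own choices): a permutation matrix and its inverse both preserve the simplex, whereas a generic stochastic matrix preserves it but its inverse does not, confirming that permutations are the only linear automorphisms among these examples.

```python
import numpy as np

def preserves_simplex(M, samples=200, seed=0):
    """Empirically test whether p -> p @ M maps random simplex points into the simplex."""
    rng = np.random.default_rng(seed)
    for _ in range(samples):
        p = rng.dirichlet(np.ones(M.shape[0]))  # random point of the probability simplex
        q = p @ M
        if not (np.all(q >= -1e-9) and np.isclose(q.sum(), 1.0)):
            return False
    return True

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # permutation matrix
S = np.array([[0.2, 0.8],
              [0.5, 0.5]])   # generic row-stochastic matrix

print(preserves_simplex(P), preserves_simplex(np.linalg.inv(P)))  # both directions preserve it
print(preserves_simplex(S), preserves_simplex(np.linalg.inv(S)))  # only the forward map does
```

(The inverse of $S$ still has unit row sums, since $S^{-1}$ also fixes the all-ones vector, but it has negative entries, so it pushes some distributions out of the simplex.)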