Region in a unit simplex where two events are independent


Suppose a probability distribution over events $A, B$ is represented by a point in a unit 3D simplex (a tetrahedron) whose 4 corners correspond to the 4 “elementary” events, as shown in the figure (image not available).

EDIT:

  1. This is from a probabilistic logic book: $a$ is just “event $A$ happens.”

  2. For clarification, the coordinates in question are essentially barycentric coordinates. Each point inside a tetrahedron is a linear combination of the four vectors representing the tetrahedron’s vertices, with non-negative coefficients that sum to 1. These four coefficients are taken to be the values of $P(a \wedge b), P(\neg a \wedge b), P(a \wedge \neg b), P(\neg a \wedge \neg b)$ respectively (so they sum to 1, as required).

The book says that the green curve connecting two opposite sides of the simplex is where $A$ and $B$ are independent, but I don’t understand why. If I let $u,v,w$ be the probabilities of $a \wedge b, \neg a \wedge b, a \wedge \neg b$, then the problem probably reduces to a simple geometric one: find the region where $u=(u+v)(u+w)$ in barycentric coordinates (since $P(a \wedge b)=P(a)P(b)$). But I know nothing of geometry, so any help would be appreciated :)

Best answer

I would not break the symmetry of the setup, and instead go with $4$ (barycentric) coordinates. Say

$$t:=P(a\wedge b)\qquad u:=P(\neg a\wedge b)\qquad v:=P(a\wedge\neg b)\qquad w:=P(\neg a\wedge\neg b)\\ P(a)=t+v\qquad P(\neg a)=u+w\qquad P(b)=t+u\qquad P(\neg b)=v+w\\ 0\le t,u,v,w\le 1\qquad t+u+v+w=1$$

Now express the independence as you suggested:

$$t=P(a\wedge b)=P(a)P(b)=(t+v)(t+u)$$

But this is not a homogeneous equation, i.e. it doesn't have the same total degree in each monomial. Better multiply the left hand side by $t+u+v+w$ (which is equal to $1$) to get

\begin{align*} t(t+u+v+w)&=(t+v)(t+u)\\ t^2+tu+tv+tw&=t^2+tu+tv+uv\\ tw&=uv \end{align*}

Using similar computations you get

\begin{align*} u(t+u+v+w)&=(u+w)(t+u) & uv &= tw \\ v(t+u+v+w)&=(t+v)(v+w) & uv &= tw \\ w(t+u+v+w)&=(u+w)(v+w) & tw &= uv \end{align*}

so it's all the same single equation of degree $2$.
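As a sanity check on the algebra above, here is a minimal Python sketch using exact rational arithmetic. The helper names `joint_from_marginals` and `independence_gap` are made up for this illustration. It checks that joints built from independent marginals satisfy $tw=uv$, and that on the simplex the quantity $P(a\wedge b)-P(a)P(b)$ equals $tw-uv$.

```python
from fractions import Fraction
from itertools import product

def joint_from_marginals(p, q):
    """Joint (t, u, v, w) when P(a) = p and P(b) = q and a, b are independent."""
    t = p * q              # P(a and b)
    u = (1 - p) * q        # P(not a and b)
    v = p * (1 - q)        # P(a and not b)
    w = (1 - p) * (1 - q)  # P(not a and not b)
    return t, u, v, w

def independence_gap(t, u, v, w):
    """P(a and b) - P(a) P(b), using P(a) = t + v and P(b) = t + u."""
    return t - (t + v) * (t + u)

grid = [Fraction(k, 10) for k in range(11)]

# Independent joints lie on the surface tw = uv.
for p, q in product(grid, repeat=2):
    t, u, v, w = joint_from_marginals(p, q)
    assert t * w == u * v

# For any point of the simplex (t + u + v + w = 1), the gap equals tw - uv.
for t, u, v in product(grid, repeat=3):
    w = 1 - t - u - v
    if w >= 0:
        assert independence_gap(t, u, v, w) == t * w - u * v
```

The grid check is only a spot test on finitely many points, not a proof; the derivation above is what establishes the identity.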

As Ethan Bolker already wrote in a comment, the surface this equation describes is a hyperbolic paraboloid. Since it is a doubly ruled surface, it has two straight lines through every point, which is kind of hinted at by the green lines in your diagram.

You can easily construct a model of this using string for the lines. I once saw a model which had four sides of a cube as outer structure, using every other corner of the cube as a corner of the tetrahedron. You could fold it flat while keeping the tension of the strings so they don't tangle. The four tetrahedron edges that coincide with lines of this surface would be diagonals of the side faces, while the remaining two edges would have been diagonals of the omitted top and bottom faces of the cube.

Actually let's do this cube based approach with coordinates, using the $[-1,1]^3$ cube, to get a “proper” 3d coordinate equation. I'll use the $z$ axis as vertical, i.e. the faces $z=\pm1$ would not share a line with the surface.

$$ \begin{pmatrix}x\\y\\z\end{pmatrix}:= t\begin{pmatrix}1\\1\\1\end{pmatrix}+ u\begin{pmatrix}1\\-1\\-1\end{pmatrix}+ v\begin{pmatrix}-1\\1\\-1\end{pmatrix}+ w\begin{pmatrix}-1\\-1\\1\end{pmatrix}= \begin{pmatrix}t+u-v-w\\t-u+v-w\\t-u-v+w\end{pmatrix} $$

Keeping in mind that $t+u+v+w=1$ you can use a matrix inverse to reverse this computation.

$$\begin{pmatrix} 1&1&-1&-1\\ 1&-1&1&-1\\ 1&-1&-1&1\\ 1&1&1&1 \end{pmatrix} \begin{pmatrix}t\\u\\v\\w\end{pmatrix}= \begin{pmatrix}x\\y\\z\\1\end{pmatrix} \\ \begin{pmatrix}t\\u\\v\\w\end{pmatrix}= \frac14\begin{pmatrix} 1&1&1&1\\ 1&-1&-1&1\\ -1&1&-1&1\\ -1&-1&1&1 \end{pmatrix} \begin{pmatrix}x\\y\\z\\1\end{pmatrix}= \frac14 \begin{pmatrix}x+y+z+1\\x-y-z+1\\-x+y-z+1\\-x-y+z+1\end{pmatrix} $$
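The inverse above can be double-checked mechanically. The following is a small sketch in plain Python (no libraries beyond the standard `fractions` module); the names `M`, `M_inv`, `matmul` and `to_barycentric` are chosen here for illustration only.

```python
from fractions import Fraction

# Forward map: (t, u, v, w) -> (x, y, z, 1)
M = [
    [1,  1, -1, -1],
    [1, -1,  1, -1],
    [1, -1, -1,  1],
    [1,  1,  1,  1],
]

# Claimed inverse, including the factor 1/4
M_inv = [[Fraction(e, 4) for e in row] for row in [
    [ 1,  1,  1, 1],
    [ 1, -1, -1, 1],
    [-1,  1, -1, 1],
    [-1, -1,  1, 1],
]]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
assert matmul(M_inv, M) == identity
assert matmul(M, M_inv) == identity

def to_barycentric(x, y, z):
    """Recover (t, u, v, w) from cube coordinates via the inverse matrix."""
    return tuple(sum(row[k] * c for k, c in enumerate((x, y, z, 1)))
                 for row in M_inv)

# The cube corner (1, 1, 1) is the vertex where t = 1.
assert to_barycentric(1, 1, 1) == (1, 0, 0, 0)
```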

When you plug this into the equation of the surface, the factor of $\tfrac14$ everywhere will cancel out, so I'll omit that straight away.

\begin{align*} tw&=uv\\ (x+y+z+1)(-x-y+z+1)&=(x-y-z+1)(-x+y-z+1)\\ -x^2 - 2 x y - y^2 + z^2 + 2 z + 1 &= -x^2 + 2 x y - y^2 + z^2 - 2 z + 1 \\ 4z &= 4xy \\ z &= xy \end{align*}
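To see the result $z=xy$ on concrete numbers, here is a short Python sketch (the helper name `to_cube` is hypothetical). It builds independent joints from marginals $p=P(a)$, $q=P(b)$, maps them into the cube, and checks that $x=2q-1$, $y=2p-1$ and $z=xy$. A deliberately dependent distribution is included for contrast.

```python
from fractions import Fraction
from itertools import product

def to_cube(t, u, v, w):
    """Map barycentric (t, u, v, w) to cube coordinates (x, y, z)."""
    return (t + u - v - w, t - u + v - w, t - u - v + w)

grid = [Fraction(k, 8) for k in range(9)]
for p, q in product(grid, repeat=2):
    # Independent joint with P(a) = p and P(b) = q
    t, u, v, w = p * q, (1 - p) * q, p * (1 - q), (1 - p) * (1 - q)
    x, y, z = to_cube(t, u, v, w)
    assert x == 2 * q - 1   # x = P(b) - P(not b)
    assert y == 2 * p - 1   # y = P(a) - P(not a)
    assert z == x * y       # the point lies on the surface

# A dependent example: a and b always agree (t = w = 1/2, u = v = 0).
x, y, z = to_cube(Fraction(1, 2), 0, 0, Fraction(1, 2))
assert z != x * y
```

In these coordinates $x$ and $y$ just rescale the two marginals to $[-1,1]$, which is one way to read why independence becomes the product rule $z=xy$.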

It is easy to see that if you fix e.g. the $y$ coordinate, then the $z$ coordinate will be linear in $x$. In other words, the surface $z=xy$ intersects a plane of fixed $y$ coordinate in a line from $(-1,y,-y)$ to $(1,y,y)$. Likewise for the other direction you get lines from $(x,-1,-x)$ to $(x,1,x)$. This is the double ruling: these lines are the strings of the physical model.
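The two families of lines can also be sampled numerically. This is a rough sketch with hypothetical helper names (`ruling_fixed_y`, `ruling_fixed_x`, `to_barycentric`). It walks along segments of both families, checks that every sampled point satisfies $z=xy$, and checks that the corresponding barycentric coordinates are non-negative, i.e. that the sampled points stay inside the tetrahedron.

```python
from fractions import Fraction

def to_barycentric(x, y, z):
    """(t, u, v, w) for a point (x, y, z), using the inverse map with factor 1/4."""
    return tuple(Fraction(c) / 4 for c in
                 (x + y + z + 1, x - y - z + 1, -x + y - z + 1, -x - y + z + 1))

def ruling_fixed_y(y, s):
    """Point at parameter s in [0, 1] on the segment from (-1, y, -y) to (1, y, y)."""
    return (-1 + 2 * s, y, -y + 2 * s * y)

def ruling_fixed_x(x, s):
    """Point at parameter s in [0, 1] on the segment from (x, -1, -x) to (x, 1, x)."""
    return (x, -1 + 2 * s, -x + 2 * s * x)

steps = [Fraction(k, 6) for k in range(7)]       # parameter along a string
fixed = [Fraction(k, 3) for k in range(-3, 4)]   # which string in the family

for c in fixed:
    for s in steps:
        for x, y, z in (ruling_fixed_y(c, s), ruling_fixed_x(c, s)):
            assert z == x * y                    # on the surface
            bary = to_barycentric(x, y, z)
            assert sum(bary) == 1                # valid barycentric point
            assert min(bary) >= 0                # inside the tetrahedron
```

Again, this only samples a finite grid; it illustrates the ruling rather than proving it.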