Given a mathematical model of relations using a tensor, how do we put a restriction on the tensor so that it can capture symmetric relations?


The goal is to model real-life relations between stuff and people. Say we have sets $E$, $R$ and functions $h:E \to V$, $t:E \to W$ and $r:R \to U$, where $V,W,U$ are finite-dimensional vector spaces with potentially distinct dimensions but all over the same underlying field.

The idea is that $E$ is the set of stuff and people, and $R$ is the set of possible kinds of relations between them. For example, things in $E$ could be John Smith, table, star_wars, etc. Things in $R$ could be "is_fighting_with", "lives_in", "in_love_with", etc. The functions $h,t,r$ are just mappings that let us represent the stuff and relations as vectors. A relation instance is a tuple $(a,r,b)\in E\times R\times E$. Because an entity $a$ could be the head of the tuple or the tail of the tuple, we have the two maps $h,t$.

We can then tell how strong the relation $r$ is between $a$ and $b$ by using a tensor product with a predefined tensor $\chi$:

$\chi\times_1 h(a)\times_2 r(r) \times_3 t(b)$, where $\times_i$ is the mode-$i$ product.

You see the idea? The value of this tensor product indicates the strength of the relation between $a$ and $b$.
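In coordinates, the three mode products simply contract $\chi$ against the three vectors, leaving a scalar score. A minimal numerical sketch in NumPy; the dimensions and all values here are hypothetical placeholders:

```python
import numpy as np

# Hypothetical dimensions: dim(V) = 4, dim(U) = 3, dim(W) = 4.
rng = np.random.default_rng(0)
chi = rng.standard_normal((4, 3, 4))   # the predefined trilinear tensor chi

h_a = rng.standard_normal(4)   # h(a) in V
r_r = rng.standard_normal(3)   # r(r) in U
t_b = rng.standard_normal(4)   # t(b) in W

# Mode-i product = contract chi along axis i-1 with the given vector;
# contracting along all three modes leaves a single scalar score.
score = np.einsum('ijk,i,j,k->', chi, h_a, r_r, t_b)
print(score)
```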

Now some relations are symmetric, for example "mutually_in_love", "slept_with", "as_smart_as". Let's denote such a relation by "equals_to"; in this case, we want to put a restriction on $\chi$ such that $\chi\times_1 h(i)\times_2 r(\text{equals\_to}) \times_3 t(j)=\chi\times_1 h(j)\times_2 r(\text{equals\_to}) \times_3 t(i)$ for all $i,j \in E$.
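One concrete sufficient restriction, under the extra assumption that $h = t$ (so $V = W$): make $\chi$ symmetric under swapping its first and third modes, and the two scores agree for every pair. A sketch with made-up values:

```python
import numpy as np

rng = np.random.default_rng(2)
# Assume V = W so that modes 1 and 3 have the same dimension (hypothetical dims).
chi = rng.standard_normal((4, 3, 4))
chi = 0.5 * (chi + chi.transpose(2, 1, 0))   # symmetrize in modes 1 and 3

h_i, h_j = rng.standard_normal(4), rng.standard_normal(4)
r_eq = rng.standard_normal(3)                # embedding of "equals_to"

score = lambda x, y: np.einsum('ijk,i,j,k->', chi, x, r_eq, y)
print(np.isclose(score(h_i, h_j), score(h_j, h_i)))  # True
```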

So my question is: what restriction do we put on $\chi$?

Best Answer

The short answer is this:

The restriction you desire on $\chi$ is that $\chi: \text{Sym}^2(V)^* \times U \longrightarrow Z$; i.e., that $\chi$ be a function on the symmetric bilinear functions $\text{Sym}^2(V)^* \times U$.

But technically the restriction you desire is on $r$ and not on $\chi$. The idea is the following:

$r$ (or $R$) is itself a tensor; in particular, it is a bilinear tensor, and $\chi$ is a trilinear tensor. The relations being symmetric forces the tensor $r$ to be symmetric; $\chi$ only "implicitly inherits its symmetry from $r$."

Intuitive Explanation

If $R$ is the set of relations then $R \subset \mathcal{P}(E \times E)$, where $\mathcal{P}$ is the powerset operator (if $R$ were just one relation we would have $R \subset E \times E$); therefore it makes sense to think of $R$ as having "one extra dimension compared to a single relation." A relation is nothing more than a boolean (i.e. $0$-$1$) matrix; our set of relations is nothing more than a boolean (i.e. $0$-$1$) "3-D array." In other words, if $r' \in R$ and $e,e' \in E$ and $er'e'$ is true, then the most reasonable value to give to the coordinate $r(r')_{e,e'}$ is one; likewise if $er'e'$ is false then $r(r')_{e,e'}=0$. If all of the $r' \in R$ are symmetric then $r(r')_{e,e'} = r(r')_{e',e}$, and therefore $\chi$ can be defined as any (3-linear) function on "symmetric 3-D arrays" with the dimensions of $r$. Therefore symmetry is a restriction on the 3-D array $r$ and not on $\chi$; as a matter of fact we will eventually dispense with the superfluous function/tensor $\chi$ altogether.
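The boolean "3-D array" above can be sketched directly; the entities, relation names, and facts below are entirely hypothetical:

```python
import numpy as np

# Toy universe (hypothetical names).
E = ['john', 'mary', 'table']
R = ['slept_with', 'owns']          # slept_with is symmetric, owns is not

true_facts = {('john', 'slept_with', 'mary'),
              ('mary', 'slept_with', 'john'),
              ('john', 'owns', 'table')}

# r[i, j, k] = 1 iff "e_i r'_k e_j" is true, exactly as in the answer.
r = np.zeros((len(E), len(E), len(R)))
for i, ei in enumerate(E):
    for j, ej in enumerate(E):
        for k, rk in enumerate(R):
            if (ei, rk, ej) in true_facts:
                r[i, j, k] = 1.0

# A symmetric relation gives a symmetric slice; the other need not.
print(np.allclose(r[:, :, 0], r[:, :, 0].T))  # slept_with: True
print(np.allclose(r[:, :, 1], r[:, :, 1].T))  # owns: False
```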

Rigorous Explanation

Let's make this definition more rigorous:

  1. enumerate each $e \in E$ by a number in $\{1,...,|E|\}$,
  2. likewise enumerate each $r' \in R$ by a number in $\{1,...,|R|\}$,
  3. let $\{v_i\}$ be a basis for $V$ and $\{u_i\}$ be a basis for $U$, where $\dim (V) = |E|$ and $\dim (U) = |R|$,
  4. and let $h: E \rightarrow V$ send the element $e_i$ to $v_i$

then we define the bilinear function $r : V \times V \rightarrow U$ (i.e. "3-D" matrix/array) as \begin{equation} r(v_{i} , v_j ) = \sum_{k}r_{i,j}^ku_k \end{equation} where the coefficients are defined by \begin{equation} r_{i,j}^k = \begin{cases} 1 & \text{if } e_i r_k' e_ j \text{ is true}\\ 0 & \text{if } e_i r_k' e_ j \text{ is false} \end{cases}; \end{equation} this is well defined by linearity, since a (multi-)linear mapping is uniquely determined by its behavior on a basis, in this case the basis $v_i \otimes v_j \in V \otimes V$; see the universal mapping property for a "category theory" explanation. Technically you also need to know the universal property for direct sums of vector spaces. Intuitively, $V$ is the direct sum of the lines $L_i = \{c v_i \ | \ c \in \mathbb{F} \}$, i.e. $V = \bigoplus_i L_i$; and $V \otimes V$ is the space defined by the direct sum of the planes $L_i \otimes L_j = \{ c (v_i \otimes v_j) \ | \ c \in \mathbb{F} \}$, i.e. $V \otimes V = \bigoplus_{i,j} L_i \otimes L_j$. Saying that $r$ is a symmetric tensor is the same as saying that $r$ does not depend on the orientation of the planes $L_i \otimes L_j$; i.e. $r$ has no "right-hand rule."

The restriction that the relations be symmetric is a restriction on the tensor $r$. It is nothing more than the rule that $r(v_{i} , v_j )= r(v_{j} , v_i )$ for all $i,j$; i.e. $r$ is a symmetric bilinear tensor. We express this by saying that $r \in \text{Sym}^2(V)^* \times U$.
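Numerically, imposing this restriction amounts to keeping only the symmetric part of the coefficient array, e.g. by averaging $r$ with its transpose in the first two modes. A sketch with arbitrary placeholder values:

```python
import numpy as np

rng = np.random.default_rng(1)
r = rng.standard_normal((5, 5, 3))   # arbitrary bilinear-tensor coefficients r[i, j, k]

# Project onto the symmetric part, so r_sym(v_i, v_j) = r_sym(v_j, v_i) for all i, j.
r_sym = 0.5 * (r + r.transpose(1, 0, 2))

print(np.allclose(r_sym, r_sym.transpose(1, 0, 2)))  # True
```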

You are now free to define $\chi: \text{Sym}^2(V)^* \times U \longrightarrow Z$ (where $X^{*}$ is the dual space of $X$ and $Z$ is some other vector space) however you please; in particular, the succinct notation $\chi: \text{Sym}^2(V)^* \times U \longrightarrow Z$ expresses the fact that $\chi$ is a function on the "set of symmetric relations" $\text{Sym}^2(V)^* \times U$.

A Generalization Beyond 0-1/true-false

It seems the reason you want to define some extra parameter $\chi$ is that you would like to generalize the relations $r' \in R$ to have values other than $\{0,1\}$. This can be done directly with $r$ as well; there is no reason to introduce a superfluous $\chi$. Let me elaborate: in the definition of $r^k_{i,j}$ we could have instead written \begin{equation} r_{i,j}^k = \alpha \ \ \ \ \text{ if the relation } e_i r_k' e_ j \text{ is $\alpha$ strong} \end{equation} where $\alpha$ is a scalar, i.e. $\alpha \in \mathbb{F}$, that measures the strength of the relation $e_i r_k' e_ j$. Nothing else in the definition needs to be modified, since $r$ was already defined to be a symmetric bilinear tensor. The $\chi$ you are looking for is the $\alpha$ above.
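A sketch of this $\alpha$-valued version, with hypothetical entities, relation names, and strengths; symmetry is kept simply by listing both orientations with the same $\alpha$:

```python
import numpy as np

E = ['john', 'mary', 'table']
R = ['likes']                        # hypothetical relation name

# Real-valued strengths alpha instead of 0/1 truth values (assumed numbers).
strengths = {('john', 'likes', 'mary'): 0.9,
             ('mary', 'likes', 'john'): 0.9}   # symmetric by construction

r = np.zeros((len(E), len(E), len(R)))
for (ei, rk, ej), alpha in strengths.items():
    r[E.index(ei), E.index(ej), R.index(rk)] = alpha

print(r[0, 1, 0], r[1, 0, 0])  # 0.9 0.9
```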

Computer Science Explanation

Just in case you may be asking, "But Pedro, I really like the function $\chi$ and I don't care for all of this abstraction; can you do it with the $\chi$?", the answer is no. The rationale is that introducing the parameter $\chi$ into, say, a machine-learning algorithm would increase the complexity of the algorithm by a large linear factor: if you have a 4-D array $\chi$ that is a function of $\left(V^* \right)^{2} \times U$ (or $E^2 \times R$) then you will have four nested for-loops, as opposed to the three nested for-loops for the simpler $r^k_{i,j}$. Dropping $\chi$ altogether has practical computational implications. As a matter of fact, we can take this further: it is a well-known fact that although the general vector space $\left(V^{*}\right)^2$ of bilinear forms $T : V^{2} \rightarrow \mathbb{F}$ has dimension $[\dim (V)]^2$, the vector space $\text{Sym}^2\left(V\right)^*$ of symmetric bilinear forms $T : V^{2} \rightarrow \mathbb{F}$ has dimension $\binom{\dim (V)+1}{2}$, which further cuts the number of parameters by nearly half.
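The two dimension counts (all bilinear forms versus symmetric ones) can be checked directly, e.g. for $n = 10$:

```python
from math import comb

# Dimension counts for bilinear forms on an n-dimensional space V.
n = 10
general = n * n               # all bilinear forms: n^2
symmetric = comb(n + 1, 2)    # symmetric bilinear forms: n(n+1)/2
print(general, symmetric)     # 100 55
```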