I cannot understand how this matrix works or how it is defined


I'm currently reading Ranking a Stream of News and have trouble on page 100 (don't be afraid, the math starts at 99). I cannot understand a matrix they define.

In this article, the authors give a mathematical representation, as a graph, of a stream of news consisting of articles and sources. If I understand correctly, this paragraph is about how to determine how important edges are. An edge should be more important, for example, when it links an article to an authoritative source or when it links two articles that are highly similar to each other.

In the paragraph before, they introduced $G_\omega$ as a stream of news: a graph whose nodes are a set $N$ of articles $n_1, ...$ and a set $S$ of sources $s_1, ...$, with edge sets $E_1$ (source - article) and $E_2$ (article - article).

Let $A$ be the (weighted) adjacency matrix associated with $G_\omega$. We can attribute an identifier to the nodes in $G_\omega$ so that any source precedes the pieces of news. We define the matrix $$A = \begin{bmatrix}O & B\\B^T & \Sigma\end{bmatrix},$$ where $B$ refers to edges from sources to news articles, and $b_{ij} = 1$ if the source $s_i$ emitted article $n_j$ and $\Sigma$ is the similarity matrix. Assuming one can learn similarity of sources, the matrix $A$ can be modified in the upper-left corner incorporating a submatrix taking into account a source-source information.

I'm not able to understand this paragraph. This may partly be due to my limited English, which makes it hard for me to read, but I have problems with the math as well. Here are my questions:

  • What is $A$? What is its concrete meaning?

  • What is $O$? With "$A$ can be modified in the upper-left corner", do they mean they will change $O$?

  • What "data type" is $B$? I'd think this is a set, "edges from sources to news articles", thus $E_1$. However, they're calculating $B^T$ - what does that mean, if $B$ is a set of edges?

  • What is $T$? I don't see it anywhere else. Is it some constant like $i$ that I don't know of?

  • What is $\Sigma$? I know the sum function, but don't see how it's relevant here. I only see that in the paragraph before this one they define $\sigma_{ij}$ as the continuous similarity between article $n_i$ and article $n_j$. Would $\Sigma$ be the set of all $\sigma$, just like $N$ is the set of all $n$ (in this article)?

Nota Bene: I am not very familiar with matrices or set theory. Actually, I'm on the level of a just-finished high school student. I'd like some layman explanation, please.

Best answer

I discussed this on the Electrical Engineering chat (permalink) and ThePhoton explained it to me.

$O$ is the zero matrix. This matrix contains only zeros.

$B$ is a matrix, not a set: its rows correspond to the sources $s \in S$ and its columns to the articles $n \in N$. The entry $b_{ij}$ is 1 if source $s_i$ emitted article $n_j$ (and presumably 0 otherwise), so this matrix shows which source produced which article.

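$B^T$ is the transpose of $B$: $B$ mirrored across its main diagonal, so that rows become columns and vice versa. The $T$ is not a constant; it is just the notation for transposing.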
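
As a small made-up illustration (the numbers are mine, not from the paper): take two sources $s_1, s_2$ and three articles $n_1, n_2, n_3$, where $s_1$ emitted $n_1$ and $n_2$, and $s_2$ emitted $n_3$. Assuming that rows of $B$ index sources and columns index articles, as the definition of $b_{ij}$ suggests, and that entries are 0 where a source did not emit an article, this would give

$$B = \begin{bmatrix}1 & 1 & 0\\0 & 0 & 1\end{bmatrix}, \qquad B^T = \begin{bmatrix}1 & 0\\1 & 0\\0 & 1\end{bmatrix}.$$

Each row of $B$ has become a column of $B^T$.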

$\Sigma$ is the similarity matrix, not a sum. Its entry $\sigma_{ij}$ tells you how similar articles $n_i$ and $n_j$ are.

Now for the meaning of this sentence:

Assuming one can learn similarity of sources, the matrix $A$ can be modified in the upper-left corner incorporating a submatrix taking into account a source-source information.

We saw that $\Sigma$ is the similarity matrix for articles. We could also compute the similarity of sources. By default this is left out: the upper-left corner of $A$ holds a zero matrix. If you can compute the similarity of sources, you can replace $O$ with $\Sigma_s$, a matrix describing how similar the sources are to each other.
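
Written out, and using $A'$ as my own name for the modified matrix (the quoted paragraph does not name it), the replacement would look like this:

$$A' = \begin{bmatrix}\Sigma_s & B\\B^T & \Sigma\end{bmatrix}$$

Everything else stays the same; only the upper-left block changes.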

Now we can see that $A$ is a matrix containing both the relations between sources and articles (which articles were produced by which sources), and the relations between different articles (how similar they are), and, if $O$ is replaced, the relations between different sources (how similar they are).
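
If it helps to see the block structure concretely, here is a minimal sketch in plain Python that glues the four blocks together. The data is invented for illustration (2 sources, 3 articles), the function names are hypothetical, and the zero diagonal of the similarity matrix is my own assumption; the paper may use a different convention.

```python
# Hypothetical toy data: 2 sources, 3 articles (values invented for illustration).
B = [
    [1, 1, 0],  # s_1 emitted n_1 and n_2
    [0, 0, 1],  # s_2 emitted n_3
]

# Article-article similarities sigma_ij. The diagonal is set to 0 here
# (no self-loops), which is an assumption, not something the paper states.
Sigma = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def zeros(n):
    """Return an n-by-n zero matrix (the block called O)."""
    return [[0] * n for _ in range(n)]


def block_matrix(top_left, top_right, bottom_left, bottom_right):
    """Glue four blocks into one matrix: [[TL, TR], [BL, BR]]."""
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


def build_adjacency(B, Sigma, source_similarity=None):
    """Build A = [[O, B], [B^T, Sigma]]; optionally replace O with Sigma_s."""
    n_sources = len(B)
    top_left = source_similarity if source_similarity is not None else zeros(n_sources)
    return block_matrix(top_left, B, transpose(B), Sigma)


A = build_adjacency(B, Sigma)
for row in A:
    print(row)
```

The first two rows and columns of the result belong to the sources and the last three to the articles, which matches the remark that sources are numbered before the pieces of news. Passing a `source_similarity` matrix corresponds to replacing $O$ with $\Sigma_s$.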