What centrality measure can be used to rank important nodes relative to a particular node in a knowledge graph?

84 Views Asked by At

Given a knowledge graph $K$ and a particular node in it called $N$, I want to find out the set of nodes that are important related to the node $N$ only. I started looking up some papers and the documentations for centrality in Networkx 2.4. It seems to me, the different centrality values are calculated on the basis of the entire network. Is there some centrality which return only the important nodes relative to a particular node only?

As an extension to this question, for the same knowledge graph $K$ and node $N$, let us assume the node contains features ${f_1,f_2,f_3}$. Is there any centrality measure that I can use to find the set of nodes/subgraph that are important with respect to a particular feature of the node (say $f_1$)?

1

There are 1 best solutions below

0
On BEST ANSWER

First of all, when doing network analysis, it’s much more important to actually captures your property of interest than to use a measure that already exists. Your first thought should therefore be: “Can I describe the property I am interested in in my own terms? Can I somehow quantify this?” Often there is an easy answer to this. Of course, after this, you should also investigate whether somebody already had a similar or better idea to see what you can learn from this. But please do not frantically search for existing network measures to blindly slap on your network.

That being said, I do not think that centrality is a good keyword here (though centralities may be employed on the way and inspire you). Some ideas that come to mind:

  • The most blatant way to check whether one node is relevant for another is whether there is a link between the nodes (or the weight of the link in a weighted network).

  • More generally you can look at the distance of nodes to your target node in terms of shortest paths (closeness). Depending on the application, it may also make sense to account for multiple paths going to the same target.

  • You could translate concepts from betweenness centrality and count the number of shortest paths to your target node that go through a node.

  • In general, you can aggregate the closeness and the plain centrality of a node, i.e., you say that nodes are important when they are close to your target node and important for the entire network.

  • Translating concepts from eigenvector centrality, you can interpret your network as a flow (e.g., of information) and see where information from your node of interest flows to. Mathematically, simply start with a vector $e_N$ that is $1$ in the component corresponding to your node of interest and $0$ everywhere else. Then have a look at $A·e_N$, $A^2·e_N = A·A·e_N$, and so on, where $A$ is the adjacency (or weight) matrix of your network. ($\lim_{k→∞} A_k·e_N$ is mostly independent of where you start and gives you the eigenvector centralities.) You could combine these with a weighted sum such as:

    $$ \sum_{i=1}^∞ \frac{1}{i} A^i e_N. $$

    Analogously, you can also trace the flow backwards by looking at $A^{-1}·e_N$, $A^{-2}·e_N$, and so on. (If you do this, do not actually compute the matrix $A^{-1}$, but solve linear equation systems instead.)

What of this makes sense for you at the end of the day strongly depends on your application and research question. Only you can answer this.