What is an example of a nearest neighbor relationship being non-symmetric?

383 Views Asked by At

Context

I'm the new guy on my team, and the one with the most (but not much) math knowledge. I've been tasked to check an intern's work on implementing spectral clustering correctly in code. In that direction, I'm reading this pdf about spectral clustering and the kinds of similarity graphs that can be used in practice. It mentions a KNN graphs with the caveat that the neighborhood relationship is not symmetric, so the resulting graph is directed and needs to be made so.

So I read about nearest neighbor graphs:

The nearest neighbor graph (NNG) for a set of n objects P in a metric space (e.g., for a set of points in the plane with Euclidean distance) is a directed graph with P being its vertex set and with a directed edge from p to q whenever q is a nearest neighbor of p (i.e., the distance from p to q is no larger than from p to any other object from P).1

Which had me read about neighborhoods, open balls and open sets.

All of this is to say that I hope I've done enough work/research to ask for an example:

My question:

When (or under what conditions) is a point $p$ a neighbor of $q$ but $q$ not a neighbor of $p$? Let's say that all these points are in $\mathbb{R}^2$ and the measure is symmetric, like Euclidean or Manhattan distance, which is the case for my particular application.

If you can't tell from my wording, I'm not a math genius. Answers that don't rely on an advanced understanding of topology or complicated notation are appreciated. A simple example will do. I'm okay with basic proofs too.

2

There are 2 best solutions below

3
On BEST ANSWER

Other folks answered your real question, but I figured I'd also answer the question you asked.

If we are in a space $X$ with metric $d$, and we define (a basis for) the open sets of this space to be the $\epsilon$-balls about all points in $X$, $$ B_d(x,\epsilon) = \{ y \, | \,d(x,y) < \epsilon \}, $$ then because of the symmetry property of the metric $d(p,q) = d(q,p)$, it follows that $p \in B_d(q,\epsilon) \Leftrightarrow q \in B_d(p,\epsilon)$. In other words, if you define your neighborhoods relative to the metric on a space, then having $p$ in a neighborhood of $q$ implies that $q$ is in a neighborhood of $p$.

As is pointed out in Noah Schweber's answer, the "nearest" part of the definition of a "nearest neighbor graph" is what makes the relation non-transitive. In the notes you linked on spectral clustering, the notion of an "$\epsilon$-neighborhood graph" (p. 3) corresponds to the notion of two points being within some standard-size neighborhood of each other without necessarily being each other's nearest neighbors; and the relationship so created is symmetric.

There are other ways to define neighborhoods than using a metric function, but in general they still have this property. In the context of point-set topology, we just have a space $X$ and a collection of subsets of $X$ that we call "open sets", along with certain axioms (the union of open sets is an open set, etc.) In this context, a "neighborhood of a point $p$" is just an open set containing $p$. But even in this context, saying that "$q$ is in a neighborhood of $p$" implies that "$p$ is in a neighborhood of $q$", since both statements are just saying that there exists an open set containing both $p$ and $q$.

1
On

We don't even have to go to $\mathbb{R}^2$ - this already happens in $\mathbb{R}$!

Consider for example $0,1,$ and $1000$ as points on the number line. Clearly the nearest neighbor of $1000$ is $1$, but the nearest neighbor of $1$ is not $1000$.

Note that thinking about this more, it's in fact very rare for the nearest neighbor relation to be symmetric. A good exercise is to figure out exactly when the nearest neighbor relation in a set of three points on the number line is symmetric. (Note that there's a subtlety here around what happens if one point is equidistant from the other two.) For simplicity, first consider the case where the first two points are at $0$ and $1$, and the third point is $>1$; the general idea will be the same.