How can I find a villain's hideout given a set of previous locations? (Or, how can I identify the centroid of a cluster of datapoints?)

Question


148 Views Asked by Bumbble Comm At 28 Mar 2026 - 3:21

Imagine this... Batman has just retrieved a tracking device he placed on The Joker 150 days ago. The good news is that it has 150 coordinates — one from each day. The bad news is that all the data is randomly sorted — there's no way to tell when the coordinates were recorded, nor their sequence. Further, all the data was collected at random times during the day so we can't even be sure any of the points were actually taken at the hideout — it might very well be in between some of them. How can we help Batman find the secret hideout?

Here's a map of the dataset: http://batchgeo.com/map/c3676fe29985f00e1605cd4f86920179

Here's a pastebin of raw 150 geocodes: http://pastebin.com/grVsbgL9

In math terms, I'm looking for help identifying the centroid of a complex cluster of data. As you'll notice in this data set, there are several clusters (San Francisco, LA, Chicago and NYC) along with lots of noise throughout the rest. I need to determine which cluster is primary, and identify the centroid of this cluster.

Can you recommend a strategy? Preferably one with some meat I can use to begin analyzing the data for the "secret hideout"? ;)

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2013-06-20 04:42:09

Here's a heuristic that has no scientific basis whatsoever (as far as I know). Its virtue is that it's easy to program.

Let $d_{ij}$ be the distance from point $P_i$ to point $P_j$, $(1 \le i,j \le n)$.

(1) Compute the average $d$ of all the $n^2$ $d_{ij}$ values.

(2) Choose some factor $k$; I'd suggest around 0.1, but you can experiment.

(3) Let $r = k \cdot d$ be a "threshold" radius.

(4) For each $i$, find the count $c_i$ of other points that are within a distance $r$ from point $P_i$.

(5) Any point $P_i$ that has a high value of $c_i$ is a good candidate for the hideout, because it has lots of other points nearby.

If you think you can guess a good value for $r$, then you can skip steps (1) and (2).
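The steps above could be sketched in Python roughly as follows. This is only an illustration: the function names `density_counts` and `best_candidate` are made up for this example, and it treats the coordinates as points in a plane with Euclidean distance. For real latitude/longitude data you would probably want to pass a great-circle distance function as `dist` instead.

```python
import math

def density_counts(points, k=0.1, r=None, dist=math.dist):
    """Return (r, counts); counts[i] is the number of OTHER points within r of points[i].

    Assumption: `dist` defaults to planar Euclidean distance, which is only
    a rough stand-in for distances between geocodes.
    """
    n = len(points)
    # All n^2 pairwise distances d_ij (the diagonal entries are zero).
    d = [[dist(p, q) for q in points] for p in points]
    if r is None:
        mean_d = sum(map(sum, d)) / (n * n)  # step (1): average of all d_ij
        r = k * mean_d                       # steps (2)-(3): threshold radius
    # Step (4): count the other points within distance r of each point.
    counts = [sum(1 for j in range(n) if j != i and d[i][j] <= r)
              for i in range(n)]
    return r, counts

def best_candidate(points, **kwargs):
    """Step (5): return the point with the highest neighbour count (first one on ties)."""
    r, counts = density_counts(points, **kwargs)
    i = max(range(len(points)), key=counts.__getitem__)
    return points[i], counts[i], r
```

Calling `best_candidate(points)` on a list of `(x, y)` tuples returns one high-count point; in practice it is probably worth looking at all points with large counts, and trying several values of `k`, before trusting any single candidate.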

**Bumbble Comm** · Answer 2 · 2013-06-20 05:20:28

If I'm not mistaken, you are looking for the geometric median of a set of points.

One way to approach this is to build a physical model as described in the Background Example here. I believe this technique dates back to Gauss, who described it in an unpublished letter to Schumacher (for the case in which there are only $4$ different points; but it extends naturally to the case with many points).

With regard to tackling the problem computationally, you might try Weiszfeld's algorithm. See, for example, this wikipage section, which includes references to more recent approaches as well.
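For a rough idea of what Weiszfeld's algorithm looks like in code, here is a minimal sketch. It repeatedly replaces the current estimate with a weighted average of the data points, where each weight is the inverse of that point's distance to the estimate. The function name `geometric_median`, the tolerance, the iteration cap, and the small `eps` clamp (used to avoid dividing by zero when the estimate lands on a data point) are all choices made for this example, not part of any standard API. It also assumes planar Euclidean distance, which is only an approximation for geocodes.

```python
import math

def geometric_median(points, tol=1e-9, max_iter=1000, eps=1e-12):
    """Approximate the geometric median of `points` by Weiszfeld iteration."""
    dim = len(points[0])
    # Start from the centroid (arithmetic mean) of the points.
    y = [sum(p[c] for p in points) / len(points) for c in range(dim)]
    for _ in range(max_iter):
        num = [0.0] * dim
        denom = 0.0
        for p in points:
            # Clamp the distance so a point that coincides with the current
            # estimate does not cause a division by zero (a simple safeguard,
            # not a full treatment of that special case).
            w = 1.0 / max(math.dist(p, y), eps)
            denom += w
            for c in range(dim):
                num[c] += w * p[c]
        new_y = [v / denom for v in num]
        moved = math.dist(new_y, y)
        y = new_y
        if moved < tol:
            break  # estimate has stopped moving
    return tuple(y)
```

Note that the geometric median of all 150 points is pulled toward every cluster at once, so for this dataset it may make sense to first isolate the dominant cluster and then apply the iteration to that subset only.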
