How can I geometrically (or geographically) group items together?


I'm a programmer, and I'm working on a project that takes a bunch of photos and separates them into groups by their GPS coordinates. I have no experience in things like geometric group theory, so I'm not even sure that's the field that would help me with this project. Regardless, I just want to figure out how I can decide mathematically (and then programmatically) when a photo should be in the same group as others.

Obviously, the easy way would be to say that if a photo was taken within a certain distance of another photo, it should be in the same group. Realistically, however, some groups will span a greater geographical area (e.g., photos taken on a boating trip around a lake would all be in one group, but photos taken in the small area of my house would be in a different group from those taken in the house down the street, even though the geographical span of the lake would surround both my house and the one down the street).

Along with the geographical grouping, I plan to group my pictures by time as a way to narrow the groups further (e.g., photos taken at the corner restaurant today would be in a different group from those taken next week at the same restaurant).

I guess the part I'm having a hard time with is deciding how big a span those groups should cover. If I'm looking at a map with a bunch of points, or a timeline with a bunch of points, it's easy to draw lines to group things off. But how do I do so mathematically/programmatically? I'm sure it has something to do with how many items there are in a geographical span (e.g., 100 items spread out along a 1 km length of street should be together, but 2 items at either end of the same street with nothing in between should be in two different groups), but I'm still at a loss as to where to go from here.

Thanks for your help!

Best answer

What you want to do is called clustering, and in a general setting it's a hard problem. The subject is well studied. Although your particular application might have properties that allow you to do something better, I recommend that you first apply some known methods and only then tweak them or build a hybrid approach.
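As one example of a known method, a density-based algorithm such as DBSCAN fits the intuition from the question: many photos strung along a street join into one group, while isolated photos do not. DBSCAN is my suggestion here, not something the question asks for, and the code below is only a minimal pure-Python sketch. The coordinates and the parameters `eps_m` (neighbourhood radius in metres) and `min_pts` (neighbours needed to count as "dense") are made-up values for illustration.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))

def dbscan(points, eps_m, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    # O(n^2) neighbour lists: fine for a sketch, too slow for huge collections.
    neighbours = [[j for j in range(n)
                   if haversine_m(points[i], points[j]) <= eps_m]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1                # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is core too: keep expanding
    return labels

# Hypothetical data: 21 photos roughly 39 m apart along a street,
# plus 2 lone photos about 770 m from each other somewhere else.
street = [(45.0, 7.0 + k * 0.0005) for k in range(21)]
lonely = [(46.0, 7.0), (46.0, 7.01)]
labels = dbscan(street + lonely, eps_m=50, min_pts=3)
```

With these values the street photos share one label and the two lone photos are marked as noise (`-1`), which you could treat as single-photo groups. A real project would more likely use a library implementation and tune the parameters on real photos.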

It is worth mentioning that the more (reasonable) signals you have, the better your partition will be. For example, you could use some kind of pattern recognition to join pictures taken at sea even if they are spread out over time.

Finally, be aware that there is no single correct way to cluster objects. Three main reasons are:

  • You do not know how fine the clustering should be. Would you like to join all the pictures taken at sea together, or split the voyage into smaller sections?
  • You do not know which dimension to bring up first. Should it be the voyage, or should the pictures of the cat that traveled along the whole way be grouped separately?
  • Users will probably want one picture to belong to several clusters. This might be easier or harder depending on the situation, but it is worth checking whether you would like to have that kind of functionality in the future.
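To make the first point concrete, here is a small sketch of how one tunable threshold changes the granularity. It uses the simplest possible rule on a timeline (start a new group whenever the gap to the previous photo is too large); the function name `split_by_gap` and the sample times are hypothetical.

```python
def split_by_gap(timestamps, max_gap):
    """Sort the timestamps and start a new group at every gap larger than max_gap."""
    groups = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][-1] <= max_gap:
            groups[-1].append(t)   # close enough to the previous photo
        else:
            groups.append([t])     # gap too large: open a new group
    return groups

# Hypothetical voyage: photo times in minutes since departure.
times = [0, 5, 9, 60, 64, 70, 300, 304]
coarse = split_by_gap(times, max_gap=120)  # 2 groups
fine = split_by_gap(times, max_gap=30)     # 3 groups
```

Neither result is "the" right one: the same data gives two groups or three depending only on `max_gap`, so the threshold has to come from what the user expects a group to be.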

In your case it's a bit easier, since you have already said that you want to use geographical data. But don't be fooled: you will have a lot of "fun" just balancing location against time.
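One possible way to balance the two, sketched below, is to call two photos neighbours only when they are close in both space and time, and then take connected groups of neighbours. This is just an assumption about what a reasonable rule looks like; `group_photos`, the thresholds `eps_m` and `eps_s`, and the sample data are all hypothetical.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))

def group_photos(photos, eps_m, eps_s):
    """photos: list of (lat, lon, unix_seconds). Returns a group id per photo."""
    parent = list(range(len(photos)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (la1, lo1, t1) in enumerate(photos):
        for j in range(i + 1, len(photos)):
            la2, lo2, t2 = photos[j]
            # Neighbours must be close in time AND in space.
            if abs(t1 - t2) <= eps_s and haversine_m(la1, lo1, la2, lo2) <= eps_m:
                parent[find(i)] = find(j)

    # Renumber the roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(photos))]

# Hypothetical data: the same restaurant, visited today and again a week later.
week = 7 * 86400
photos = [(48.8566, 2.3522, 0), (48.8566, 2.3522, 600),
          (48.8566, 2.3522, week), (48.8566, 2.3522, week + 300)]
groups = group_photos(photos, eps_m=100, eps_s=3 * 3600)
```

Here the two visits end up in different groups even though the location is identical, because the photos are more than `eps_s` apart in time. The hard part remains picking the two thresholds, or replacing them with a single combined distance.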

I hope I didn't scare you, and I wish you good luck! ;-)