Given all zip codes in the US with their latitude and longitude, how can I find the zip codes within a certain mile radius?


Summary: I am creating a website where a user searches for a zip code and sees results in that zip. I am trying to add a feature that lets them see posts within a certain mile radius of their zip (the feature you see on real estate sites, job boards, etc.). I spend most of my time on Stack's programming sites, but I feel this is a math issue, so I am hoping someone can help me.

I have a file with every zip code in the US and its latitude and longitude. For example: zip code 99929 has latitude = 56.370751 and longitude = -131.693301. So say a user has searched zip code '99929' and wants to see all posts within 20 miles of this zip code. If you have read this far and already have a solution, you can skip what is below, which is an explanation of what I have written so far in Python.

The following can take a starting latitude and longitude, along with an ending latitude and longitude, and calculate the distance between them:

from math import radians, sin, cos, acos

slat = radians(float(input("Starting latitude: "))) # input() gets the starting latitude, float() converts it to a floating point number and radians() converts it from degrees to radians
slon = radians(float(input("Starting longitude: ")))
elat = radians(float(input("Ending latitude: ")))
elon = radians(float(input("Ending longitude: ")))

#calculations (spherical law of cosines, mean Earth radius in km)
dist_in_km = 6371.01 * acos(sin(slat)*sin(elat) + cos(slat)*cos(elat)*cos(slon - elon))
miles_per_kilometer = 0.621371
dist_in_miles = dist_in_km * miles_per_kilometer

Can anyone think of a way to implement this when slat, slon and dist_in_miles are known beforehand (20 miles in the above example), so that instead of calculating dist_in_miles it determines the elat and elon?

Thanks in advance for any help, this has been driving me crazy.

Best answer

Math answer:

Do the calculations with the map as a globe rather than a rectangle. Distance between two grid points on a rectangular map of the world is messy and complicated, since grid squares do not cover the same area on the ground even though they are often drawn at the same size. Instead, calculate the distance from your location to each zip code directly and keep everything within the set distance.
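One thing worth knowing about the globe approach: the straight-line distance between two (x, y, z) points is the chord through the Earth, not the distance along the surface. A rough sketch of how the two relate, assuming a spherical Earth of radius 3959 miles; `chord_to_arc` and `arc_to_chord` are hypothetical helper names, not part of any library:

```python
import math

EARTH_RADIUS_MILES = 3959  # assumed mean Earth radius, in miles

def chord_to_arc(chord_miles, radius=EARTH_RADIUS_MILES):
    """Convert a straight-line (chord) distance to the distance along the surface."""
    return 2 * radius * math.asin(chord_miles / (2 * radius))

def arc_to_chord(arc_miles, radius=EARTH_RADIUS_MILES):
    """Convert a surface (great-circle) distance to the chord through the sphere."""
    return 2 * radius * math.sin(arc_miles / (2 * radius))

# Compare the two measures for a short and a long distance
print(chord_to_arc(20))
print(chord_to_arc(2000))
```

For radii of a few tens of miles the two values are practically identical, so filtering on the chord length is a reasonable shortcut. For very large radii, converting the search radius with `arc_to_chord` first would keep the filter exact.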


Programming implementation example:

Assuming you are reading your zip code location data from a *.csv file that contains columns 'lat' and 'lon' to denote position (these can be changed easily enough to fit your specific needs):

import math
import pandas as pd

EARTH_RADIUS_MILES = 3959  # mean Earth radius, so all distances come out in miles

def convert_latlon_to_cartesian(lat, lon):
    # lat and lon are in degrees; returns (x, y, z) in miles from the Earth's center
    lat_rad = math.radians(lat)
    lon_rad = math.radians(lon)
    x = EARTH_RADIUS_MILES * math.cos(lat_rad) * math.cos(lon_rad)
    y = EARTH_RADIUS_MILES * math.cos(lat_rad) * math.sin(lon_rad)
    z = EARTH_RADIUS_MILES * math.sin(lat_rad)
    return (x, y, z)

def convert_cartesian_to_latlon(x, y, z):
    r = math.sqrt(x**2 + y**2 + z**2)
    # asin yields a latitude in [-90, 90] and atan2 a longitude in [-180, 180],
    # so no wrap-around correction is needed
    lat = math.degrees(math.asin(z / r))
    lon = math.degrees(math.atan2(y, x))
    return (lat, lon)

def cartesian_distance(x1, y1, z1, x2, y2, z2):
    # straight-line distance through the Earth (chord length), in miles
    return math.sqrt((x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2)

zipcodesdf = pd.read_csv('MyZipCodeFile.csv')

# input() returns strings, so convert to float before doing any math
slat = float(input('What latitude-longitude would you like to search at?\nLat: '))
slon = float(input('Lon: '))
dist = float(input('What mile radius would you like to limit results to?\nDist: '))

sx, sy, sz = convert_latlon_to_cartesian(slat, slon)

# work on a copy so the original dataframe is left untouched
temporarydf = zipcodesdf.copy()

#populate new fields with cartesian (x,y,z) coordinates and calculate distance from original point
# (rows yielded by iterrows() are copies, so the columns are built as lists and assigned whole)
coords = [convert_latlon_to_cartesian(lat, lon)
          for lat, lon in zip(temporarydf['lat'], temporarydf['lon'])]
temporarydf['x'] = [c[0] for c in coords]
temporarydf['y'] = [c[1] for c in coords]
temporarydf['z'] = [c[2] for c in coords]
temporarydf['dist'] = [cartesian_distance(sx, sy, sz, x, y, z) for x, y, z in coords]

#get back a dataframe containing only entries within set distance of target
outputdf = temporarydf.loc[temporarydf['dist'] <= dist]
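
If the file is large, a cheap latitude/longitude bounding box can discard most rows before any exact distance is computed, which also speaks to the question of finding the elat and elon limits for a given radius. The following is only a sketch under stated assumptions: a spherical Earth of radius 3959 miles, and a search circle that does not touch a pole or cross the 180° meridian. The names `bounding_box`, `great_circle_miles` and `zips_within` are hypothetical, it uses plain lists of dicts rather than pandas, and apart from the 99929 row taken from the question the sample rows are made up for illustration.

```python
import math

EARTH_RADIUS_MILES = 3959  # assumed mean Earth radius, in miles

def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees enclosing the search circle."""
    r = radius_miles / EARTH_RADIUS_MILES  # angular radius in radians
    dlat = math.degrees(r)
    # Meridians converge towards the poles, so the longitude span grows with latitude
    dlon = math.degrees(math.asin(math.sin(r) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Surface distance between two points in degrees, using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating point overshoot before asin
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))

def zips_within(rows, lat, lon, radius_miles):
    """Return the 'zip' of every row within radius_miles of (lat, lon)."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_miles)
    hits = []
    for row in rows:
        # Cheap rectangular pre-filter
        if not (min_lat <= row['lat'] <= max_lat and min_lon <= row['lon'] <= max_lon):
            continue
        # Exact check for the rows that survive the pre-filter
        if great_circle_miles(lat, lon, row['lat'], row['lon']) <= radius_miles:
            hits.append(row['zip'])
    return hits

# Sample data: the first row is from the question, the other two are made up
sample_rows = [
    {'zip': '99929', 'lat': 56.370751, 'lon': -131.693301},
    {'zip': 'EXAMPLE-NEAR', 'lat': 56.470751, 'lon': -131.693301},
    {'zip': 'EXAMPLE-FAR', 'lat': 40.0, 'lon': -100.0},
]
print(zips_within(sample_rows, 56.370751, -131.693301, 20))
```

The same box limits could equally be applied as a filter on the 'lat' and 'lon' columns of the dataframe before computing the distance column.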