Finding median value based on specific ranges and percentages

44 Views Asked by At

This might be something really easy or something impossible, so sorry if my question does not make sense though I feel like it should be done somehow easily.

I have the following set of given data:

Earnings (dollars) % of population
Less than 10,000 6
 10,000 to 14,999 3.9
15,000 to 24,999 7.5
25,000 to 34,999 7.8
35,000 to 49,999 11.3
50,000 to 74,999 16.8
75,000 to 99,999 12.8
100,000 to 149,999 16.3
150,000 to 199,999 7.9
200,000 or more 9.8
- -
Median income 69717
Mean income 97962

What I need to find out is the median value of a specific bracket, let's say for people earning 50k+. I know median means the middle of the range so, I can estimate the median by halving the sum of the percentages of people earning 50k+ as follows:

(16.8 + 12.8 + 16.3 + 7.9 + 9.8) / 2 = 31.8

31.8 correlates with "a bit more" of (16.8 + 12.8) = 29.6 , so I can roughly estimate the median earning of people that earn 50k+ is "a bit more" than 100k. I basically need to transform this manual "correlation" into a formula.

I will have the same type of table for hundreds of times with different data, in MS Excel, and I need a formula that can estimate the median of a specific bracket with as little error as possible.

I hope I was clear enough and thank you in advance for any input!!

1

There are 1 best solutions below

0
On

If you take the cumulative sum of the percentages of your table, you get a sequence of values of the CDF at some number, say $N$, of discrete points. If you have enough points, you can attempt to approximate the CDF by fitting it to an appropriate family of functions defined by no more than $N$ parameters (being careful about the requirement that a CDF is necessarily non-decreasing and that it attains the values 0 and 1 at the endpoints of the domain on which it is defined). After you have done that, the kinds of median calculations you want to do can be done either analytically (for some families) or at least numerically.