calculating the median in iWorks Numbers from a dictionary of values

168 Views Asked by At

I need to calculate the median among some numbers which I have in iWorks numbers. Since I have to do this several times (a few hundreds) I really cannot do it by hand. The numbers reach me (from the website) as values in n columns. With cell i, j holding the number of people who voted j at the question i. I need to extract the media. In iWorks Numbers they offer the Median function.

The MEDIAN function returns the median value in a collection of numbers. The median is the value where half the numbers in the set are less than the median and half are greater.

MEDIAN (num-date-dur, num-date-dur…)

So basically to use this function I would have to have a list of what the people have voted. While I have in a way similar to a dictionary. For example those votes as a list (2,1,2,2,2,1,0,1) are expressed in dictionary form as (1:0, 3:1, 4:2) and what I actually receive is (1, 3, 4) on the columns 0, 1, 2.

1

There are 1 best solutions below

0
On

Not having iWorks, I will explain the process in a way that should be possible to implement in any software that claims to handle spreadsheets. For ease of references, here is a screenshot:

bot

  • Row 1 contains numbers 0,1,2,3...
  • Row 2 contains the number of people who voted 0,1,2,3... For example, 2 people gave the vote of 3. This is your input data.
  • Row 3 contains cumulative sum of row 2. I put 0 in A3 and the rest of the row is calculated with B3=A2+A3, C3=B2+B3, etc.
  • Row 4 is where the magic happens. The magic is in the conditional statement =IF(A3<=$G3/2,IF(B3>=$G3/2,A1;""),""), and similar in other columns. The conditional statement checks whether the 1st row value (say, $x$) is such that no more than half of people voted $<x$, and no more than half of people voted $>x$. Such values get recorded in row 4. If the conditional statement is not satisfied, the cell is left blank.
  • Row 5 has a single formula G5=AVERAGE(A4:F4), which returns the median. Here it is crucial that the Average function ignores non-numeric cells.

As another test, increasing the number of people who voted 5 shifts the median:

enter image description here

(In principle, one could get rid of 3rd row at the cost of a more complicated, and less computation-efficient, conditional statement in row 4.)