How do you find the mean proximity of two clusters using the Manhattan distance or the Euclidean distance?


Question

[Image: problem statement, not reproduced here]

Solution

[Image: quoted solution, not reproduced here]

I don't understand how the mean proximity is calculated here. The solution says to take the average of the $x$ components and add it to the average of the $y$ components of these $16$ distances. I thought this meant averaging all the $x$ coordinates, doing the same for the $y$ coordinates, and adding the two results, but that's not the case. In general, how does one compute the mean proximity of $2$ clusters using the Manhattan distance?

Answer

That’s a terrible text you have there; it’s full of grammatical mistakes and it adds the distances in a different order than it listed them before. If I were you I’d switch to a different text and/or course if at all possible.

The mean distance between the clusters is the distance averaged over all pairs consisting of one point from each cluster; here there are $4\times4=16$ such pairs. For the Manhattan distance, this is

\begin{eqnarray} |A-B|&=&\frac1{16}\sum_{ij}|a_i-b_j| \\ &=& \frac1{16}\sum_{ij}\left(|a_{i1}-b_{j1}|+|a_{i2}-b_{j2}|\right)\;. \end{eqnarray}

The solution you quote computes this as

$$ \frac1{16}\sum_{ij}|a_{i1}-b_{j1}|+\frac1{16}\sum_{ij}|a_{i2}-b_{j2}|\;. $$

It’s unnecessarily hard to understand because of the change in order. For instance, taking the second index to denote the coordinate, $|a_{11}-b_{41}|=5$, $|a_{21}-b_{41}|=5$, $|a_{31}-b_{41}|=7$, $|a_{41}-b_{41}|=7$, which yields the first term in the solution.
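To make the two ways of writing the sum concrete, here is a minimal Python sketch. The clusters `A` and `B` are hypothetical points chosen for illustration, not the data from the question, and the function names are my own. It computes the mean Manhattan distance over all $16$ pairs directly, then again as the mean $|\Delta x|$ plus the mean $|\Delta y|$, and the two agree. A Euclidean version is included for comparison; it averages over the same pairs, but it cannot be split by coordinate because of the square root.

```python
import math
from itertools import product

def mean_manhattan(A, B):
    """Mean Manhattan distance over all pairs (a, b), a in A and b in B."""
    pairs = list(product(A, B))
    total = sum(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in pairs)
    return total / len(pairs)

def mean_manhattan_by_coordinate(A, B):
    """Same quantity, computed as mean |dx| plus mean |dy| over all pairs."""
    pairs = list(product(A, B))
    mean_dx = sum(abs(a[0] - b[0]) for a, b in pairs) / len(pairs)
    mean_dy = sum(abs(a[1] - b[1]) for a, b in pairs) / len(pairs)
    return mean_dx + mean_dy

def mean_euclidean(A, B):
    """Mean Euclidean distance over all pairs; does not split by coordinate."""
    pairs = list(product(A, B))
    return sum(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in pairs) / len(pairs)

# Hypothetical clusters (assumed for illustration, NOT the question's data).
A = [(0, 0), (0, 2), (2, 0), (2, 2)]
B = [(5, 1), (5, 3), (7, 1), (7, 3)]

print(mean_manhattan(A, B))                # 6.5
print(mean_manhattan_by_coordinate(A, B))  # 6.5 (5.0 from x, 1.5 from y)
print(mean_euclidean(A, B))
```

Note that the averages are taken over the $16$ coordinate differences $|a_{i1}-b_{j1}|$ and $|a_{i2}-b_{j2}|$, not over the raw coordinates of the points, which is where the reading described in the question goes wrong.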