Calculating the best match between two sets

Question


860 Views Asked by Bumbble Comm At 28 Mar 2026 - 7:05

I’m a PHP developer and I have a problem calculating the perfect match between two different data sets.

I have data sets from companies, where each company defines the requirements for a specific job. I also have data sets from users, where each user can define a data set describing their skills.

Both data sets can hold values between $1$ and $12$.

Here’s an example of two data sets:

$\begin{align*} \text{Company} & \to [\phantom{1}4, \phantom{1}8, 12, \phantom{1}4, 10] \\ \text{User} & \to [\phantom{1}8, 10, \phantom{1}5, \phantom{1}5, \phantom{0}1] \end{align*}$

Question:

What is the best way to find the best-matching job from a company? Two ideas crossed my mind, but I don’t know which would be better, or whether there is a completely different approach.

$\newcommand{\abs}{\operatorname{abs}}$
1. Calculate the sum of all absolute diffs.

   For example: $$\text{score} = \abs(4-8) + \abs(8-10) + \abs(12-5) + \abs(4-5) + \abs(10-1) = 23$$

2. Calculate the absolute diff between the sums of both data sets.

   For example: $$\text{score} = \abs\left[(4+8+12+4+10)-(8+10+5+5+1)\right] = 9$$
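As a minimal sketch of the two candidate scores (written in Python rather than PHP; the function names are hypothetical and chosen only for illustration), the following reproduces both example values. It also includes a made-up reordered profile to show how the second score behaves when only the totals agree.

```python
company = [4, 8, 12, 4, 10]
user = [8, 10, 5, 5, 1]


def sum_of_abs_diffs(a, b):
    """First idea: add up the per-position absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def abs_diff_of_sums(a, b):
    """Second idea: compare only the totals of the two data sets."""
    return abs(sum(a) - sum(b))


print(sum_of_abs_diffs(company, user))  # 23
print(abs_diff_of_sums(company, user))  # 9

# Hypothetical profile: the company's values in a different order.
shuffled = [10, 4, 12, 8, 4]
print(abs_diff_of_sums(company, shuffled))  # 0, although four positions differ
print(sum_of_abs_diffs(company, shuffled))  # 20
```

The second score depends only on the totals, so any two data sets with the same sum score 0 against each other, whichever positions differ. The first score keeps the per-position information.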

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2014-08-13 11:32:02

Answer:

One approach is to assign a weight to each skill and compute a weighted variance. In your example, the five skills could be weighted $w_1, w_2, w_3, w_4, w_5$; for the sake of example, let the weights be $0.2, 0.1, 0.3, 0.15, 0.25$.

Writing $x_i$ for the company's value and $y_i$ for the user's value of skill $i$, first find the weighted mean of the absolute differences, $M = \sum_{i=1}^{n} w_i\,|x_i - y_i|$.

Then find the weighted variance, $\sum_{i=1}^{n} w_i\left(|x_i - y_i| - M\right)^2$. In your example, $M = 5.5$.

The absolute differences are $d_1 = 4, d_2 = 2, d_3 = 7, d_4 = 1, d_5 = 9$.

The best-matched job would be the one with the smallest value of the following metric (the weighted variance): $$0.2\,(4-5.5)^2 + 0.1\,(2-5.5)^2 + 0.3\,(7-5.5)^2 + 0.15\,(1-5.5)^2 + 0.25\,(9-5.5)^2 = 8.45$$

I tried it with the following data sets:

Company = {3,4,5,8,6}, User = {3,4,5,8,6}: variance = 0 (no change)

Company = {4,4,5,8,6}, User = {3,4,5,8,6}: variance = 0.16 (one change)

Company = {4,3,5,8,6}, User = {3,4,5,8,6}: variance = 0.21 (two changes)

Kind of worked for these examples.
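The metric and the checks above can be reproduced with a short Python sketch. The function name `weighted_variance` is hypothetical, and the weights are the example weights 0.2, 0.1, 0.3, 0.15, 0.25, which are assumed to sum to 1.

```python
def weighted_variance(company, user, weights):
    """Weighted variance of the per-skill absolute differences."""
    diffs = [abs(c - u) for c, u in zip(company, user)]
    # M: weighted mean of the absolute differences
    mean = sum(w * d for w, d in zip(weights, diffs))
    return sum(w * (d - mean) ** 2 for w, d in zip(weights, diffs))


weights = [0.2, 0.1, 0.3, 0.15, 0.25]  # example weights, assumed to sum to 1

# Example from the question
print(round(weighted_variance([4, 8, 12, 4, 10], [8, 10, 5, 5, 1], weights), 2))  # 8.45

# The three test data sets
user = [3, 4, 5, 8, 6]
print(round(weighted_variance([3, 4, 5, 8, 6], user, weights), 2))  # 0.0
print(round(weighted_variance([4, 4, 5, 8, 6], user, weights), 2))  # 0.16
print(round(weighted_variance([4, 3, 5, 8, 6], user, weights), 2))  # 0.21
```

One property worth keeping in mind: because the weights sum to 1, this metric is 0 whenever all absolute differences are equal, not only when they are all zero, so it may be worth looking at $M$ alongside it.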

You could try this in your job and let me know if it worked to your satisfaction.


The table below compares weighted variance and cosine similarity in two different but similar situations. In the second table, the company and user ratings differ widely on task 1, which has weight 0.05, and the remaining tasks have the same rating. In the third table, the same wide difference occurs on task 4, which has weight 0.80, and the remaining tasks have the same rating. Cosine similarity treats both cases the same way, whereas weighted variance ranks the second situation as a better match than the third. Intuitively this is correct: in the second table the task that matters most shows no difference and the task that matters least shows the wide difference, while in the third table the task that matters most shows that same wide difference. Weighted variance therefore gives you additional flexibility in rank-ordering similar data sets.

(Image: comparison tables, not reproduced here.)
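Since the original tables are not available, here is a rough Python sketch of the same comparison using made-up ratings. Everything in it is an assumption for illustration: the ratings, the weights for tasks 2, 3 and 5, and the function names. Only the weights 0.05 for task 1 and 0.80 for task 4 come from the description above.

```python
import math


def weighted_variance(company, user, weights):
    """Weighted variance of the per-task absolute differences."""
    diffs = [abs(c - u) for c, u in zip(company, user)]
    mean = sum(w * d for w, d in zip(weights, diffs))
    return sum(w * (d - mean) ** 2 for w, d in zip(weights, diffs))


def cosine_similarity(a, b):
    """Unweighted cosine similarity of two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical data, not the values from the original tables.
company = [5, 5, 5, 5, 5]
weights = [0.05, 0.05, 0.05, 0.80, 0.05]
user_low = [10, 5, 5, 5, 5]   # wide difference on task 1 (weight 0.05)
user_high = [5, 5, 5, 10, 5]  # same difference on task 4 (weight 0.80)

# Cosine similarity gives the same value in both situations.
print(cosine_similarity(company, user_low))
print(cosine_similarity(company, user_high))

# Weighted variance is smaller when the difference falls on the low-weight task.
print(weighted_variance(company, user_low, weights))
print(weighted_variance(company, user_high, weights))
```

With these inputs the two cosine similarities coincide, while the weighted variance is lower for the low-weight difference, which matches the ranking described above.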

Thanks

Satish

**Bumbble Comm** · Answer 2 · 2014-08-13 14:00:09

There are probably many ways to achieve a meaningful solution for this problem. I used the cosine similarity.

With the example above, I get the following calculation:

$$\frac{4\cdot 8 + 8\cdot 10 + 12\cdot 5 + 4\cdot 5 + 10\cdot 1}{\sqrt{4^2 + 8^2 + 12^2 + 4^2 + 10^2} \times \sqrt{8^2 + 10^2 + 5^2 + 5^2 + 1^2}} \approx 0.74712398867$$
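The same calculation as a small Python sketch (the function name is hypothetical). Unlike the difference-based scores, a higher value means a closer match here.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


company = [4, 8, 12, 4, 10]
user = [8, 10, 5, 5, 1]
print(round(cosine_similarity(company, user), 4))  # 0.7471

# Cosine similarity compares direction only: a hypothetical user with exactly
# half of every company value still scores (approximately) 1.0.
half = [2, 4, 6, 2, 5]
print(math.isclose(cosine_similarity(company, half), 1.0))  # True
```

Because the measure ignores magnitude, a user whose skills are uniformly lower than the requirements can still look like a perfect match, which may or may not be what you want for job matching.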
