Ranking objects using least squares

54 Views Asked by At

I need to develop an application to automatically rank objects. This is the use case: I have a set of objects, all of which have the same set of properties. For example, a set of cars, all of which have weight, horsepower, tank volume etc. I want to assign different weights to these properties, but initially all I know is the order in which the users want these objects to be in. We can assume that this initial set is representative of the whole domain.

My idea is to store the data in a matrix $A$, in which the rows represent the objects and the columns represent the properties. The rows are in ascending order. Then I create a vector $b = (1, 2, \ldots, n)$ and solve the equation $Aw = b$ by least squares, where $w$ is the vector of weights. Once I have these weights, I can add or remove objects from my set, and I can find the new ranking by just computing the linear combination for each object and then sorting them. In other words, I compute a new $b$ every time there is a change in the set.

Then comes the second requirement: users may want to reorder the objects at any time. To implement this, I'm thinking about permuting the rows of $A$ according to the new order provided by the users, but keeping $b$ as it is, and recomputing $w$.

Does this idea make sense?

1

There are 1 best solutions below

2
On BEST ANSWER

Another alternative. First, to reiterate what I say below: let $a_i$ denote the $i$th row of $A$ as a column vector, i.e. the $i$th column of $A^T$. A weight vector $w$ will rank row $a_j$ as higher than $a_i$ if and only if $(a_j - a_i)^Tw > 0$.

Now, let $B$ denote the matrix $M$ whose rows are all vectors of the form $(a_j - a_i)^T$ for which $j > i$. We're looking for a vector $w$ such that, ideally, every entry of $Bw$ is positve. Note that such a $w$ specifies a hyperplane that separates the columns of $B$ from the vector $0$. By the hyperplane separation theorem, this will be possible if and only if the vector $0$ is not an element of the convex hull of the rows of $B$.

With that established, we can definitively determine whether such a $w$ exists by solving the convex optimization problem $$ \min \|B v\|^2 \quad \text{subject to}\quad v_1 + \cdots + v_n = 0 \text{ and }v_i \geq 0 \quad i = 1,\dots,m. $$ If this minimum is non-zero and $v_*$ is the vector $v$ that minimizes $\|Bv\|^2$, then taking $w = Bv_*$ will give us weights that produce the desired ranking. If the minimum is $0$, then no such vector exists.


Here is an alternative method to consider.

Suppose that $A$ has size $m \times n$, and that the rows of $A$ are sorted according to the desired ranking. Set $w$ equal to the vector $w = A^Tv$, where $$ v = (1-m, 3-m,\dots,m-3,m-1)^T. $$ You could normalize the weights however you like without affecting the resulting ranking. For instance, you could divide each entry of $w$ by $\sum_{k=1}^m |w_k|$.


The justification is as follows: let $a_i$ denote the $i$th row of $A$ as a column vector, i.e. the $i$th column of $A^T$. A weight vector $w$ will rank row $a_j$ as higher than $a_i$ if and only if $w^T(a_j - a_i) > 0$.

Let $S$ denote the set of all vectors of the form $a_j - a_i$ for which $j > i$. We're looking for a vector $w$ such that, ideally, $w^Tx > 0$ for every $x \in S$. My conjecture is that a good candidate for such a vector (if one does exist) is the sum $$ w = \sum_{x \in S} x. $$ This is the vector $w = A^Tv$ that I describe above.