Inferring ordering of elements of parameter from responses

Consider an agent with reward $r(x,q) = x^\top q$, where the parameter $q$ is unknown but is drawn from a known prior $p_0$. The agent is presented with pairs $(x,y)$ and chooses $c=x$ with a known probability $p_{xyq}$ (and $c=y$ with probability $1-p_{xyq}$).

I am interested in maintaining a belief over the relative magnitudes of the elements of $q$. That is, if $q=(q_1,q_2,q_3)$, then the belief is a probability distribution over the set of all total orderings $\mathcal{O}=\{(1,2,3), (1,3,2), \ldots, (3,2,1)\}$, where $(1,2,3)$ represents the case $q_1>q_2>q_3$. Assume that no two elements of $q$ are equal.

Given a set of data $\{(x_i,y_i,c_i)\}$, is it possible to use Bayes' rule to form a probability distribution on $\mathcal{O}$?
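
To make the question concrete, here is a rough sketch of the kind of computation I have in mind. The belief on an ordering $o$ would be the posterior mass of the region of $q$-space consistent with $o$, i.e. $P(o \mid \text{data}) \propto \int \mathbf{1}[\operatorname{ord}(q)=o] \prod_i P(c_i \mid x_i,y_i,q)\, p_0(q)\, dq$, estimated by sampling $q$ from the prior and weighting by the likelihood. Everything specific here is an assumption for illustration and not part of the setup above: the prior $p_0$ is taken to be standard normal, $p_{xyq}$ is taken to be logistic in $(x-y)^\top q$, and the function name `ordering_posterior` is made up.

```python
import itertools
import numpy as np

def ordering_posterior(X, Y, C, n_samples=20_000, seed=0):
    """Importance-sampling estimate of P(ordering | data).

    X, Y : (n, d) arrays holding the presented pairs (x_i, y_i).
    C    : (n,) array, 1 if the agent chose x_i and 0 if it chose y_i.

    ASSUMPTIONS (illustrative only): prior p_0 = N(0, I) and
    choice probability p_xyq = sigmoid((x - y)^T q).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]

    # Draw candidate parameters from the assumed prior.
    Q = rng.standard_normal((n_samples, d))

    # logits[s, i] = (x_i - y_i)^T q_s
    logits = Q @ (X - Y).T

    # log P(c_i | q) = log sigmoid(+logit) if c_i = 1, log sigmoid(-logit) if c_i = 0
    signed = np.where(C == 1, logits, -logits)
    loglik = -np.logaddexp(0.0, -signed).sum(axis=1)

    # Self-normalised importance weights (subtract the max for stability).
    w = np.exp(loglik - loglik.max())
    w /= w.sum()

    # Ordering of each sample: 1-based indices sorted by decreasing q,
    # so (1, 2, 3) means q_1 > q_2 > q_3.
    orders = np.argsort(-Q, axis=1) + 1

    post = {o: 0.0 for o in itertools.permutations(range(1, d + 1))}
    for o, wi in zip(map(tuple, orders.tolist()), w):
        post[o] += float(wi)
    return post
```

Calling `ordering_posterior(X, Y, C)` returns a dictionary mapping each element of $\mathcal{O}$ to its estimated posterior probability. Because the samples come from the prior, the estimate degrades when the data are very informative; a sampler that targets the posterior on $q$ directly would probably be needed in that case.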