I have two list of numbers of size N. For example, say N=6, x=[1, 3, 5, 10, 15, 20] and y=[1.5, 3.5, 4.7, 9, 12, 18]. I would like to divide these two lists into n bins (say n=3 here) in a way that numbers with same index in x and y end up in the same bins.
For examples, if I use 4 and 11 as the dividing points, I will get x=[[1,3], [5, 10], [15, 20]] and y=[[1.5, 3.5], [4.7, 9], [12, 18]] which satisfy my requirement.
In some cases, that may not be possible, so I would like to minimize the number of misses.
Is there a way to formulate this using some sort of optimization problem?
Assume that all $x$ and $y$ values are in the range $[m, M]$. Let $z_1,\dots,z_{n-1}\in [m,M]$ be real variables providing the dividing values for the bins, and for notational convenience let $z_0=m-1$ and $z_N=M$. Introduce binary variables $u_{ij}$, $v_{ij}$ and $w_i$ for $i=1,\dots, N$ and $j=1,\dots,n$. $u$ and $v$ will signal which interval each $x$ and $y$ value (respectively) falls into, and $w_i$ will be 1 whenever $x_i$ and $y_i$ land in different bins. The objective is to minimize $\sum_i w_i$.
We'll need a small positive constant $\epsilon$ to enforce strict inequalities. (In case an $x$ or $y$ value falls on one of the dividing points, we'll arbitrarily assign it to the lower interval.) The constraints are as follows: \begin{align*} \sum_j u_{ij} & = 1 \quad \forall i\\ \sum_j v_{ij} & = 1 \quad \forall i\\ \sum_i u_{ij} & \ge 1 \quad \forall j\\ \sum_i v_{ij} & \ge 1 \quad \forall j\\ x_{i} & \le z_{j}+(M-m)(1-u_{ij}) \quad \forall i,j\\ x_{i} & \ge z_{j-1}+\epsilon-(M-m)(1-u_{ij}) \quad \forall i,j\\ y_{i} & \le z_{j}+(M-m)(1-v_{ij}) \quad \forall i,j\\ y_{i} & \ge z_{j-1}+\epsilon-(M-m)(1-v_{ij}) \quad \forall i,j\\ w_{i} & \ge u_{ij}-v_{ij} \quad \forall i,j\\ w_{i} & \ge v_{ij}-u_{ij} \quad \forall i,j. \end{align*} Constraints 1 and 2 force each $x$ and $y$ to be assigned to exactly one bin. Constraints 3 and 4 force each bin to contain at least one $x$ and one $y$. Constraints 5 and 6 set $u_{ij}=1$ if and only if $z_{j-1} < x_i \le z_j$, and constraints 7 and 8 do the same for $v$ and $y$. Constraints 9 and 10 force a mismatch penalty if $x$ and $y$ in the same position land in different bins.