Given 500 parts and a list of orders, pick 50 parts to maximize the number of fulfillable orders

134 Views Asked by At

I'm going to start with a proclamation that this kind of optimization is new to me, so don't fault me for setting up the problem in a weird way. Please let me know if this is unclear.

In a warehouse, a manufacturer has 500 parts. Some of the parts are retrieved individually, but many orders require combinations of parts. We want to put the 50 "most important" parts in the first row--by picking the right parts, we will maximize the number of orders that can be fulfilled with one visit to the first row.

There are $n$ = 500 candidate parts. The 50 we choose can be represented in the $n$ X 1 vector $P$. E.g.,

$\begin{bmatrix} 1 & 1 & 0 & 1 &1&1&0&1&0& ...&0 \end{bmatrix}$, such that $\sum_1^n{P_i} = 50$.

Each order can be represented by a row in an $m$ X $n$ matrix, where $m$ = 5000, the number of orders. Let's call it $O$. It looks something like this:

$\begin{bmatrix} 0 & 1 & 1&1&0&0&0&0&0&... \\ 1 & 1 & 1&0&1&0&0&0&0&.. . \\ 1 & 0 & 1&0&1&0&0&0&0&... \\ ...&...&...&...&...&...&...&...&...&... \end{bmatrix}$

So, we want to maximize $OP'$. Kind of. It's more like we want to multiply each row in $O$ by the column vector $P$, and iff $PO_m = \sum{O_m}$, then a row "counts". Put another way, a row (an order) in $O$ counts toward our sum if each of its component parts can be found in the part vector $P$.

What is a good way to find the best part vector?

1

There are 1 best solutions below

0
On

The problem is very likely difficult (expensive) to solve in generality. Without data it is hard to predict how difficult a "real world" instance might be.

I'm not sure what search terms would give us literature references. The best I came up with was "clustering" of binary vectors, since the orders that can filled by the fifty bins in the front row are in that sense more closely related to each other than to orders that cannot be filled there. One research paper about this is A General Model for Clustering Binary Data by Tao Li (2005).

To visualize the problem (and solution), consider that we are asking for row and column permutations of the $m\times n$ binary matrix $O$ yielding an equivalent block triangular form:

$$ \begin{bmatrix} A & 0 \\ E & F \end{bmatrix} $$

Here block $A$ has (say) $50$ columns and the maximum possible number of rows subject to the restriction that all entries outside those columns are zeros. The columns that define $A$ correspond to the unit entries of the "part vector" $P$ designated in the Question.

We propose a recursive, backtracking method for exhaustive (exact) searching. Each stage $s=1,\ldots$ includes or excludes at least one order (row of $O$) and depends on backtracking to assure the optimal choice of parts (columns).

Intially we associate with the binary matrix $O$ an empty set $\mathscr{R}_0 = \{\;\}$ of rows and an empty set $\mathscr{C}_0 = \{\;\}$ of columns. We recursively build up the row sets $\mathscr{R}_s$ and column sets $\mathscr{C}_s$ through a succession of "choice points" at which an available row is either included or excluded, revisiting these choices until the search space has been exhausted. Let $k=50$ be a problem parameter that limits how many columns we allow in our columns sets $\mathscr{C}_s$.

Let $A_s$ denote the submatrix of $O$ defined by rows $\mathscr{R}_s$ and columns $\mathscr{C}_s$. The corresponding submatrices $E_s$ and $F_s$ are defined by comparison with the block triangular form visualized above. For convenience we say $F_0 = O$.

At the beginning of stage $s \ge 1$, identify the original row index $i_s$ of $O$ that corresponds to the lexicographically greatest row in $F_{s-1}$ among those having the largest number of nonzero entries so that these columns could be adjoined to $\mathscr{C}_{s-1}$ without exceeding the limit $k$. As a practical matter, the rows of $F_{s-1}$ whose nonzero entries would increase those columns beyond limit $k$ can be dropped (omitted from $F_{s-1}$) without affecting the ongoing nested levels of recursion.

If no such row exists, we have reached the end of recursion. Taking note of the number of rows $|\mathscr{R}_s|$ in $A_s$ attained, we then backtrack to the most recent choice point and resume (if possible). The final solution will correspond to any $A_s$ that attained the maximum row size $|\mathscr{R}_s|$.

Given an original row index $i_s$ as above, we define a choice point either including that row in the solution or excluding it.

If row $i_s$ is included, then we define:

$$ \mathscr{C}_s = \mathscr{C}_{s-1} \cup $\{ j | O(i_s,j) = 1 \} $$

which by construction above satisfies $|\mathscr{C}_s| \le k$. We also define:

$$ \mathscr{R}_s = \{ i | O(i,j) = 1 \implies j \in \mathscr{C}_s \} $$

Note that by construction $i_s \in \mathscr{R}_s$ and, because $\mathscr{C}_{s-1} \subset \mathscr{C}_s$, also $\mathscr{R}_{s-1} \subset \mathscr{R}_s$. It is possible $\mathscr{R}_s$ may be larger than just $\mathscr{R}_{s-1} \cup \{i_s\}$.

If instead the row $i_s$ is excluded, then we remove it from $F_{s-1}$ and any other identical rows of $F_{s-1}$, thereby guaranteeing that none of these will ever become accepted into the solution. If another choice of $i_s$ is possible from the reduced rows of $F_{s-1}$, consistent as before with the expansion of $\mathscr{C}_{s-1}$ to $\mathscr{C}_s$ staying below the limit $k$ of columns, this alternative is considered in turn for the purpose of recursion.

Eventually for a given stage $s$ all the possible rows $i_s$ will have been considered, and from there we backtrack to alternatives, if any, remaining at stage $s-1$.