Edit: There were 2 important logic errors in the code below. They have been fixed!
update: I still don't have an answer to this question, but I recently made a massive improvement to my current program (found at):
This code requires java, and uses multithreading to sort through all possible combinations faster. With a decent CPU having 4 processors it can run $2^{24}$ combinations in 5 seconds. I will be happy to explain its various functions if anyone asks.
The solution to this question would allow efficient elimination of missing data (to check for the significant of their missing data through various imputation methods) and simultaneously allow an individual to check if their data is missing at random.
I recently completed a post on SE (Maximization of a ratio) with my own answer; while I do not know if it is correct, this post plays an important part for a separate question. Specifically, given a random N x M matrix with N $\le$ M - where each entry $a_{i,j}$ is randomly assigned the value of 1 or 2 - what is the maximum number of rows (n $\le N)$ with probability P that a computer program must check every combination of in either the original matrix OR its transpose (n $\le$ M) to find the largest (N - a) x (M - b) matrix containing only 1s.
Here's an example:
1 2 1 2 2 2 1 2
1 2 1 2 2 1 1 2
2 1 2 1 2 2 2 2
2 2 1 2 2 2 1 2
Given this matrix, there are two solutions (deleting rows 3, 4 and columns 2, 4, 5, 6, 8 to yield a 2 x 3 matrix OR deleting row 3 and columns 1, 2, 4, 5, 6, 8 to yield a 3 x 2 matrix) which are:
1 1
1 1
1 1
and
1 1 1
1 1 1
As one can see, although there are 4 rows, I need only check combinations of up to 3. This is not a huge saving, but imagine a 200 x 200 matrix. It becomes very important to know after how many rows (and whether I should use the original matrix OR its transpose) I should stop checking combinations based on the number of 1s and their distribution in this 200 x 200 matrix.
The solution above (Maximization of a ratio & if correct) is concerned only with the prospect that a 1 is only adjacent to a 2 if it's on the perimeter of a block containing exclusively 1s.
i.e.,
1 1 1 2 2 2 2 2
1 1 1 2 2 2 2 2
1 1 1 2 2 2 2 2
1 1 2 2 2 2 2 2
x = 11, N = 4, $d_B$ = 3 and B = 3
Find B, $d_B$ $\in$ N with B $\le$ M such that B*$d_B$ $\le$ x, B*($d_B$+1) $\gt$ x, $d_B \le$ N and $d_{B-1} \gt$ N are all true where x equals the total number of 1's in the N x M matrix. While it works in the above example, there will be cases where it fails miserably unless I also check every combination of n rows or fewer in the transpose of the matrix in question.
i.e.,
1 1 2 1 2 2 2 2
1 1 2 2 1 2 2 2
1 1 2 2 2 1 2 2
1 1 2 2 2 2 2 2
However, checking both the original matrix and it's transpose (depending on the # rows) is likely to be less efficient if N $\ne$ M than just checking all combinations of N rows. How can I perfect my solution to (Maximization of a ratio) to work for a two dimensional matrix, and how can I decide whether the original matrix is more efficient?