I am working on an application where the following problem has come up:
Given $N$ items and penalty matrix $A_{ij} \in \mathbb{R}^{N \times N}$ where
$a_{ij} \in A_{ij}$
is the penalty for grouping item $i$ and item $j$ together.
The objective is to group the items in $k$ different groups such that the total penalty is minimized. I have formulated this as an optimization problem with binary variables
$y_{ij} = 1$ if item $i$ and item $j$ are grouped together
and objective function
$\min. z = \sum_{ij} A_{ij} y_{ij}$
The problem is that $A$ is very large and dense with the order of 900 million entries ($N$ = 30,000). The double index-variables (and associated constraints) make my algorithm run out of memory.
Is there a known OR/clustering problem similar to this, with an algorithm that runs in polynomial time? Or am I just using a poor formulation?
Thanks.