Currently, I am minimizing the quadratic objective $\|\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{d} -\mathbf{c} \|^2$ using CVX, as follows
cvx_begin
    variable xx(num_triangles*3, num_triangles*3)
    minimize( norm( A * xx * B * d - c ) )
cvx_end
However, $\mathbf{X}$ is a very large matrix (about $50{,}000 \times 50{,}000$, far too big to declare densely). The good news is that $\mathbf{X}$ is block diagonal (built from $3 \times 3$ transformation matrices) and therefore highly sparse.
To give an example, for $\mathbf{X} \in \mathbb{R}^{9 \times 9}$ I would be looking for a matrix of the following form:
| X1   X4   X7   0    0    0    0    0    0   |
| X2   X5   X8   0    0    0    0    0    0   |
| X3   X6   X9   0    0    0    0    0    0   |
| 0    0    0    X10  X13  X16  0    0    0   |
| 0    0    0    X11  X14  X17  0    0    0   |
| 0    0    0    X12  X15  X18  0    0    0   |
| 0    0    0    0    0    0    X19  X22  X25 |
| 0    0    0    0    0    0    X20  X23  X26 |
| 0    0    0    0    0    0    X21  X24  X27 |
so in fact I am solving for only $27$ variables, not $9 \times 9 = 81$. At full size the imbalance is far worse: for $N$ triangles there are only $9N$ unknowns but $(3N)^2$ matrix entries.
But I don't know how to enforce this structure when formulating the minimization problem. If I declare an ordinary square matrix of the required size, memory usage explodes. Any ideas on how to reformulate the problem so that I solve only for the actual unknowns? I am working in MATLAB.
Consider for simplicity the case where $\mathbf{X}$ is of the form
$$ \mathbf{X} = \left[ \begin{matrix} \mathbf{X}_{1,1} & \mathbf{0} \\ \mathbf{0} & \mathbf{X}_{2,2} \\ \end{matrix} \right]. $$
Writing $\mathbf{A}$ as a block matrix
$$ \mathbf{A} = \left[\begin{matrix} \mathbf{A}_{1,1} & \mathbf{A}_{1,2} \\ \mathbf{A}_{2,1} & \mathbf{A}_{2,2} \\ \end{matrix} \right], $$
the product $\mathbf{AX}$ can be written as
$$ \mathbf{AX}=\left[ \begin{matrix} \mathbf{A}_{1,1}\mathbf{X}_{1,1} & \mathbf{A}_{1,2}\mathbf{X}_{2,2} \\ \mathbf{A}_{2,1}\mathbf{X}_{1,1} & \mathbf{A}_{2,2}\mathbf{X}_{2,2} \\ \end{matrix} \right]. $$
This approach generalizes to $\mathbf{X}$ having an arbitrary number of diagonal blocks: each column block of $\mathbf{A}$ is multiplied only by its own $\mathbf{X}_{i,i}$.
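The block identity is easy to check numerically. A minimal NumPy sketch (sizes chosen arbitrarily for illustration, not taken from the question):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
X1 = rng.standard_normal((3, 3))   # first diagonal block of X
X2 = rng.standard_normal((3, 3))   # second diagonal block of X

# Full product with an explicitly assembled block-diagonal X ...
X = np.zeros((6, 6))
X[:3, :3], X[3:, 3:] = X1, X2
full = A @ X

# ... equals multiplying each column block of A by its own block X_i
blockwise = np.hstack([A[:, :3] @ X1, A[:, 3:] @ X2])

assert np.allclose(full, blockwise)
```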
Your best bet is to declare a matrix variable for each $\mathbf{X}_{i,i}$ and perform the multiplication $\mathbf{AX}$ manually, as shown above (followed by right multiplication with $\mathbf{Bd}$). This eliminates the need to declare a full-size matrix variable for $\mathbf{X}$ and saves the computations corresponding to multiplications by zeros. However, as the matrix sizes you are considering are indeed very large, this approach may also fail.
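Since $\mathbf{B}$ and $\mathbf{d}$ are constants, $\mathbf{v} = \mathbf{Bd}$ is a fixed vector, and the reduction can even be carried out without any modeling tool: using the identity $\mathbf{A}_i \mathbf{X}_i \mathbf{v}_i = (\mathbf{v}_i^\top \otimes \mathbf{A}_i)\,\mathrm{vec}(\mathbf{X}_i)$, the problem becomes an ordinary least-squares in just the block unknowns. A NumPy sketch of this reduction (not CVX; all sizes and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

k, b = 3, 3                    # block size and number of diagonal blocks
n = k * b                      # full dimension of X (n x n)
m = 5                          # number of rows of A

A = rng.standard_normal((m, n))
c = rng.standard_normal(m)
v = rng.standard_normal(n)     # v = B @ d, precomputed once

# Build the reduced design matrix M, one Kronecker block per X_i:
#   A_i X_i v_i = kron(v_i^T, A_i) vec(X_i)   (column-stacking vec)
blocks = []
for i in range(b):
    idx = slice(i * k, (i + 1) * k)
    blocks.append(np.kron(v[idx].reshape(1, -1), A[:, idx]))
M = np.hstack(blocks)          # m x (b * k * k), i.e. only 27 columns here

# Solve the small least-squares problem in the 27 unknowns
x, *_ = np.linalg.lstsq(M, c, rcond=None)

# Reassemble the block-diagonal X and check against the full product
X = np.zeros((n, n))
for i in range(b):
    idx = slice(i * k, (i + 1) * k)
    X[idx, idx] = x[i * k * k:(i + 1) * k * k].reshape(k, k, order='F')

assert np.allclose(A @ X @ v, M @ x)
```

The same idea carries over to CVX: declare one small variable per block and assemble the objective from the per-block products, so the solver never sees the full matrix.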
A more convenient approach might be to exploit CVX features for block-diagonal and/or sparse matrix variables; however, I am not familiar enough with the package to say which exist. You could also try asking on Computational Science SE or Stack Overflow.