I'd like to solve the following minimization problem
$$\min_{X_1,X_2} \mbox{nzc} (A+B_1X_1+B_2X_2)$$
where $\mbox{nzc} (D)$ denotes the number of non-zero columns in $D$, and where $X_i, A, B_i$ are matrices of appropriately chosen dimensions. Note that $\mbox{nzc} (D)$ is not a norm; rather, it is similar to the $\ell_0$ "norm" (which one can read about here), which is also not a norm.
How should I approach this? Is there a standard technique for solving optimization problems of this form?
Observe that $$ \mbox{nzc}(A+B_1X_1+B_2X_2) = \mbox{nzr}\big((A+B_1X_1+B_2X_2)^\intercal\big) $$ where $\mbox{nzc}$ and $\mbox{nzr}$ denote the number of non-zero columns and rows, respectively. Furthermore, the $\ell^{p,q}$ mixed matrix norm of an $m \times n$ matrix $M$, for $1\leq p,q<\infty$, is given by $$ \|M\|_{p,q} = \bigg(\sum_{i=1}^m\bigg(\sum_{j=1}^n |M_{ij}|^q\bigg)^{\frac{p}{q}}\bigg)^{1/p}, $$ i.e. the $\ell^p$ norm of the vector of $\ell^q$ norms of the rows. (Conventions for which index acts on rows vary; here the inner exponent $q$ acts along each row.) Extending this by convention to $p=0$, $$ \|M\|_{0,q} = |\{\, i \;|\; \|M_i\|_q \neq 0 \,\}| $$ where $M_i$ is the $i$-th row of $M$. Now recall that $$ \|M_i\|_q = 0 \iff M_i = 0. $$ Hence, $\|M\|_{0,q}$ counts the number of non-zero rows of $M$. Therefore, $$ \min_{X_1,X_2}\mbox{nzc}(A+B_1X_1+B_2X_2) = \min_{X_1,X_2}\big\|(A+B_1X_1+B_2X_2)^\intercal\big\|_{0,q}. $$
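As a quick numerical sanity check of this identity, one can compute both sides directly (a small NumPy sketch; the function names and the example matrix are my own, illustrative choices):

```python
import numpy as np

def nzc(M):
    # number of non-zero columns of M
    return int(np.sum(np.linalg.norm(M, axis=0) != 0))

def nzr(M):
    # number of non-zero rows of M
    return int(np.sum(np.linalg.norm(M, axis=1) != 0))

def mixed_norm_0q(M, q=2):
    # ||M||_{0,q}: number of rows with non-zero l^q norm
    return int(np.sum(np.linalg.norm(M, ord=q, axis=1) != 0))

rng = np.random.default_rng(0)
D = rng.standard_normal((4, 5))
D[:, [1, 3]] = 0.0  # zero out two of the five columns

assert nzc(D) == 3
assert nzc(D) == nzr(D.T)
assert nzc(D) == mixed_norm_0q(D.T, q=2)
```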
If you want to solve $\ell^0$ optimization problems, it is best to look at resources from the field of Compressed Sensing. One common technique there is to replace the $\ell^0$ "norm" with the $\ell^1$ norm. This is because $\ell^1$ is the convex envelope of $\ell^0$ on the unit $\ell^\infty$ ball, and minimizers of $\ell^1$ tend to be sparse, just like those of $\ell^0$. In this case that means solving $$ \min_{X_1,X_2}\big\|(A+B_1X_1+B_2X_2)^\intercal\big\|_{1,q} $$ instead, i.e. minimizing the sum of the $\ell^q$ norms of the columns of $A+B_1X_1+B_2X_2$ (typically with $q=2$, a group-lasso-type penalty). Looking into algorithms such as Bregman iterations, (iterative) thresholding methods, or ADMM should give you something that can solve this efficiently.
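The key building block in the thresholding and ADMM schemes mentioned above, when applied to such column-sparsity penalties, is the proximal operator of $\lambda \sum_j \|M_{\cdot j}\|_2$, which acts by group soft-thresholding: each column is shrunk toward the origin and set exactly to zero once its norm falls below the threshold $\lambda$. A minimal sketch (the function name and the test matrix are my own, illustrative choices, not part of any particular library):

```python
import numpy as np

def prox_column_l2(M, lam):
    """Proximal operator of lam * sum_j ||M[:, j]||_2
    (group soft-thresholding, applied column-wise)."""
    norms = np.linalg.norm(M, axis=0)
    # per-column shrink factor: max(0, 1 - lam / ||column||_2);
    # the small epsilon guards against division by zero
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return M * scale  # broadcasts the factor over each column

M = np.array([[3.0, 0.1],
              [4.0, 0.1]])
P = prox_column_l2(M, lam=1.0)
# the large column (norm 5) is shrunk by the factor 0.8,
# the small column (norm ~0.14 < 1) is set exactly to zero
```

Iterating this prox step against a gradient (or ADMM splitting) step on the affine constraint set $\{A+B_1X_1+B_2X_2\}$ yields exactly the kind of algorithm the compressed-sensing literature describes.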