Given a block-diagonal matrix $\mathbf{D} \in \mathbb{R}^{m \times n}$ ($m > n$), $\nabla: \mathbb{R}^n \rightarrow \mathbb{R}^n$ (assuming periodic boundary condition), and $\mathbf{b} \in \mathbb{R}^m$, how to solve for $\mathbf{x} \in \mathbb{R}^n$ efficiently (I mean, fast convergence and parallelizable)?
$\underset{\mathbf{x}}{\text{min}} \quad \| \mathbf{D\nabla x - b}\|^2_2$.
My current thought is to solve the normal equation using conjugate gradient:
$ \nabla^{\text{T}}\mathbf{D}^{\text{T}}\mathbf{D}\nabla \mathbf{x} = \nabla^{\text{T}}\mathbf{D}^{\text{T}}\mathbf{b}$.
$\mathbf{D}^{\text{T}}\mathbf{D}$ will still be a block-diagonal matrix, but $\nabla$ operator cannot be efficiently evaluated in Fourier domain any more. And the problem structure is broken.
Is there any better numerical algorithms for this special problem?