Let $v$ and $w$ be 2D images. I need to find $v$ such that
$$\text{minimize} \quad \sum \frac{r_1}{2} (v - w)^2 + \sum \frac{r_2}{2} \|\nabla v - q\|^2 $$
where $r_1, r_2 > 0$ are scalars and the sums run over all pixels. The second term is the sum over pixels of the squared $L_2$ norm of the difference between the gradient of $v$ and a given vector field $q$.
The Euler–Lagrange equation is then
$$(-r_2 \Delta + r_1)\,v = -r_2\, \nabla \cdot q + r_1 w,$$
which I solve with a discrete Fourier method.
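For completeness, the variational step: differentiating the objective with respect to $v$ at each pixel (the data term pointwise, the gradient term via integration by parts with periodic boundary conditions) gives
$$r_1 (v - w) - r_2\, \nabla \cdot (\nabla v - q) = 0,$$
which rearranges to the screened Poisson equation above.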
I've implemented it in Python as follows (I got the theory from https://math.mit.edu/~stevenj/fft-deriv.pdf). The idea is that the right-hand side of the equation is known, so I just take its FFT; however, I take advantage of spectral interpolation to compute the divergence of $q$ instead of using finite differences. Then I divide by the screened Laplacian operator in the Fourier domain. (Normally one cannot divide at the zero frequency, but thanks to the $r_1$ offset the denominator is nonzero everywhere, so I simply divide there as well rather than replacing the zero frequency with an initial condition.)
import numpy as np

L2, L1 = w.shape  # L1, L2 odd are the image dimensions (L1 cols, L2 rows)
# fftshifted integer wavenumbers; the zero frequency is kept since the
# r1 offset makes the denominator below nonzero everywhere
kcx = np.fft.fftshift(np.fft.fftfreq(L1) * L1)
krx = np.fft.fftshift(np.fft.fftfreq(L2) * L2)
kc, kr = np.meshgrid(kcx, krx)
laplacian = -((((2*np.pi)/L1)*kc)**2 + (((2*np.pi)/L2)*kr)**2)
QR = np.fft.fftshift(np.fft.fft2(qr))  # finite diff along rows
QC = np.fft.fftshift(np.fft.fft2(qc))  # finite diff along cols
W = np.fft.fftshift(np.fft.fft2(w))
# spectral divergence of q
divergence = ((2*np.pi*1j*kc)/L1)*QC + ((2*np.pi*1j*kr)/L2)*QR
RS = -r2*divergence + r1*W
V = RS/(r1 - r2*laplacian)  # solve the equation in the Fourier domain
# recover v
v = np.real(np.fft.ifft2(np.fft.ifftshift(V)))
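As a sanity check of the solver itself, here is a self-contained sketch (with a made-up periodic test image and unshifted `fftfreq` wavenumbers instead of my shifted ones): if $q$ is computed as the *spectral* gradient of $u$, then with $w = u$ the solver recovers $u$ exactly for any $r_1, r_2 > 0$.

```python
import numpy as np

# Self-contained sanity check (made-up periodic test image, unshifted
# fftfreq wavenumbers): if q is the *spectral* gradient of u, then with
# w = u the solver recovers u exactly for any r1, r2 > 0.
L = 64
ii, jj = np.meshgrid(np.arange(L), np.arange(L), indexing="ij")
u = np.sin(2 * np.pi * jj / L) * np.cos(4 * np.pi * ii / L)

r1, r2 = 1.0, 10.0
k = np.fft.fftfreq(L) * L                  # integer wavenumbers, unshifted
K0, K1 = np.meshgrid(k, k, indexing="ij")  # axis-0 and axis-1 frequencies
d0 = 2j * np.pi * K0 / L                   # spectral derivative along axis 0
d1 = 2j * np.pi * K1 / L                   # spectral derivative along axis 1

U = np.fft.fft2(u)
Q0, Q1 = d0 * U, d1 * U                    # spectral gradient of u
divergence = d0 * Q0 + d1 * Q1             # spectral divergence of q
laplacian = d0**2 + d1**2                  # real part non-positive

V = (-r2 * divergence + r1 * U) / (r1 - r2 * laplacian)
v = np.real(np.fft.ifft2(V))
print(np.max(np.abs(v - u)))               # ~ machine precision
```

So the solver is exact when the gradient and the divergence use the same (spectral) discretization.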
But the result is unsatisfactory.
For example, if $u$ is given by:

and I choose $w = u$ and $q = \nabla u$ (finite differences along rows and cols to get qc and qr), the result is this:

I was expecting something closer to $u$. This step is part of a larger optimization algorithm, and the final result seems corrupted by artifacts arising here, so I wonder if I am doing something wrong.
Thank you in advance for any help.
EDIT: I've noticed there is a scaling problem between the divergence term and the Laplacian, but I don't understand why theoretically.
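To make the scaling discrepancy concrete, here is a 1-D comparison of the Fourier symbol of a periodic forward difference (assuming that is how $q$ is built; the exact stencil may differ) with the spectral derivative symbol used for the divergence; the signal length 64 is arbitrary:

```python
import numpy as np

# 1-D sketch: symbol of a periodic forward difference vs. the exact
# spectral derivative symbol (assumed stencil, arbitrary length).
L = 64
k = np.fft.fftfreq(L) * L
omega = 2 * np.pi * k / L

fd_symbol = np.exp(1j * omega) - 1   # symbol of u[i+1] - u[i] (periodic)
spectral_symbol = 1j * omega         # symbol of the exact spectral derivative

# The two agree at low frequencies but diverge near the Nyquist frequency,
# where the magnitude ratio |fd| / |spectral| drops to 2/pi:
ratio = np.abs(fd_symbol[1:]) / np.abs(spectral_symbol[1:])
print(ratio.min(), ratio.max())
```

So the finite-difference gradient and the spectral divergence apply different effective derivative operators, which may be related to the scaling I observe.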