How to minimize sum of matrix-convolutions?


Given $A$, what should $B$ be so that

$\lVert I \circledast A - I \circledast B \rVert _2$

is minimal for any $I$?

  • $I \in \mathbb{R}^{20\times 20}$, $A \in \mathbb{R}^{5\times 5}$, $B \in \mathbb{R}^{3\times 3}$. Note that $B$ is smaller than $A$.
  • $I \circledast K$ is the convolution of $I$ with kernel $K$. The input is zero-padded so that the result matches the shape of $I$; that is, $(I \circledast K) \in \mathbb{R}^{20\times 20}$.
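To make the setup concrete, here is a minimal NumPy sketch of the zero-padded, shape-preserving convolution described above. The helper name `conv_same` is mine; for odd-sized kernels this matches `scipy.signal.convolve2d(..., mode='same')` with zero-fill boundaries:

```python
import numpy as np

def conv_same(I, K):
    """Zero-padded 2-D convolution whose output has the shape of I.

    Assumes K has odd dimensions so the kernel can be centered.
    """
    m, n = I.shape
    p, q = K.shape
    Kf = K[::-1, ::-1]                       # flip kernel for true convolution
    pad = np.pad(I, ((p // 2, p // 2), (q // 2, q // 2)))
    out = np.empty_like(I, dtype=float)
    for i in range(m):
        for j in range(n):
            out[i, j] = np.sum(pad[i:i + p, j:j + q] * Kf)
    return out

rng = np.random.default_rng(0)
I = rng.standard_normal((20, 20))
A = rng.standard_normal((5, 5))
B = rng.standard_normal((3, 3))

# Both results keep the 20x20 shape of I, as in the question
objective = np.linalg.norm(conv_same(I, A) - conv_same(I, B))
print(conv_same(I, A).shape)   # (20, 20)
```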

Does a closed form exist? Do I need to use a Fourier transform? As an extension, how does this work when $A$ and $B$ are not square and can be of arbitrary size (instead of the special case here, where $A$ is larger than $B$)?

Best answer:

I will start with the one-dimensional case. Convolution is linear, $$z=I*B-I*A=I*(B-A)$$ Expanding the definition of convolution, $$z_n = \sum_{n'}I_{n-n'}(B_{n'}-A_{n'})$$ The squared norm is $$\|I*B-I*A\|^2=\sum_nz_n^2$$ To find the minimum (a proof that this yields a minimum is given later), set the partial derivative with respect to each element of $B$ to zero: $${\partial\over\partial B_p}\sum_nz_n^2=2\sum_nz_n{\partial z_n\over\partial B_p} = 0 \tag{1}\label{eq1}$$ where $p$ is a valid index of $B$. The partial derivative of the convolution is $${\partial z_n\over\partial B_p}=I_{n-p} \tag{2}\label{eq2}$$ so $\eqref{eq1}$ becomes $$\sum_nz_n{\partial z_n\over\partial B_p} = \sum_nI_{n-p}z_n = \sum_n\sum_{n'}I_{n-p}I_{n-n'}(B_{n'}-A_{n'})=0$$ giving the relation $$\sum_n\sum_{n'\in B}I_{n-p}I_{n-n'}B_{n'} = \sum_n\sum_{n'\in A}I_{n-p}I_{n-n'}A_{n'}\tag{3}\label{eq3}$$ This is a linear system of equations that can be solved with standard matrix methods. Here $n'\in A$ is shorthand for the indices at which $A$ is defined.

With the change of variable $$n=k+p$$ things become more readable, and the coefficients turn out to be the autocorrelation of $I$: $$\begin{aligned}\sum_n\sum_{n'}I_{n-p}I_{n-n'}A_{n'} &=\sum_k\sum_{n'}I_kI_{k+p-n'}A_{n'} \\ &= \sum_{n'}(I\star I)_{p-n'}A_{n'}\end{aligned}$$ Applying this to both sides of $\eqref{eq3}$ gives $$\sum_{n'\in B}(I\star I)_{p-n'}B_{n'}=\sum_{n'\in A}(I\star I)_{p-n'}A_{n'}$$ The RHS consists only of constants, so $$\boxed{\sum_{n'\in B}(I\star I)_{p-n'}B_{n'}=[(I\star I)*A]_p}$$ There are as many values of $p$ as there are unknowns $n'$ on the LHS, and because the autocorrelation of a real signal is symmetric about lag $0$, the coefficients form a symmetric Toeplitz matrix. Solve this system to get the optimal $B$.
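As a sanity check, the boxed system can be solved numerically and compared against a direct least-squares fit. This sketch uses full (untruncated) 1-D convolution and places $B$'s three taps centered inside $A$'s five-tap support (the center-alignment padding the answer mentions); all variable names are mine:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20
I = rng.standard_normal(N)
A = rng.standard_normal(5)

# B has 3 taps; align its center with A's center by placing its
# support at offsets {1, 2, 3} inside A's length-5 index range.
offsets = [1, 2, 3]

# Autocorrelation of I: r[k] = sum_n I[n] I[n+k], stored with lag 0
# at index N-1 of the 'full' correlation output.
r = np.correlate(I, I, mode="full")
def acorr(k):
    return r[k + N - 1]

# LHS matrix and RHS vector of the boxed normal equations
G = np.array([[acorr(p - q) for q in offsets] for p in offsets])
b = np.array([sum(acorr(p - n) * A[n] for n in range(5)) for p in offsets])
B = np.linalg.solve(G, b)

# Cross-check against a direct least-squares fit: the columns of M
# are full convolutions of I with unit kernels on B's support.
M = np.stack([np.convolve(I, np.eye(5)[j]) for j in offsets], axis=1)
y = np.convolve(I, A)
B_lstsq, *_ = np.linalg.lstsq(M, y, rcond=None)
print(np.allclose(B, B_lstsq))   # True: the normal equations agree
```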

Unfortunately, the above also says that optimal $B$ depends on $I$ as well. Also, as I've mentioned in the comment, if you want the centers of $A$ and $B$ to match, you have to pad $B$ accordingly on both sides.

Proof that this minimizes $\|z\|$

From $\eqref{eq2}$ we see that $z_n$ is linear in $B$, so the second derivatives of $z_n$ vanish. The Hessian of $\|z\|^2$ is then $$\begin{aligned}\mathrm{H}_{pq}&={\partial^2\over\partial B_p\partial B_q}\sum_nz_n^2=2{\partial\over\partial B_q}\sum_nz_n{\partial z_n\over\partial B_p}\\ &= 2{\partial\over\partial B_q}\sum_nz_nI_{n-p} \\ &= 2\sum_nI_{n-p}I_{n-q} \end{aligned}$$ $I_{n-p}$ is the $(n,p)$ element of a (Toeplitz) matrix $C$, so the Hessian can be written as $$\mathrm{H} = 2C^TC$$ This is positive semidefinite, since $x^T\mathrm{H}x = 2\|Cx\|^2 \geq 0$ for every $x$; hence any critical point of $\eqref{eq1}$ is a minimum. The minimum is unique when $C$ has full column rank; when $C$ is rank-deficient, which happens for a flat image, the test is inconclusive and the minimizer need not be unique.
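The positive-semidefiniteness of the Hessian is easy to verify numerically. A small sketch, reusing the full-convolution construction of $C$ from before (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20
I = rng.standard_normal(N)

# Columns of C are full convolutions of I with unit kernels on B's
# 3-tap support, so z is affine in B and Hess ||z||^2 = 2 C^T C.
C = np.stack([np.convolve(I, np.eye(5)[j]) for j in (1, 2, 3)], axis=1)
H = 2 * C.T @ C

# All eigenvalues are non-negative: every critical point is a minimum
eigs = np.linalg.eigvalsh(H)
print(eigs.min() >= -1e-9)   # True
```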

2D version

The line of reasoning is the same as in the 1D version. For the convolution $$z_{mn} = \sum_{m'n'}I_{m-m',\,n-n'}(B_{m'n'}-A_{m'n'})$$ minimizing its norm is equivalent to solving the linear system $$\boxed{\sum_{m'n'\in B}(I\star I)_{p-m',\,q-n'}B_{m'n'}=[(I\star I)*A]_{pq}}$$ The coefficient matrix on the LHS is no longer a plain Toeplitz matrix, but if you index $B$ row by row (or perhaps column by column, I have not tried that) you instead get a symmetric block Toeplitz matrix.
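A numerical sketch of the 2-D case, again with full convolution and my own helper names, which also exhibits the claimed structure: indexing $B$ row by row, the normal matrix is symmetric with equal blocks along its block diagonals (block Toeplitz):

```python
import numpy as np

def conv_full(X, K):
    # full 2-D convolution: out = sum_{a,b} K[a,b] * (X shifted by (a,b))
    m, n = X.shape
    p, q = K.shape
    out = np.zeros((m + p - 1, n + q - 1))
    for a in range(p):
        for b in range(q):
            out[a:a + m, b:b + n] += K[a, b] * X
    return out

rng = np.random.default_rng(3)
I = rng.standard_normal((20, 20))
A = rng.standard_normal((5, 5))

# B's 3x3 support, centered inside A's 5x5 index range,
# enumerated row by row
offs = [(a, b) for a in (1, 2, 3) for b in (1, 2, 3)]

# Columns of M: I convolved with a unit kernel at each position of B
cols = []
for a, b in offs:
    E = np.zeros((5, 5))
    E[a, b] = 1.0
    cols.append(conv_full(I, E).ravel())
M = np.stack(cols, axis=1)
y = conv_full(I, A).ravel()

B_vec, *_ = np.linalg.lstsq(M, y, rcond=None)
B = B_vec.reshape(3, 3)                      # the optimal 3x3 kernel

# Normal matrix: entries are 2-D autocorrelation values of I,
# depending only on index differences -> symmetric block Toeplitz
G = M.T @ M
print(np.allclose(G, G.T), np.allclose(G[:3, :3], G[3:6, 3:6]))
```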