Combining two convolution kernels


Is it possible to combine two convolution kernels (convolution in terms of image processing, so it's actually a correlation) into one, so that convolving the image with the new kernel gives the same output as convolving it with the first kernel and then the second?

For example, if I want to convolve an image with a first 3x3 kernel and then a second 3x3 kernel, apparently I should be able to combine those two kernels and convolve my image with the new kernel instead. What is the size of my new kernel? Is it 3x3? If so, there's something about that that worries me. Let me explain.

Let's say I'm convolving the input image with Kernel 1, then convolving the result with Kernel 2.

The pixel marked X depends on the value of pixel P2 (from convolving with Kernel 2). Pixel P2 lies on the intermediate image produced by convolving the input image with Kernel 1, so the value of P2 was calculated taking P1 into account.

Now, X relies on the values of both P1 and P2. That's what stops me from believing I could achieve the same result by convolving my original image with a 3x3 kernel combining Kernel 1 and Kernel 2: a 3x3 window centered on X cannot "know" anything about the pixel at location P1.


There are 4 best solutions below

Answer (2 votes):

Let's assume the original image is $x[n_1,n_2]$ and the two kernels are $h_1[n_1,n_2]$ and $h_2[n_1,n_2]$. The associative law of convolution says that:

$$(x[n_1,n_2]*h_1[n_1,n_2])*h_2[n_1,n_2] = x[n_1,n_2]*(h_1[n_1,n_2]*h_2[n_1,n_2])$$

So convolving the image with the new kernel will always produce the same output as convolving it with the first kernel and then the second. To get the new kernel, just use the two-dimensional convolution definition:

$$h_3[n_1,n_2]=h_1[n_1,n_2]*h_2[n_1,n_2]=\sum_{i=-\infty}^{+\infty} \sum_{j=-\infty}^{+\infty} h_1[i,j] h_2[n_1-i,n_2-j] $$
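A numerical sketch of this identity, assuming NumPy and SciPy are available (`scipy.signal.convolve2d` in its default 'full' mode implements the double sum above); the image and kernels are arbitrary examples:

```python
import numpy as np
from scipy.signal import convolve2d

# Hypothetical image and kernels, just for illustration.
rng = np.random.default_rng(0)
x = rng.random((6, 6))
h1 = rng.random((3, 3))
h2 = rng.random((3, 3))

# The combined kernel h3 = h1 * h2 (full 2-D convolution).
h3 = convolve2d(h1, h2)  # shape (5, 5)

# Convolving with h1 then h2 equals convolving once with h3.
sequential = convolve2d(convolve2d(x, h1), h2)
combined = convolve2d(x, h3)
assert np.allclose(sequential, combined)
```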

Answer (6 votes):

The answer is as simple as this: convolution is associative:

$$ a \ast (b\ast c) = (a \ast b ) \ast c. $$
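A quick numerical check of associativity, assuming NumPy; the signals are arbitrary examples:

```python
import numpy as np

# Arbitrary example signals.
a = np.array([1.0, 2.0, 3.0])
b = np.array([0.0, 1.0, 0.5])
c = np.array([2.0, -1.0, 4.0])

# a * (b * c) == (a * b) * c
left = np.convolve(a, np.convolve(b, c))
right = np.convolve(np.convolve(a, b), c)
assert np.allclose(left, right)
```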

Answer (3 votes):

The answer is yes.

The kernel you would use is simply these operations combined. So the 3x3 box blur:

1,1,1
1,1,1
1,1,1

combined with itself gives:

1,2,3,2,1
2,4,6,4,2
3,6,9,6,3
2,4,6,4,2
1,2,3,2,1

It is assumed that all points outside the kernel are zero. To build the combined kernel, place a copy of the second kernel at each of the 9 positions of the first, scale each copy by the value at that position, and add up the overlapping contributions.
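That placement-and-sum procedure is exactly full 2-D convolution of the two kernels, so it can be checked numerically; assuming SciPy, convolving the 3x3 box blur with itself reproduces the 5x5 kernel shown:

```python
import numpy as np
from scipy.signal import convolve2d

box = np.ones((3, 3))

# Place a scaled copy of the kernel at each of the 9 positions and sum:
# that is full 2-D convolution of the kernel with itself.
combined = convolve2d(box, box)

expected = np.array([[1, 2, 3, 2, 1],
                     [2, 4, 6, 4, 2],
                     [3, 6, 9, 6, 3],
                     [2, 4, 6, 4, 2],
                     [1, 2, 3, 2, 1]])
assert np.array_equal(combined, expected)
```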

The reason you don't see this done more often is that it's slower. Applying the kernels separately costs MN * K1 + MN * K2 operations for an M-by-N image, while the combined kernel costs MN * Kt, where Kt is the size of the combined kernel; here that is 25 versus 9 + 9 = 18. So the typical desire is to separate kernels rather than combine them. You can do this same 5x5 blur (25 operations per pixel) as two [1,1,1] kernels applied both horizontally and vertically (12 operations per pixel).
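A sketch of that separation, assuming SciPy; the 3x3 box blur splits into a horizontal and a vertical [1,1,1] pass, and the image is an arbitrary example:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(1)
img = rng.random((8, 8))  # hypothetical image

box = np.ones((3, 3))
row = np.ones((1, 3))  # horizontal [1,1,1]
col = np.ones((3, 1))  # vertical [1,1,1]

# One 2-D pass (9 multiplies per pixel) ...
direct = convolve2d(img, box)
# ... equals two 1-D passes (3 + 3 multiplies per pixel).
separated = convolve2d(convolve2d(img, row), col)
assert np.allclose(direct, separated)
```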

In exact arithmetic there are no lossy operations, so order never matters. The order of convolution can matter only when lossy operations (rounding, clamping) happen between passes. This is why doing a blur and then an emboss can differ from an emboss and then a blur. Which isn't to say the kernels aren't associative, but that the error introduced between them can be. For blurs, this edge condition is not as strong.

You will, however, avoid compounding the rounding bias: computing value / 81 once, rather than (value / 9) / 9 in floored integer division applied to stored intermediate pixels, can make your resulting image slightly more accurate than it would otherwise have been.
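A minimal sketch of that rounding effect, with `inter` standing in (hypothetically) for the nine exact 3x3 box sums that feed a single output pixel:

```python
import numpy as np

# Nine hypothetical 3x3 box sums feeding one output pixel.
inter = np.array([[17, 17, 17],
                  [17,  8, 17],
                  [17, 17, 17]])

# Two passes: floor-divide each sum by 9, sum the 3x3
# neighborhood, floor-divide again.
two_pass = (inter // 9).sum() // 9  # floor(8 / 9) = 0

# One combined pass: sum everything, divide by 81 once.
one_pass = inter.sum() // 81        # floor(144 / 81) = 1

assert two_pass == 0 and one_pass == 1
```

The single division keeps the fractional parts of all nine sums in play until the very end, which is why the combined kernel can be slightly more accurate.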

Answer (1 vote):

Let's formalize OP's question:

  1. Is there a single kernel equivalent to two successive 3*3 kernel cross-correlation operations applied to the input (i.e. the image in this example)?
  2. How do we deduce the size of the equivalent kernel?

The answer to the first question is YES. The reason can be found in the previous answers: convolution is associative, whether the objects are continuous or discrete.

The answer to the second question is less obvious and needs some clarification.

Assume the first and second kernels (i.e. $K_1, K_2$) have sizes $(n,n)$ and $(m,m)$ respectively, and the input image $X$ has shape $(h,w)$. After applying the first kernel, namely $Y_1=X*K_1$, the output $Y_1$ has shape $(h-n+1, w-n+1)$. Again, after $Y_2=X*K_1*K_2=Y_1*K_2$, the output $Y_2$ has shape $$(h-n+1-m+1, w-n+1-m+1) \tag{1}\label{Eq1}$$ Since there exists an equivalent kernel $K_t$ that has shape $(t,t)$ and satisfies $X*K_t=X*K_1*K_2$, the shape of $Y_2$ can also be written as $$(h-t+1, w-t+1) \tag{2}\label{Eq2}$$ Combining \eqref{Eq1} and \eqref{Eq2}, we have $$t=m+n-1$$

Let $m=n=3$; then the equivalent kernel's shape is 5*5, which also resolves the worry in the question: the combined kernel is large enough to "see" P1.
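The shape computation can be verified numerically, assuming SciPy ('valid' mode implements the shape-shrinking operation described above, and default 'full' mode builds the equivalent kernel); the image and kernels are arbitrary examples:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(2)
X = rng.random((10, 12))   # hypothetical image of shape (h, w)
K1 = rng.random((3, 3))    # (n, n)
K2 = rng.random((3, 3))    # (m, m)

# Equivalent kernel: full convolution gives t = m + n - 1 = 5.
Kt = convolve2d(K1, K2)
assert Kt.shape == (5, 5)

# Both routes give the same (h-t+1, w-t+1) = (6, 8) output.
two_step = convolve2d(convolve2d(X, K1, mode='valid'), K2, mode='valid')
one_step = convolve2d(X, Kt, mode='valid')
assert two_step.shape == (6, 8)
assert np.allclose(two_step, one_step)
```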