Maximum/minimum values for two-dimensional type-II discrete cosine transform matrix

When encoding a JPEG image, the pixels are encoded as an 8x8 matrix of values in the range [-128...127]. A two-dimensional type-II DCT is applied to the matrix and the result is compressed further.

Is there a way to calculate the maximum and minimum output values for any given element of the matrix? I need to analyse a matrix and determine whether it could possibly be the result of a DCT. If the value of a given element is 'out of bounds' then I can discard the data, as it cannot possibly be the output of a valid DCT.

Probably the best solution would be to perform an inverse DCT and see if any values fall outside the range [-128...127], but most of the time I will not have the value of the element at [0,0] available.

Any ideas?

There are 1 best solutions below

The classical two-dimensional DCT-II on an 8x8 block is written:

$$ D[u,v] = \frac{1}{4} C_u C_v \sum_{y=0}^7 \sum_{x=0}^7 I[x,y] \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} $$

Here $C_0 = \frac{1}{\sqrt{2}}$ and $C_u = 1$ for $u > 0$ (the usual JPEG normalization). I will leave the outer constant $\frac{1}{4} C_u C_v$ aside for the reasoning, and write the inner kernel, which for a fixed pair $(u,v)$ is a function of $x$ and $y$:

$$C_{u,v}[x,y]= \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}.$$

As a consequence, the summation can be rewritten:

$$\sum_{y=0}^7 \sum_{x=0}^7 I[x,y]C_{u,v}[x,y].$$

You can find an upper bound by maximizing each product, choosing $I[x,y]$ carefully: if $C_{u,v}[x,y]>0$, pick $I[x,y]=127$, otherwise set $I[x,y]=-128$ (hence each product $I[x,y]C_{u,v}[x,y]$ is as large as possible). Similarly, a lower bound is obtained with $I[x,y]=-128$ when $C_{u,v}[x,y]>0$ and $I[x,y]=127$ when $C_{u,v}[x,y]\le 0$.

Thus, for each $(u,v)$, you just have to produce the 8x8 matrix of the signs of $C_{u,v}[x,y]$, turn each +1 into 127 and each -1 into -128 for the upper bound (and the reverse assignment, +1 into -128 and -1 into 127, for the lower bound), and compute both sums of products.

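Plugging the outer constant $\frac{1}{4} C_u C_v$ back in (it is positive, so the inequalities keep their direction) gives you lower and upper bounds for each coefficient. For the DC coefficient, where the kernel equals 1 everywhere and $C_0 C_0 = \frac{1}{2}$, this gives $-1024 \le D[0,0] \le 1016$.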
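The sign-matching argument above can be sketched in a few lines of Python. This is only an illustration under the assumptions of this answer (8x8 blocks, inputs in $[-128,127]$, the $\frac{1}{4} C_u C_v$ normalization with $C_0 = 1/\sqrt{2}$). The names `scale`, `kernel` and `dct_bounds` are made up for the example and do not come from any library.

```python
import math

N = 8  # block size assumed by the formula above


def scale(k):
    """Normalization factor C_k: 1/sqrt(2) for k == 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def kernel(u, v, x, y):
    """Product of the two cosines for output index (u, v) and pixel (x, y)."""
    return (math.cos((2 * x + 1) * u * math.pi / 16)
            * math.cos((2 * y + 1) * v * math.pi / 16))


def dct_bounds(lo=-128, hi=127):
    """Return (lower, upper): 8x8 tables of bounds for every D[u][v]."""
    lower = [[0.0] * N for _ in range(N)]
    upper = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            pos = 0.0  # sum of the positive kernel values
            neg = 0.0  # sum of the non-positive kernel values (<= 0)
            for x in range(N):
                for y in range(N):
                    k = kernel(u, v, x, y)
                    if k > 0:
                        pos += k
                    else:
                        neg += k
            c = 0.25 * scale(u) * scale(v)
            # Upper bound: hi where the kernel is positive, lo elsewhere.
            upper[u][v] = c * (hi * pos + lo * neg)
            # Lower bound: the opposite assignment.
            lower[u][v] = c * (lo * pos + hi * neg)
    return lower, upper


if __name__ == "__main__":
    lower, upper = dct_bounds()
    for u in range(N):
        print(" ".join(f"[{lower[u][v]:8.1f},{upper[u][v]:7.1f}]" for v in range(N)))
```

A candidate matrix can then be rejected as soon as one of its entries falls outside the corresponding `[lower[u][v], upper[u][v]]` interval. These are necessary conditions only: a matrix that passes every per-element test is not guaranteed to come from a valid block.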

Two last thoughts: first, your input matrix is presumably integer-valued, hence you can add some rounding to the above values. Second, the input values are signed integers in the $[-128,127]$ range, which is not the most usual; pixel values are often in $[0,255]$. I thus suspect that an offset should be considered as well. Such an offset only changes $D[0,0]$, because every other kernel sums to zero over the block.
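To see what a constant offset would do to the coefficients, one can feed a constant block through the same formula, which amounts to summing each kernel over the block. A minimal sketch, assuming the normalization used above; `kernel_sum` and `offset_shift` are hypothetical helper names:

```python
import math


def kernel_sum(u, v, n=8):
    """Sum of the cosine kernel for output index (u, v) over the n x n block."""
    return sum(
        math.cos((2 * x + 1) * u * math.pi / (2 * n))
        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
        for x in range(n)
        for y in range(n)
    )


def offset_shift(u, v, offset=128):
    """Change in D[u, v] when `offset` is added to every input sample."""
    cu = 1 / math.sqrt(2) if u == 0 else 1.0
    cv = 1 / math.sqrt(2) if v == 0 else 1.0
    return 0.25 * cu * cv * offset * kernel_sum(u, v)


if __name__ == "__main__":
    for u in range(8):
        print(" ".join(f"{offset_shift(u, v):8.2f}" for v in range(8)))
```

Up to floating-point noise, only the entry at `(0, 0)` is non-zero (an offset of 128 moves it by 1024), so the bounds for all other coefficients are the same whether the samples live in $[-128,127]$ or $[0,255]$.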