Pyramidal Histogram Of Oriented Gradients - Trilinear interpolation


Hello, I'm struggling with an implementation of this article: https://goo.gl/8mpIuq

I performed bilinear interpolation over the histogram bins, and the results are better with it. However, page 2 also mentions that trilinear interpolation is added when the pyramid reaches level 2. I have read this answer, HOG trilinear interpolation, and I completely understand the formula behind trilinear interpolation over 2x2 block sizes, but this article uses a 3x3 block size, and 7x7 at pyramid level 3, because these block sizes yield the best results.

The main point of trilinear interpolation is that each pixel in a cell contributes to its local cell with a weight defined by its position in the block. I don't know how to represent the location of a pixel in a 3x3 block, or what kind of formula I should use.

Thank you for all your help!

EDIT: Another explanation with 2x2 block size http://pep.ijieee.org.in/journal_pdf/11-126-142960909718-22.pdf


There are 1 best solutions below


Hint: Since the question is a bit vague, I won't claim this is anything more than a hint to get the thinking going.

My interpretation is that each pixel has two integer coordinates and each cell spans a range of coordinates in two dimensions. So two dimensions of the interpolation are x and y in the image, and the third dimension is the gradient angle. Say one pixel has gradient $g_p$; we then calculate its contributions to the bins of the neighbouring cells depending on all three of these distances $(\Delta_x, \Delta_y, \Delta_{\theta})$.

Let us say we have a local coordinate system for each cell. The midpoint (origin) would naturally be the middle pixel of a 3x3 cell, and one "step" away in a particular dimension should be the centre pixel of the neighbouring cell. Now we can straightforwardly calculate coordinates for all the pixels in between, in each coordinate system, since we know how many pixels lie between two centres and that the coordinate system is linear and Cartesian. In the 3x3 case there are two pixels in between two middle pixels, right? So, going from the left centre to the right centre, the four pixels have coordinates $(0,\frac{1}{3},\frac{2}{3},1)$ in the left cell's system and $(-1,-\frac{2}{3},-\frac{1}{3},0)$ in the right cell's system.
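To make those coordinates concrete, here is a minimal Python sketch. The function name `coords_between_centres` is made up for this illustration, and it assumes an odd cell size so that the cell centre falls exactly on a pixel (true for 3x3 and 7x7).

```python
from fractions import Fraction

def coords_between_centres(cell_size):
    """Coordinates of the pixels from one cell centre to the next.

    Returns (left, right): each pixel's position in the left cell's system
    (0 at its own centre, 1 at the neighbour's centre) and in the right
    cell's system (-1 at the left centre, 0 at its own centre).
    Assumes cell_size is odd, so a centre coincides with a pixel.
    """
    # Neighbouring centres are cell_size pixels apart, so one pixel step
    # is 1 / cell_size in cell units.
    left = [Fraction(k, cell_size) for k in range(cell_size + 1)]
    right = [c - 1 for c in left]
    return left, right

left, right = coords_between_centres(3)
print([str(c) for c in left])   # ['0', '1/3', '2/3', '1']
print([str(c) for c in right])  # ['-1', '-2/3', '-1/3', '0']
```

Calling it with `cell_size=7` gives steps of 1/7 in the same way, so nothing in the reasoning is specific to 3x3.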

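Say we want to distribute the gradient of the second pixel in that list, the one at coordinate $\frac{1}{3}$ in the left system. A fraction $1-\frac{1}{3}=\frac{2}{3}$ of it contributes to the left cells and $1-\frac{2}{3}=\frac{1}{3}$ to the right cells. Since the interpolation is multilinear, we can subdivide each share further between the upper and lower cells and between the two nearest angle bins. How much goes to the upper or lower cells depends on the pixel's position in the second dimension, and so on.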
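Putting the three dimensions together, a rough sketch of the voting step could look like the following. This is only one possible way to write it, not the method from the article: the function `trilinear_vote`, the `(cells_y, cells_x, n_bins)` histogram layout, unsigned angles in [0, 180), bin centres at the middle of each bin, and dropping votes that fall outside the cell grid are all assumptions of this example.

```python
import numpy as np

def trilinear_vote(hist, x, y, angle, magnitude, cell_size, n_bins,
                   angle_range=180.0):
    """Spread one pixel's gradient magnitude over up to 2x2x2 histogram bins.

    hist has shape (cells_y, cells_x, n_bins). Pixel (x, y) is taken to have
    its centre at (x + 0.5, y + 0.5), cell centres sit at (i + 0.5) * cell_size
    and bin centres at (b + 0.5) * angle_range / n_bins (assumptions).
    """
    # Continuous position measured in cells / bins, relative to the centres.
    fx = (x + 0.5) / cell_size - 0.5
    fy = (y + 0.5) / cell_size - 0.5
    fb = angle / (angle_range / n_bins) - 0.5

    # Lower neighbour in each dimension and fractional distance to it.
    x0, y0, b0 = int(np.floor(fx)), int(np.floor(fy)), int(np.floor(fb))
    dx, dy, db = fx - x0, fy - y0, fb - b0  # each in [0, 1)

    cells_y, cells_x, _ = hist.shape
    for iy, wy in ((y0, 1 - dy), (y0 + 1, dy)):
        if not 0 <= iy < cells_y:
            continue  # vote falls outside the grid and is dropped
        for ix, wx in ((x0, 1 - dx), (x0 + 1, dx)):
            if not 0 <= ix < cells_x:
                continue
            for ib, wb in ((b0, 1 - db), (b0 + 1, db)):
                # The orientation axis is periodic, so bins wrap around.
                hist[iy, ix, ib % n_bins] += magnitude * wy * wx * wb

# Two 3x3 cells side by side; the pixel one step right of the left centre.
hist = np.zeros((1, 2, 9))
trilinear_vote(hist, x=2, y=1, angle=10.0, magnitude=1.0, cell_size=3, n_bins=9)
print(hist[0, :, 0])
```

In this example the pixel sits at coordinate 1/3 in the left cell's system and exactly on a bin centre, so the left cell receives 2/3 of the magnitude and the right cell 1/3, matching the numbers above. Switching to 7x7 cells only changes `cell_size`.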

Any description more detailed than this would probably depend on the data structures available and the programming language used.