8 degrees of freedom in homography


I am working through the math behind homographies, but my math skills are a bit rusty.

A homography can be calculated from 4 point correspondences (8 points, 4 in each image) because the homography matrix has 8 degrees of freedom. This is because, even though the 3x3 matrix has 9 entries, one of them can be "normalized to one". The cited explanation says:

Note that we can multiply all $h_{ij}$ by nonzero k without changing the equations

and the following explanation is given in "multiple view geometry in computer vision":

Note that the matrix H occurring in this equation may be changed by multiplication by an arbitrary non-zero scale factor without altering the projective transformation. Consequently we say that H is a homogeneous matrix, since as in the homogeneous representation of a point, only the ratio of the matrix elements is significant. There are eight independent ratios amongst the nine elements of H, and it follows that a projective transformation has eight degrees of freedom.

I have also read this post: degree of freedom of Homography matrix, but I'm afraid I still don't understand.

Can someone explain in detail how this normalization works? Or why the scaling multiplication normalizes the matrix?

thanks a ton!


There are 2 best solutions below

1
On BEST ANSWER

The following discussion may serve as an explanation; if it does not, please ask about the specific point that is hard to follow.

My strategy is always to simplify things while keeping the problem essentially the same. Here, I propose to understand homographies from one (projective) line to another (projective) line, since planes are too complicated.

In short, a point $P$ on the affine line $L$ is given by its coordinate, "$x$", say. But we also consider the point at infinity, so we need a slightly more "complicated" representation of points. Instead of $x$ we write $[x:1]$, in words, $x$ divided by one. And the point at infinity is $[1:0]$. We can easily "see" these points if we add a second dimension as follows. Instead of $[x:1]$ we draw in the plane the point $(x,1)$. This is all. In this plane there are also many other points, but from the point of view of the "camera placed in $(0,0)$", we cannot distinguish two points on the same line through the origin, more exactly, on the same ray. For instance, the points $(2,1)$, and $(4,2)$, and $(6,3)$, and ... map in the projective line to the same point $[2:1]=[4:2]=[6:3]=\dots$ and we will always want this last view.

Why do we like this representation, and also the representation $[1:0]$ for the "point at infinity"? Because there is another way to view things. Imagine a radar, a sonar, or a camera placed in the middle $O$ of a 2D beach, with the sea starting some $20$ meters in front of us, at a point $A=[0:1]=0$. The camera should not be able to recognize depth. A full turn of the camera is then easy to imagine, and during this turn we sweep the line of the breakers, the last bastion of sand, from $A$ to the right. At some point, after some seconds, we pass through the point of view "opposite" to $A$, which has no coordinate $x$: this is the point at infinity. And in the next millisecond we are approaching $A$ from the left.


Now imagine there are some $7$ surfers playing in the breakers, at coordinates $1,3,7,8,9,21,100$. Another camera sees the first three surfers at $-3,5,1$. Where should we place the other surfers?


This is a similar question to the one in the OP. We need a matrix transformation, $$ \begin{aligned} \begin{bmatrix} x\\ 1 \end{bmatrix} &\to \begin{bmatrix} a&b \\ c&d \end{bmatrix} \begin{bmatrix} x \\ 1 \end{bmatrix} = \begin{bmatrix} ax+b \\ cx+d \end{bmatrix} \sim \begin{bmatrix} x'\\ 1 \end{bmatrix} \ ,\qquad\text{ so } \\ x&\to x'= \frac{ax+b}{cx+d}\ . \end{aligned} $$ Here $\sim$ means equality up to a nonzero factor (we divide both entries by $cx+d$). Notice the simpler second form of the transformation.

It is clear that the simpler form for the homographic transformation $x\to x'= \frac{ax+b}{cx+d} $ is homogeneous in $(a,b,c,d)$. Multiplying all four at the same time by something $\ne 0$ cancels in the fraction, so it leads to the same transformation.
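A minimal sketch of this invariance, using exact rational arithmetic. The coefficients `2, 3, 1, 4`, the factor `k = -7`, and the helper name `apply_homography_1d` are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def apply_homography_1d(a, b, c, d, x):
    """Map x to (a*x + b) / (c*x + d) using exact rational arithmetic."""
    x = Fraction(x)
    denominator = c * x + d
    if denominator == 0:
        raise ZeroDivisionError("x is sent to the point at infinity")
    return (a * x + b) / denominator


# Hypothetical coefficients, chosen only for illustration.
a, b, c, d = 2, 3, 1, 4
k = -7  # any nonzero scale factor

sample_points = (0, 1, 5)
original = [apply_homography_1d(a, b, c, d, x) for x in sample_points]
scaled = [apply_homography_1d(k * a, k * b, k * c, k * d, x) for x in sample_points]

# The factor k cancels in the fraction, so both lists agree.
print(original == scaled)
```

The factor `k` appears in both numerator and denominator, which is why the comparison does not depend on its value as long as it is nonzero.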

Now we are searching for a specific homography. There are too many (redundant) variables in $$ \begin{bmatrix} a&b\\c&d \end{bmatrix} $$ We are free to make one choice, one norming. Let us say I would like to norm $d=1$ (which is possible only when $d\ne 0$). This means, we replace the above by $$ \frac 1d \begin{bmatrix} a&b\\c&d \end{bmatrix} = \begin{bmatrix} a/d&b/d\\c/d&d/d \end{bmatrix} = \begin{bmatrix} a/d&b/d\\c/d&1 \end{bmatrix}\ . $$ (The lower right corner entry is now one.)
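The counting behind this norming can be sketched roughly as follows. For the line, with $d=1$ fixed, each correspondence $x\to x'$ gives one linear equation in the three remaining unknowns, $$ a\,x + b - c\,x\,x' = x'\ , $$ so three correspondences are needed. For the plane, the analogous count, assuming the entry $h_{33}$ is the one that can be normed to one, is $$ 9 - 1 = 8 \text{ unknowns}\ ,\qquad 2 \text{ equations per correspondence}\ ,\qquad 8/2 = 4 \text{ correspondences}\ . $$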


Now we have two cameras. One point gives a boring situation, but if we take some seven points, say, and try to get the right camera transform, things are slightly more complicated. Let us take the (random) points mentioned above. We norm $d=1$, thus obtaining a system in the other three variables $a,b,c$; the three known correspondences give exactly three conditions:

  • $1\to(a\cdot1+b)/(c\cdot 1+1)=-3$,
  • $3\to(a\cdot3+b)/(c\cdot 3+1)=5$,
  • $7\to(a\cdot7+b)/(c\cdot 7+1)=1$.

We solve this system exactly (with sage, to keep it short):

sage: var('a,b,c');
sage: def T(x): return (a*x+b)/(c*x+1)
sage: solve( [T(1) == -3, T(3) == 5, T(7) == 1], [a,b,c], solution_dict=True )
[{c: -5/11, b: -17/11, a: -1/11}]

So the solution is: $$ T= \begin{bmatrix} a&b\\c&1 \end{bmatrix} = \begin{bmatrix} -1/11&-17/11\\-5/11 &1 \end{bmatrix} \ . $$ A human would multiply by $-11$, obtaining another matrix that gives the same homographic transformation: $$ U = \begin{bmatrix} 1&17\\5 &-11 \end{bmatrix} \ . $$ Passing from $U$ to $T$ is this step of norming. (By chance, we now have a normed entry in the $a$-place.)

We can also try (when $b\ne 0$) to norm the entry denoted above by $b$; the matrix $V$ obtained implements the same homographic transformation: $$ V = \frac 1{17}U = \begin{bmatrix} 1/17&1\\5/17 &-11/17 \end{bmatrix} \ . $$ I think I should stop here; a pointed question would now be simpler to answer.
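The same computation can be sketched in plain Python, as an alternative to the sage call. It assumes the norming $d=1$ and rewrites each condition $(ax+b)/(cx+1)=x'$ as the linear equation $a x + b - c\,x\,x' = x'$. The helper `solve_linear_system` is a hypothetical name for a small Gauss-Jordan routine written for this example, not a library function.

```python
from fractions import Fraction


def solve_linear_system(rows, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rows)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(rows, rhs)]
    for col in range(n):
        # Pick a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Correspondences x -> x' taken from the text.
pairs = [(1, -3), (3, 5), (7, 1)]

# With d = 1, (a*x + b) / (c*x + 1) = x' becomes a*x + b - c*x*x' = x'.
rows = [[x, 1, -x * xp] for x, xp in pairs]
rhs = [xp for _, xp in pairs]
a, b, c = solve_linear_system(rows, rhs)


def T(x):
    """The normed homography with the solved coefficients."""
    return (a * x + b) / (c * x + 1)


# Place the remaining surfers in the second camera's coordinates.
remaining = {x: T(x) for x in (8, 9, 21, 100)}
print(a, b, c)
print(remaining)
```

This reproduces $a=-1/11$, $b=-17/11$, $c=-5/11$, and places the surfers at $8, 9, 21, 100$ at $25/29$, $13/17$, $19/47$, $39/163$ respectively.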

0
On

Suppose that $\mathbf y=H\mathbf x$. Then by elementary properties of matrix multiplication, $(\lambda H)\mathbf x = \lambda(H\mathbf x)=\lambda\mathbf y$, so when $\lambda\ne0$, $H\mathbf x$ and $(\lambda H)\mathbf x$ represent the same point, because homogeneous coordinate vectors that differ by a nonzero factor represent the same point. To put it another way, any homogeneous transformation matrix, not only one that represents a homography, is uniquely determined up to an irrelevant scalar factor.
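As a small sketch of this for the 3x3 case: the matrix `H`, the factor `lam`, and the test point below are hypothetical values picked only for illustration, and `apply_homography` is a helper name introduced here.

```python
from fractions import Fraction


def apply_homography(H, point):
    """Apply a 3x3 matrix to (x, y, 1) and divide by the last coordinate."""
    x, y = point
    vec = (Fraction(x), Fraction(y), Fraction(1))
    u, v, w = (sum(h * t for h, t in zip(row, vec)) for row in H)
    if w == 0:
        raise ZeroDivisionError("point is mapped to infinity")
    return (u / w, v / w)


# Hypothetical matrix, chosen only for illustration.
H = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]
lam = Fraction(-5, 2)  # any nonzero scalar
scaled_H = [[lam * h for h in row] for row in H]

point = (3, 2)
# lam multiplies all three homogeneous coordinates and cancels in the division.
print(apply_homography(H, point) == apply_homography(scaled_H, point))
```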

Note, by the way, that the text only claims that some element of the matrix can be normalized to $1$, not that any specific element can. In general, zeros can appear anywhere within a transformation matrix, but if the matrix is nonzero, there must be at least one nonzero element.
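To illustrate, here is a sketch of a norming that does not rely on any particular entry being nonzero. Dividing by the entry of largest magnitude is one possible convention, assumed here for the example; the matrix `H` is hypothetical and has a zero in the bottom-right corner, so norming that entry to one would fail.

```python
from fractions import Fraction


def normalize(H):
    """Divide H by its entry of largest magnitude so that entry becomes 1."""
    entries = [Fraction(h) for row in H for h in row]
    pivot = max(entries, key=abs)
    if pivot == 0:
        raise ValueError("the zero matrix does not represent a transformation")
    return [[Fraction(h) / pivot for h in row] for row in H]


# Hypothetical matrix whose bottom-right entry is zero.
H = [[0, 2, 0],
     [4, 0, 1],
     [1, 0, 0]]
normalized = normalize(H)
print(normalized)
```

Here the entry `4` is the one that gets normed to one, while the bottom-right entry stays zero.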