Finding the transformation matrix given points in space and their projections on a plane


Description of plane and points

Given a set of non-planar points ($\rm\color{green}{P_i}$) in $\color{green}{\text{Coordinate System 1}}$, the projections of those points ($\rm\color{blue}{S_i}$) on the $\color{blue}{\text{Shadow Plane}}$ in $\color{blue}{\text{Coordinate System 2}}$, and the distance ($\rm\color{blue}{d}$) between the $\color{red}{\text{Light Source}}$ and the $\color{blue}{\text{Shadow Plane}}$ given in the same scale as the $\color{blue}{\text{Shadow Plane}}$; how would one determine the positions of points ($\rm\color{green}{P_i}$) in $\color{blue}{\text{Coordinate System 2}}$? How many Point-Projection ($\rm\color{green}{P_i}$-$\rm\color{blue}{S_i}$) pairs would be necessary for a fully constrained solution?

Note: $\color{green}{\text{Coordinate System 1}}$ and $\color{blue}{\text{Coordinate System 2}}$ are related by a rotation, a translation, and a scaling.


This can be viewed as a variant of what’s known as the resectioning problem: recovering the $3\times4$ projection matrix $\mathtt P$ from a set of scene-image point correspondences. You can find more detail on how to do this in any standard reference such as Hartley and Zisserman’s Multiple View Geometry in Computer Vision. Once you have $\mathtt P$, you can recover from it the information that you need to construct “Coordinate System 2.” In particular, you have the following:

  • Writing $\mathtt P = \left[\mathtt M \mid \mathbf p_4\right]$, the third row $\mathbf m^3$ of the submatrix $\mathtt M$ is a vector parallel to the camera axis; $\det(\mathtt M)\mathbf m^3$ points toward the front of the camera. This gives you the direction of the Coordinate System 2 $z$-axis. Since you know the camera’s location (which you could also recover from $\mathtt P$) and the distance to the image (shadow) plane, you now also know the image plane. You could instead displace the principal plane $\mathbf P^3$ by the known distance to the image plane, but you still have to determine the correct direction in which to displace it.
  • The first and second rows of $\mathtt P$, $\mathbf P^1$ and $\mathbf P^2$, are the planes that map to the $y$- and $x$-axes, respectively, in the image. The $x$- and $y$-axes of Coordinate System 2 are therefore the intersections of $\mathbf P^2$ and $\mathbf P^1$, respectively, with the image plane. The origin of this coordinate system is of course the intersection of these two planes with the image plane. The directional ambiguity of each axis is resolved by examining the scene-image point correspondences or by back-mapping $(1,0,1)$ or $(0,1,1)$, which is a more expensive computation.
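As a rough illustration of the first bullet, here is a minimal NumPy sketch that reads the camera center and the forward-pointing axis direction off a given $\mathtt P$. The function names and the example camera (`K`, `C`) are hypothetical, chosen only for the demonstration, and the sketch assumes a finite camera, so that the last homogeneous coordinate of the center is nonzero.

```python
import numpy as np

def camera_center(P):
    """Inhomogeneous camera center: the null vector of the 3x4 matrix P."""
    _, _, vt = np.linalg.svd(P)      # last row of vt spans the null space of P
    c = vt[-1]
    return c[:3] / c[3]              # assumes a finite camera (c[3] != 0)

def principal_axis(P):
    """Unit vector along det(M) * m3, pointing toward the front of the camera."""
    M = P[:, :3]
    v = np.linalg.det(M) * M[2]      # M[2] is the third row m3 of M
    return v / np.linalg.norm(v)

# Hypothetical example camera P = K [I | -C], used only to exercise the functions.
K = np.diag([2.0, 2.0, 1.0])
C = np.array([1.0, 2.0, 3.0])
P = K @ np.hstack([np.eye(3), -C[:, None]])

center = camera_center(P)            # close to C
axis = principal_axis(P)             # close to the +z direction for this example
```

Because $\mathtt P$ is only defined up to scale, it is worth noting that both results are unchanged if $\mathtt P$ is multiplied by any nonzero constant, including a negative one.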

If an image point $\mathbf x = (x,y,w)^T$ corresponds to the scene point $\mathbf X$, then $\mathbf x\times\mathtt P\mathbf X=\mathbf 0$, which can be written as $$\begin{bmatrix}\mathbf 0^T & -w\mathbf X^T & y\mathbf X^T \\ w\mathbf X^T & \mathbf 0^T & -x\mathbf X^T \\ -y\mathbf X^T & x\mathbf X^T & \mathbf 0^T\end{bmatrix} \begin{pmatrix}\mathbf P^{1T} \\ \mathbf P^{2T} \\ \mathbf P^{3T}\end{pmatrix} = \mathbf 0,$$ where $\mathbf P^{iT}$ is the $i$th row of $\mathtt P$ written as a column vector. This is a set of homogeneous linear equations in the elements of $\mathtt P$. The three rows are linearly dependent, so each point correspondence contributes only two independent equations. Since $\mathtt P$ has 11 degrees of freedom, in general one needs $5\frac12$ point correspondences to determine $\mathtt P$ uniquely, that is, five full correspondences plus one coordinate of a sixth. This can be reduced for your problem since you have other constraints on $\mathtt P$, namely the known camera center $\mathbf C$, which satisfies $\mathtt P\mathbf C=\mathbf 0$ and so is effectively another point correspondence, and the distance to the image plane. There are, of course, degenerate configurations for which the reconstruction is ambiguous, for example, if the camera and scene points all lie on a twisted cubic, or if the scene points lie on the union of a plane and a single line through the camera center. Hartley and Zisserman discuss these degenerate configurations in more detail.
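The linear system described above can be sketched in NumPy as follows. This is only a bare-bones version of the standard direct linear transformation under stated assumptions: noise-free homogeneous inputs, at least six correspondences in general position, and no data normalization or use of the extra constraints (known camera center, known distance). The names `dlt_rows` and `estimate_projection` and the synthetic test data are hypothetical.

```python
import numpy as np

def dlt_rows(X, x):
    """Two independent equations for one correspondence.

    X: homogeneous scene point, shape (4,); x: homogeneous image point (x, y, w).
    """
    xi, yi, wi = x
    z = np.zeros(4)
    return np.array([
        np.concatenate([z, -wi * X, yi * X]),   # first row of the 3x12 block
        np.concatenate([wi * X, z, -xi * X]),   # second row of the 3x12 block
    ])

def estimate_projection(Xs, xs):
    """Estimate the 3x4 matrix P, up to scale, as the null vector of the stacked system."""
    A = np.vstack([dlt_rows(X, x) for X, x in zip(Xs, xs)])
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)                 # rows P^1, P^2, P^3

# Synthetic check with a hypothetical camera and random scene points.
rng = np.random.default_rng(0)
C = np.array([1.0, 2.0, 3.0])
P_true = np.diag([2.0, 2.0, 1.0]) @ np.hstack([np.eye(3), -C[:, None]])
Xs = np.hstack([rng.uniform(-1.0, 1.0, size=(8, 3)), np.ones((8, 1))])
xs = (P_true @ Xs.T).T                          # exact homogeneous projections

P_est = estimate_projection(Xs, xs)
P_est = P_est * (P_true[2, 3] / P_est[2, 3])    # fix the arbitrary overall scale
```

With real, noisy measurements one would normally normalize the coordinates first and then refine the linear estimate, as described in Hartley and Zisserman.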