How to transform a perspective projection from world coordinates to camera coordinates


Duncan Marsh, in his book Applied Geometry for Computer Graphics and CAD, defines the perspective projection in world coordinates as $$ M_{WC}=n^TV - ( n \cdot V )I_4 $$ where $V$ is the projection viewpoint and $n$ the projection viewplane, both given in homogeneous coordinates. Written out: $$ M_{WC} = \small\begin{bmatrix}-n_2v_2-n_3v_3-n_4v_4&n_1v_2&n_1v_3&n_1v_4\\n_2v_1&-n_1v_1-n_3v_3-n_4v_4&n_2v_3&n_2v_4\\n_3v_1&n_3v_2&-n_1v_1-n_2v_2-n_4v_4&n_3v_4\\n_4v_1&n_4v_2&n_4v_3&-n_1v_1-n_2v_2-n_3v_3\end{bmatrix} $$

Other authors define the same perspective projection in camera coordinates as $$ M_{CC} = \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&f\\0&0&0&0\end{bmatrix} $$ where the projection viewpoint is the origin and the projection viewplane is parallel to the $x$-$y$ plane and intersects the $z$-axis at $-f$.

Both matrices represent the same projection, with $M_{WC} = M_{CC}RT$, where $R$ is a rotation that makes the viewplane parallel to the $x$-$y$ plane, $T$ is a translation that moves the origin to the viewpoint, and $RT$ is the camera extrinsic matrix.

$$ T = \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\-v_1/v_4&-v_2/v_4&-v_3/v_4&1\end{bmatrix} $$

My problem is that I cannot figure out $R$.


There are 2 best solutions below

0
On BEST ANSWER

I’m still not quite sure what it is you’re trying to do, but I’ll point out some errors in your question that are likely preventing you from accomplishing it. I’ll follow the old computer-graphics convention used by Marsh: points are represented by row vectors and transformations are applied by right-multiplying by a matrix.

First, the definition of the camera-coordinate projection matrix $M_{CC}$ is garbled. There are two conventions in use for the direction of the camera axis and location of the viewplane, and you seem to be mixing them. The one I prefer because it doesn’t change orientation on the viewplane is to have the camera pointing in the negative $z$ direction so that the viewplane is $z=f$ with $f\lt0$. We want the point $(x,y,z)$ in Cartesian coordinates to be projected to $\left(\frac{fx}z,\frac{fy}z,f\right)$. In homogeneous coordinates, this is $x:y:z:1\mapsto x:y:z:\frac zf$ so $$M_{CC}=\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&\frac1f\\0&0&0&0\end{bmatrix}.$$ We can check this against the formula Marsh gives for the projection matrix, which holds in any Cartesian coordinate system. In camera coordinates, $\mathbf V=(0,0,0,1)$ and $\mathscr n=(0,0,1,-f)$, hence $$\mathscr n^T\mathbf V-(\mathbf V\cdot\mathscr n)\,I_4=\begin{bmatrix}f&0&0&0\\0&f&0&0\\0&0&f&1\\0&0&0&0\end{bmatrix}$$ which, since a homogeneous transformation matrix can be multiplied by a non-zero scalar without affecting the transformation it represents, is clearly equivalent to $M_{CC}$ above.
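
As a quick numerical sanity check of the two matrices above, here is a minimal NumPy sketch. The helper names `marsh_projection` and `project`, the value $f=-2$, and the test point are illustrative choices of mine, not anything taken from Marsh's book; points are treated as row vectors, as in the rest of this answer.

```python
import numpy as np

def marsh_projection(n, V):
    """Return n^T V - (n . V) I_4 for a plane n and viewpoint V (4-vectors)."""
    n = np.asarray(n, dtype=float)
    V = np.asarray(V, dtype=float)
    return np.outer(n, V) - np.dot(n, V) * np.eye(4)

def project(p, M):
    """Apply M to the Cartesian point p as a row vector, then divide by w."""
    q = np.append(np.asarray(p, dtype=float), 1.0) @ M
    return q[:3] / q[3]

f = -2.0  # assumed viewplane z = f, with f < 0
M_cc = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 1, 1 / f],
                 [0, 0, 0, 0]], dtype=float)

# Marsh's formula evaluated in camera coordinates: V at the origin, plane z = f.
M_marsh = marsh_projection(n=(0, 0, 1, -f), V=(0, 0, 0, 1))

p = (3.0, 4.0, -8.0)  # arbitrary test point in front of the camera
print(np.allclose(M_marsh, f * M_cc))          # matrices differ only by the factor f
print(project(p, M_cc), project(p, M_marsh))   # both give (f x/z, f y/z, f)
```

For this test point both matrices should give $(0.75, 1, -2)$, which is $\left(\frac{fx}z,\frac{fy}z,f\right)$ with $f=-2$.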

A more serious error is that the equality $M_{WC} = M_{CC}R\,T$ doesn’t hold: $M_{CC}$ wants its input in camera coordinates, but $M_{WC}$ wants world coordinates, so multiplying the same coordinate tuple by the two sides of this equation doesn’t make sense. Let the transformation from world to camera coordinates be given by the product $TR$, per convention. A point expressed in world coordinates must first be converted to camera coordinates before applying $M_{CC}$, so the correct equality is $$M_{WC}=TRM_{CC}R^{-1}T^{-1}=(TR)M_{CC}(TR)^{-1},$$ that is, the two projection matrices are related by a change of basis, as one might expect.
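
To spell out where the inverse comes from, here is the same chain written step by step. The symbols $p_W$ and $p_C$ (one point as a row vector in world and in camera coordinates) and the primes marking projected points are notation introduced only for this sketch: $$\begin{aligned}p_C&=p_W\,(TR)&&\text{world to camera}\\p_C'&=p_C\,M_{CC}&&\text{project in camera coordinates}\\p_W'&=p_C'\,(TR)^{-1}&&\text{camera back to world}\\&=p_W\,(TR)\,M_{CC}\,(TR)^{-1}.\end{aligned}$$ Comparing the last line with $p_W'=p_WM_{WC}$ gives the equality above, up to the usual non-zero scalar factor.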

Here are a few other things that might help with your matrix manipulations:

  • The inverse of a rotation is its transpose, and the inverse of a translation is just the same matrix with the deltas along the last row negated, so the camera-to-world transformation is easy to compute given the two world-to-camera matrices $T$ and $R$.
  • For arbitrary $\mathbf V$ and $\mathscr n$, with $\mathbf V$ normalized so that $v_4=1$, the focal distance $f$ is just the (signed) distance from $\mathbf V$ to the view plane, namely $$f=-{\mathbf V\cdot\mathscr n\over\sqrt{n_1^2+n_2^2+n_3^2}}.$$
  • As you saw above, you might need to multiply matrices by a scalar factor to make them look alike.
  • The plane $\mathscr n$ is a covariant vector (covector), so it doesn’t transform the same way as does a point: if $M$ is the matrix of a coordinate transformation (and so nonsingular), then the plane vector in the new coordinate system that corresponds to $\mathscr n$ is $\mathscr n(M^{-1})^T$, not $\mathscr nM$. I find it sometimes helps to represent $\mathscr n$ as a column vector to help keep this straight. Using that convention, the projection matrix formula becomes $$\mathscr n\mathbf V-(\mathbf V\mathscr n)\,I_4.$$
  • The rows of a transformation matrix are the images of the basis vectors, so instead of trying to compute angles for the rotation matrix, it might be easier to construct it by matching up coordinate axes in the two frames. The first three components of the viewplane vector $\mathscr n$ constitute a normal vector to the plane, which corresponds to the camera’s $z$-axis. You then have one degree of freedom left for specifying a rotation (the camera’s roll).
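
The hints in this list can be put together in a short NumPy sketch. It is only one possible construction under stated assumptions: the function names, the `up` helper vector that fixes the camera roll, and the example plane and viewpoint are all made up for illustration. Row-vector convention throughout, so $p_{cam}=p_{world}\,TR$.

```python
import numpy as np

def marsh_projection(n, V):
    """Marsh's world-coordinate projection n^T V - (n . V) I_4 (row-vector points)."""
    n = np.asarray(n, dtype=float)
    V = np.asarray(V, dtype=float)
    return np.outer(n, V) - np.dot(n, V) * np.eye(4)

def world_to_camera(n, V, up=(0.0, 1.0, 0.0)):
    """Build T, R and f so that p_cam = p_world @ T @ R.

    `up` is a hypothetical helper vector that fixes the camera roll; it must
    not be parallel to the plane normal (n1, n2, n3).
    """
    n = np.asarray(n, dtype=float)
    V = np.asarray(V, dtype=float)
    V = V / V[3]                          # normalize so that v4 = 1
    T = np.eye(4)
    T[3, :3] = -V[:3]                     # translation taking the viewpoint to the origin
    z_c = n[:3] / np.linalg.norm(n[:3])   # camera z-axis = unit plane normal
    x_c = np.cross(up, z_c)
    x_c /= np.linalg.norm(x_c)
    y_c = np.cross(z_c, x_c)              # completes a right-handed frame
    R = np.eye(4)
    # Columns are the camera axes in world coordinates, so each row is the
    # image of a world basis vector.
    R[:3, :3] = np.column_stack([x_c, y_c, z_c])
    f = -np.dot(V, n) / np.linalg.norm(n[:3])
    return T, R, f

n = (1.0, 2.0, 2.0, 3.0)   # example viewplane x + 2y + 2z + 3 = 0
V = (1.0, 0.0, 1.0, 1.0)   # example viewpoint (1, 0, 1)

T, R, f = world_to_camera(n, V)
TR = T @ R
M_cc = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 1, 1 / f],
                 [0, 0, 0, 0]], dtype=float)
M_wc = marsh_projection(n, V)
change_of_basis = TR @ M_cc @ np.linalg.inv(TR)
scale = -np.dot(n, V)      # the scalar factor relating the two matrices
print(f, np.allclose(M_wc, scale * change_of_basis))
```

For this example $f$ comes out as $-2$ and the comparison holds with the scalar factor $-\mathbf V\cdot\mathscr n$. A different `up` vector gives a different roll, hence a different $R$, but the same $M_{WC}$.
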
3
On

So you have an object whose coordinates are given in some world system, which the game may use for physics and as its general coordinate system. The camera has a position and three angles defining its orientation (see Euler angles); the orientation is essentially a $3 \times 3$ matrix. Translation isn't a linear transformation in $\Bbb R^3$, so to express it as a matrix you need an additional coordinate, giving a $4 \times 4$ matrix. The rotation is then also written as a $4 \times 4$ matrix, in which the additional coordinate does nothing. If this is confusing, see homogeneous coordinates. Then, if the rotation is $\mathbf R$ and the translation is $\mathbf T$, to change to camera coordinates you need $\mathbf R \mathbf T$.
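
A minimal NumPy sketch of the homogeneous-coordinate idea, assuming the row-vector layout used in the question (translation deltas in the last row). The function names, the camera position, and the rotation angle are made up for illustration.

```python
import numpy as np

def translation(d):
    """4x4 homogeneous translation by d for row-vector points (p' = p @ T)."""
    T = np.eye(4)
    T[3, :3] = d
    return T

def rotation_z(theta):
    """4x4 homogeneous rotation about the z-axis for row-vector points."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[:2, :2] = [[c, s], [-s, c]]
    return R

cam_pos = np.array([1.0, 2.0, 3.0])   # hypothetical camera position
T = translation(-cam_pos)             # moves the camera to the origin
R = rotation_z(np.pi / 2)             # hypothetical camera orientation

p = np.array([1.0, 2.0, 3.0, 1.0])    # the camera position itself, with w = 1
row_result = p @ T @ R                # row vectors: translate first, then rotate
col_result = R.T @ T.T @ p            # the same map written for column vectors
print(row_result, col_result)
```

Both lines send the camera position to the origin. Note that the product is written $TR$ for row vectors and (with transposed matrices) $RT$ for column vectors, which may be why the two answers here write it in opposite orders; the two orders are not interchangeable within one convention.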