3D Perspective projection


I have the following question to answer, but I am not sure how to combine my calculations into one final answer.

Suppose the Centre of Projection in a viewing space is at an offset $(0, 0, -5)$ from $(0,0,0)$, and the view plane is the $UV$ plane containing $c$. Find the transform matrix for the perspective projection, and give the projected World Coordinate point $(10,-20,-10)$ on the view plane.

So this is the transformation matrix, with $1/d = 1/5 = 0.2$: $$ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1/d & 1 \end{pmatrix} $$

Because the view plane is placed at $z=0$, we use the following similar triangles: $$ x' = x/(z/d+1), \quad y' = y/(z/d+1), \quad z'=z/(z/d+1), \quad w=z/d $$

My question is: what values do I use for $x$, $y$, $z$? If I use the given world coordinates, $x'$, $y'$, $z'$ will be undefined due to the $0$. Once I have the values for $x'$, $y'$, $z'$, do I multiply them by the transformation matrix? Thanks for any help.


There are 3 best solutions below

2
On

We want to map $P = (x,y,z)^\top$ to $P'=(x',y',z')^\top$.

All rays go through $C = (0,0,-5)^\top = (0,0,-d)^\top$ and hit the plane $z = 0$.

[Figures: 3D view; view of the x-z plane]

The line through $C$ and $P$ intersects the plane $z = 0$ where $$ (0,0,-d)^\top + t \left((x, y, z)^\top - (0,0,-d)^\top\right) = (x', y', 0)^\top \iff \\ (tx,ty,t(z+d) - d)^\top = (x', y', 0)^\top $$

so we need $$ t(z+d) -d = 0 \iff t = d/(z+d) $$ This leads to \begin{align} P' &= (x', y', z')^\top \\ &= (x', y', 0)^\top \\ &= \left( \frac{d}{z + d} x, \frac{d}{z + d} y, \frac{d}{z + d} (z+d) - d \right)^\top \\ &= \left( \frac{d}{z + d} x, \frac{d}{z + d} y, 0 \right)^\top \quad (*) \end{align}
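The formula $(*)$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `project_onto_z0` is made up for illustration, and exact fractions are used so that no rounding hides a mistake.

```python
from fractions import Fraction


def project_onto_z0(point, d):
    """Project `point` through the centre (0, 0, -d) onto the plane z = 0.

    Hypothetical helper for illustration; it follows t = d / (z + d).
    """
    x, y, z = (Fraction(v) for v in point)
    d = Fraction(d)
    if z + d == 0:
        # The ray is parallel to the view plane and never meets it.
        raise ValueError("point lies in the plane z = -d")
    t = d / (z + d)
    # Third component is t * (z + d) - d, which is always 0.
    return (t * x, t * y, t * (z + d) - d)


if __name__ == "__main__":
    print(project_onto_z0((10, -20, -10), 5))
```

For the point from the question, $t = 5/(-10+5) = -1$, so the point lies behind the centre of projection and its image is mirrored through it.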

So far we agree on $x'$ and $y'$. We differ in $z'$, which should be $$ z' = \frac{1}{z/d + 1} (z+d) - d = \frac{d}{z + d} (z+d) - d = 0 $$ and $w'$ differs as well, see below.

Using homogeneous coordinates we can write the transformation $(*)$ as $$ p' = T p \iff \\ \begin{pmatrix} x' \\ y' \\ z' \\ w' \end{pmatrix} = \begin{pmatrix} d & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & d \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad (**) $$ This gives the homogeneous image vector $$ p' = \left( d x, d y, 0, z + d \right)^\top $$ which, for $z + d \neq 0$, can be normalized by dividing by $w' = z + d$ to $$ p' = \left( \frac{d}{z + d} x, \frac{d}{z + d} y, 0, 1 \right)^\top $$

Finally one can apply the above transformation $(**)$ to $p = ( 10, -20, -10, 1)^\top$.

This gives $p' = (50, -100, 0, -5)^\top$ which normalizes to $p' = (-10, 20, 0, 1)^\top$ or $P'=(-10,20,0)^\top$, where the $x'$ value agrees with the 2D image view shown above.
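The two steps, multiplying by the matrix in $(**)$ and then dividing by $w'$, can be sketched in a few lines. This is only an illustration in plain Python; the helper names `perspective_matrix`, `apply` and `normalize` are assumptions, not something from the question.

```python
from fractions import Fraction


def perspective_matrix(d):
    """Matrix from (**): centre of projection (0, 0, -d), view plane z = 0."""
    return [
        [d, 0, 0, 0],
        [0, d, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, d],
    ]


def apply(T, p):
    """Multiply the 4x4 matrix T by the homogeneous column vector p."""
    return [sum(a * b for a, b in zip(row, p)) for row in T]


def normalize(p):
    """Divide by the last component so that w becomes 1."""
    w = Fraction(p[3])
    if w == 0:
        raise ValueError("w = 0: the point has no finite image")
    return [Fraction(c) / w for c in p]


if __name__ == "__main__":
    p_hom = apply(perspective_matrix(5), [10, -20, -10, 1])
    print(p_hom)             # homogeneous image vector
    print(normalize(p_hom))  # same point with w = 1
```

The division by $w'$ is not part of the matrix product. It is the separate normalization step that turns the homogeneous result into Euclidean coordinates.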

0
On

Let me try an answer based on the theory described in the arXiv.org article Unified Framework of Elementary Geometric Transformation Representation.

The perspective projection in this specific problem can be realized by a central projection, as defined in definition 3.2 (pages 8-9) and formulated in equation (3.1) of that article (see also line No. 1 of Table 1 on page 9).

From the problem we know the projection center $C=(0,0,-5)$, and the view plane is the $xOy$ coordinate plane of the 3D world coordinate system. So both ingredients of the central projection in homogeneous matrix form are readily available: the homogeneous coordinates of the projection center, $(s)=(0,0,-5,1)^T$ in column vector convention, and those of the plane $z=0$, written as $(\pi)=(0,0,1,0)^T$. Substituting them into equation (3.1) gives the matrix $T = I - \dfrac{s\,\pi^T}{s^T\pi}$ of the desired central projection:

$$T=\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array} \right] -\dfrac{\left[ \begin{array}{c} 0 \\ 0 \\ -5 \\ 1 \\ \end{array} \right]\cdot \left[ \begin{array}{cccc} 0 & 0 & 1 & 0 \\ \end{array} \right]}{\left[ \begin{array}{cccc} 0 & 0 & -5 & 1 \\ \end{array} \right]\cdot \left[ \begin{array}{c} 0 \\ 0 \\ 1 \\ 0 \\ \end{array} \right]}=\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \dfrac{1}{5} & 1 \\ \end{array} \right]$$
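The construction $T = I - s\,\pi^T / (s^T \pi)$ is mechanical enough to check by machine. Below is a small sketch that only implements the formula as it is written out above; the function name `central_projection` is hypothetical, and nothing else from the article is assumed.

```python
from fractions import Fraction


def central_projection(s, pi):
    """Return I - (s * pi^T) / (s^T * pi) as a list of rows.

    s  : homogeneous coordinates of the projection centre
    pi : homogeneous coordinates of the plane projected onto
    """
    denom = sum(Fraction(a) * b for a, b in zip(s, pi))
    if denom == 0:
        # The centre lies on the plane, so the projection is undefined.
        raise ValueError("projection centre lies on the plane")
    n = len(s)
    return [
        [(1 if i == j else 0) - Fraction(s[i]) * pi[j] / denom for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    T = central_projection([0, 0, -5, 1], [0, 0, 1, 0])
    for row in T:
        print([str(entry) for entry in row])
```

With $s=(0,0,-5,1)^T$ and $\pi=(0,0,1,0)^T$ the denominator is $s^T\pi = -5$, and only the third column of the identity matrix changes.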

It is easy to apply the obtained central projection to the given point $(10,-20,-10)$, whose homogeneous coordinates as a column vector are $$X= \left[ \begin{array}{c} 10 \\ -20 \\ -10 \\ 1 \\ \end{array} \right]$$

then

$$X'=\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \dfrac{1}{5} & 1 \\ \end{array} \right]\cdot \left[ \begin{array}{c} 10 \\ -20 \\ -10 \\ 1 \\ \end{array} \right]=\left[ \begin{array}{c} 10 \\ -20 \\ 0 \\ \color{red}{-1} \\ \end{array} \right]\approx \color{red}{({\text{nonzero scalar } -1})} \left[ \begin{array}{c} -10 \\ 20 \\ 0 \\ \color{red}1 \\ \end{array} \right]$$ So the final projected point has the 3D Euclidean coordinate of $\color{red}{(-10,20,0)}$.

Similarly for your point $(10,-10,10)$ you will obtain $$\left[ \begin{array}{c} 10 \\ -10 \\ 0 \\ \color{red}3 \\ \end{array} \right]\approx \color{red}{({\text{nonzero scalar } 3})} \left[ \begin{array}{c} \dfrac{10}{3} \\ -\dfrac{10}{3} \\ 0 \\ \color{red}1 \\ \end{array} \right]$$

whose projected 3D Euclidean coordinates are $\left(\dfrac{10}{3}, -\dfrac{10}{3}, 0\right)$.
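The matrix used here is $1/d$ times the matrix with $d$ on the diagonal, and homogeneous matrices that differ by a nonzero scalar factor describe the same projection. The sketch below checks this for both points; `apply_and_normalize` is an illustrative helper name, not taken from the article.

```python
from fractions import Fraction


def apply_and_normalize(T, point):
    """Apply the 4x4 matrix T to a 3D point and return its Euclidean image."""
    p = [Fraction(v) for v in point] + [Fraction(1)]
    q = [sum(Fraction(a) * b for a, b in zip(row, p)) for row in T]
    if q[3] == 0:
        raise ValueError("w = 0: the point has no finite image")
    return tuple(c / q[3] for c in q[:3])


d = Fraction(5)
# Form with 1/d in the last row, as in this answer.
T_unit = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 1 / d, 1]]
# The same matrix scaled by d, i.e. with d on the diagonal.
T_scaled = [[d * entry for entry in row] for row in T_unit]

if __name__ == "__main__":
    for point in [(10, -20, -10), (10, -10, 10)]:
        image = apply_and_normalize(T_unit, point)
        # Scaling the matrix must not change the normalized image.
        assert image == apply_and_normalize(T_scaled, point)
        print(point, "->", tuple(str(c) for c in image))
```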

0
On

Why does projecting a 3D point onto a plane need such a complex answer involving all these matrices? The answers above don't even relate to the camera position. Let's be more logical here. First, the camera tilt angle must be involved in the equations, since tilting the camera downwards yields a bird's-eye view, whereas tilting it upwards yields a worm's-eye view. If the camera is level, it should give a level perspective, the so-called one-point or two-point perspective, in which the verticals do not converge. I figured out these equations in 1981, prior to computers, when I patented the first mechanical apparatus that plots any 3D model or perspective projection using a Cartesian coordinate system. The figure below shows how easy it is to define the house when the tilt angle ($n$) is $0$ or $-30$. If you need any more information, let me know. Figured out prior to computers in 1979 and listed as part of my patent application, as issued in England in 1981.