Find rotation to make it look at a point in 3D space

I found a function that gets the rotation needed to make something look at a point. I don’t know how it works, though.

I do know how it would work in 2D, though: take the arctangent of the slope from the origin to the point ($y/x$), which gives the angle in radians. However, I can’t figure it out in 3D space.

Can someone just give me some help to get me going in the right direction?

There is 1 best solution below

Let $\vec{o}$ be the camera or observer, and $\vec{t}$ the target.

Furthermore, let $\vec{r}$ be a "right" direction vector. That is, when viewed from $\vec{o}$, point $\vec{t}+\vec{r}$ is directly right from $\vec{t}$. This sets the rotation of the view from $\vec{o}$ towards $\vec{t}$.

First, we define the unit direction vector from $\vec{o}$ to $\vec{t}$, $\hat{w}$:

$$\hat{w} = \frac{\vec{t} - \vec{o}}{\left\lVert \vec{t} - \vec{o} \right\rVert} \tag{1a}\label{G1a}$$

Next, we use one step of the Gram-Schmidt process to orthonormalize the right direction vector with respect to $\hat{w}$, giving us a new unit direction vector $\hat{u}$ corresponding to "right". (Essentially, we remove the part of $\vec{r}$ that is along $\hat{w}$, and normalize the result to unit length, so that the resulting $\hat{u}$ will be perpendicular to $\hat{w}$. This requires that $\vec{r}$ is not parallel to $\hat{w}$; otherwise $\vec{u}$ is the zero vector and cannot be normalized.)

$$\hat{u} = \frac{\vec{u}}{\lVert\vec{u}\rVert}, \quad \vec{u} = \vec{r} - \hat{w} \left( \hat{w} \cdot \vec{r} \right) \tag{1b}\label{G1b}$$

The third unit vector is perpendicular to both $\hat{u}$ and $\hat{w}$, and in a right-handed coordinate system is

$$\hat{v} = \hat{w} \times \hat{u} \tag{1c}\label{G1c}$$

The rotation matrix $\mathbf{R}$ that rotates direction $(0,0,1)$ to point along $\vec{t} - \vec{o}$ (in other words, the rotation applied about $\vec{o}$) is

$$\mathbf{R} = \left[\begin{matrix} \hat{u} & \hat{v} & \hat{w} \end{matrix}\right] = \left[\begin{matrix} u_x & v_x & w_x \\ u_y & v_y & w_y \\ u_z & v_z & w_z \end{matrix}\right] \tag{1d}\label{G1d}$$
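As a rough illustration of equations (1a) through (1d), here is a minimal Python sketch using plain tuples. The helper names (`sub`, `dot`, `cross`, `normalize`, `look_at_basis`, `rotation_matrix`) are made up for this example, and raising an error for a zero-length vector is an assumption about how one might handle the degenerate case where $\vec{r}$ is parallel to $\hat{w}$.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(dot(a, a))
    if length == 0.0:
        # Assumption: treat a zero-length vector as an error
        # (happens when t == o, or when r is parallel to w).
        raise ValueError("cannot normalize a zero-length vector")
    return (a[0] / length, a[1] / length, a[2] / length)

def look_at_basis(o, t, r):
    """Return the unit vectors (u, v, w) for observer o, target t, right hint r."""
    w = normalize(sub(t, o))                              # (1a)
    wr = dot(w, r)
    u = normalize(sub(r, (w[0] * wr, w[1] * wr, w[2] * wr)))  # (1b)
    v = cross(w, u)                                       # (1c)
    return u, v, w

def rotation_matrix(u, v, w):
    """Build R as a list of rows, with u, v, w as its columns (1d)."""
    return [[u[0], v[0], w[0]],
            [u[1], v[1], w[1]],
            [u[2], v[2], w[2]]]

# Example: observer at (1, 2, 3) looking along +x, with "right" hinted as -z.
u, v, w = look_at_basis((1.0, 2.0, 3.0), (4.0, 2.0, 3.0), (0.0, 0.0, -1.0))
R = rotation_matrix(u, v, w)
```

The third column of `R` is $\hat{w}$, so multiplying `R` by $(0,0,1)$ gives the viewing direction, as described above.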


In computer graphics, we typically want to do the opposite: we want to rotate and translate the view so that $\vec{o}$ is at the origin, $\vec{t}$ is on the positive $z$ axis, and $\vec{t}+\vec{r}$ lies in the $xz$ plane on the positive $x$ side.

To do this, we need the inverse rotation matrix $\mathbf{R}^{-1}$. However, because $\mathbf{R}$ is orthonormal (all its row vectors perpendicular to each other with length $1$, and all its column vectors perpendicular to each other with length $1$), its inverse is its transpose:

$$\mathbf{R}^{-1} = \mathbf{R}^{T}$$

The transform that takes point $\vec{p} = (x, y, z)$ to $\vec{p}^\prime = (x^\prime, y^\prime, z^\prime)$ is therefore

$$\vec{p}^\prime = \mathbf{R}^{-1} \left( \vec{p} - \vec{o} \right) = \mathbf{R}^{-1} \vec{p} - \mathbf{R}^{-1} \vec{o} \tag{2a}\label{G2a}$$

i.e.

$$\left[\begin{matrix} x^\prime \\ y^\prime \\ z^\prime \end{matrix}\right] = \left[\begin{matrix} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{matrix}\right] \left[\begin{matrix} x - o_x \\ y - o_y \\ z - o_z \end{matrix}\right] \tag{2b}\label{G2b}$$
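Equation (2b) amounts to three dot products with the shifted point. A small Python sketch, assuming the unit vectors $\hat{u}$, $\hat{v}$, $\hat{w}$ have already been computed (the function name `to_view` and the example values are only illustrative):

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def to_view(p, o, u, v, w):
    """Apply (2b): rows of R^T are u, v, w, acting on the shifted point p - o."""
    d = (p[0] - o[0], p[1] - o[1], p[2] - o[2])
    return (dot(u, d), dot(v, d), dot(w, d))

# Illustrative values: observer at (1, 2, 3) looking along +x,
# with the "right" direction being -z.
o = (1.0, 2.0, 3.0)
u = (0.0, 0.0, -1.0)
v = (0.0, 1.0, 0.0)
w = (1.0, 0.0, 0.0)

target_in_view = to_view((4.0, 2.0, 3.0), o, u, v, w)    # the target t
right_of_target = to_view((4.0, 2.0, 2.0), o, u, v, w)   # t + r, with r = (0, 0, -1)
```

With these values the target ends up on the positive $z$ axis, and the point one unit to its right ends up with a positive $x$ coordinate at the same depth.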


If we extract the constant part from the rotation as the translation vector $\vec{T}$,

$$\vec{T} = -\mathbf{R}^{-1} \vec{o} \tag{3a}\label{G3a}$$

we can write the entire transform as

$$\vec{p}^\prime = \mathbf{R}^{-1} \vec{p} + \vec{T} \tag{3b}\label{G3b}$$

Using homogeneous coordinates, this can be implemented as a single matrix-vector multiplication:

$$\left[\begin{matrix} x^\prime \\ y^\prime \\ z^\prime \\ 1 \end{matrix}\right] = \left[\begin{matrix} u_x & u_y & u_z & T_x \\ v_x & v_y & v_z & T_y \\ w_x & w_y & w_z & T_z \\ 0 & 0 & 0 & 1 \end{matrix}\right] \left[\begin{matrix} x \\ y \\ z \\ 1 \end{matrix}\right] \tag{3c}\label{G3c}$$

This is very often used in computer graphics. Because the fourth component of these vectors is always $1$, only three components per homogeneous vector need be stored, and those three match the standard Cartesian coordinate components. Similarly, usually only the $12$ components on the first three rows of transformation matrices are stored, because the fourth row is always constant.
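The 12-component storage described above can be sketched in Python as three rows of four numbers each. This is only an illustration under that storage assumption; the names `view_matrix` and `apply_view` are hypothetical, and the example vectors are arbitrary.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def view_matrix(o, u, v, w):
    """First three rows of the matrix in (3c), with T = -R^T o from (3a)."""
    return [
        [u[0], u[1], u[2], -dot(u, o)],
        [v[0], v[1], v[2], -dot(v, o)],
        [w[0], w[1], w[2], -dot(w, o)],
    ]

def apply_view(m, p):
    """Multiply by (x, y, z, 1); the constant fourth row is never computed."""
    return tuple(row[0] * p[0] + row[1] * p[1] + row[2] * p[2] + row[3]
                 for row in m)

# Illustrative values: observer at (1, 2, 3) looking along +x, right = -z.
o = (1.0, 2.0, 3.0)
u = (0.0, 0.0, -1.0)
v = (0.0, 1.0, 0.0)
w = (1.0, 0.0, 0.0)

M = view_matrix(o, u, v, w)
target_in_view = apply_view(M, (4.0, 2.0, 3.0))
```

Building `M` once and reusing it for every point is the usual reason for preferring this form: the translation is computed a single time instead of subtracting $\vec{o}$ from each point.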

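Note that in terms of the elementary operations needed per point (additions, subtractions, multiplications and divisions), $\eqref{G2a}$ costs the same as $\eqref{G3b}$ and $\eqref{G3c}$: nine multiplications and nine additions or subtractions, provided the constant fourth row and fourth component in $\eqref{G3c}$ are not computed explicitly.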

For perspective projection, an additional division per Cartesian coordinate is needed, so that the unstored fourth component will still be $1$. This is where the name "homogeneous" comes from.
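To make the extra division concrete, here is a hedged Python sketch. It assumes a simple projection onto the plane $z = d$ in view space, which corresponds to a projection matrix whose fourth row is $(0, 0, 1/d, 0)$; the distance `d` and the function names are assumptions for this example and do not appear in the derivation above.

```python
def perspective_divide(h):
    """Divide by the fourth component so that it becomes 1 again."""
    x, y, z, s = h
    if s == 0.0:
        # Assumption: points with a zero fourth component are rejected here.
        raise ValueError("point cannot be projected (fourth component is zero)")
    return (x / s, y / s, z / s)

def project(p_view, d):
    """Project a view-space point onto the plane z = d (hypothetical setup)."""
    x, y, z = p_view
    # The fourth component z / d is what a fourth row (0, 0, 1/d, 0) would produce.
    h = (x, y, z, z / d)
    return perspective_divide(h)

projected = project((2.0, 4.0, 8.0), 2.0)
```

Here the point at depth $8$ is scaled by $d/z = 1/4$, so it lands at $(0.5, 1, 2)$ on the projection plane.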