Locate a pinhole camera using a fiducial marker

407 Views Asked by At

enter image description here

Note: The superscript notation used refers to the frame of reference. There are three frames of reference:

  1. $w$, the world frame (in Euclidean 2-space),
  2. $c$, the camera frame (in Euclidean 2-space), and
  3. $i$, the image frame (in pixels).

Suppose we have:

  • a 1-dimensional "tag", or "fiducial marker", $t$, defined by its boundaries, $t:=((0,-1)^w,(0,1)^w)$, existing at between $(0,-1)^w$ and $(0,1)^w$,
  • a "camera" at $c^w∈ℝ^2$, focal length $f^i>0$, image plane with width $w^i>0$ (whose center, $d^i:=w^i/2$, is $f^i$ units from $c$).

The optical axis is the line that includes the points $c$, $d$. Let's define $q$ as the point where the optical axis intersects the $y$ axis.

Define $b^i$ and $c^i$ as the tag's boundaries projected onto the image plane; that is, they are scalars drawn from $[0,w]$ which give some distance along the image plane, such that:

  • $b^i$ is the point intersecting the image plane of the line from $(0,1)^w$ to $c^w$ and
  • $c^i$ is the point intersecting the image plane of the line from $(0,-1)^w$ to $c^w$.

Let's assume the tag is in view of the camera, that is, assume that $b^i,c^i∈[0,w]$.

Finally, let $R$ be a $2 \times 2$ rotation matrix such that the line with points $R[0,1]$, $R[0,-1]$ is perpendicular to the optical axis.

Note: $f^i$, $w^i$, $b^i$, $c^i$, and $d^i$ are expressed in pixels, not necessarily in the same unit scale as the x-y plane of the world frame and camera frame.

Problem: Given $f^i>0$, $w^i>0$, $b^i∈[0,w]$, $c^i∈[0,w]$, and $R∈ℝ^{2 \times 2}$, find:

  1. Camera position $c^w∈ℝ^2$ and
  2. A function mapping any point in world space into the camera image plane: $$Ω:ℝ^2 → [0,w] : p^w → p^i$$

Bonus points:

  1. Generalize to any tag position $t$.
  2. Generalize to $n$ dimensions.
3

There are 3 best solutions below

3
On

First, it's convenient to refer to have a one-dimensional frame of reference originating from the center of the image plane $x^c = f$, rather than from the edge of it. So define $d^i:=w^i/2$ and for all $p^i∈ℝ$, define $p_0^i:=p^i-d^i$ and $Ω_0(p^w):=Ω(p^w)-d^i$.

From another proof from Moving a pinhole camera we know that for some $p^w$ in world coordinates,

$Ω_0(p^w) = \dfrac{(0, 1) f R^T (p^w - c^w)} {(1, 0) R^T (p^w - c^w)}$.

Now that we have solved the second half of the problem, let's go back and solve the first half, that is, let's find $c^w$.

Let $R = \begin{bmatrix} i & j\\ k & l \end{bmatrix}$. Since $R$ is a rotation matrix, $R^{-1}=R^T= \begin{bmatrix} i & k\\ j & l \end{bmatrix}$.

Plugging in $Ω_0((0,1)) = b_0^i$ and $Ω_0((0,-1)) = c_0^i$, we can solve for $c^w$:

  • $b_0^i/f = \dfrac{jp_x^w-jc_x^w+lp_y^w-lc_y^w} {ip_x^w-ic_x^w+kp_y^w-kc_y^w}$ for $p^w = (0,1)$, and
  • $c_0^i/f = \dfrac{jp_x^w-jc_x^w+lp_y^w-lc_y^w} {ip_x^w-ic_x^w+kp_y^w-kc_y^w}$ for $p^w = (0,-1)$

So,

  • $b_0^i/f = \dfrac{-jc_x^w+l-lc_y^w} {-ic_x^w+k-kc_y^w}$
  • $c_0^i/f = \dfrac{-jc_x^w-l-lc_y^w} {-ic_x^w-k-kc_y^w}$

So,

  • $b_0^i * (-ic_x^w+k-kc_y^w) = f*(-jc_x^w+l-lc_y^w)$
  • $c_0^i * (-ic_x^w-k-kc_y^w) = f*(-jc_x^w-l-lc_y^w)$

So,

  • $-ib_0^ic_x^w+kb_0^i-kb_0^ic_y^w+jfc_x^w-lf+lfc_y^w=0$
  • $-ic_0^ic_x^w-kc_0^i-kc_0^ic_y^w+jfc_x^w+lf+lfc_y^w=0$

So,

  • $c_x^w(jf-ib_0^i)+c_y^w(lf-kb_0^i)+(kb_0^i-lf)=0$
  • $c_x^w(jf-ic_0^i)+c_y^w(lf+kc_0^i)+(kc_0^i+lf)=0$

This is a system of linear equations which can be solved for $c_x^w$ and $c_y^w$ by standard methods.

QED.

Bonus:

  1. Using the above method we can plug in to $Ω_0$ arbitrary points bounding the tag which project to $b$ and $c$, respectively.
  2. For $n>2$ dimensions, the projection and tag are finite subsets of two $n-1$-dimensional affine hyperplanes, and $n$ points are required to define the tag which gives us $n$ equations to solve for the $n$ dimensions of the pinhole camera location, which yields a single solution if the setup is nondegenerate.
0
On

Bonus Question 2

The setting of homogeneous coordinates admits an elegant solution in $n$ dimensions.

Define the homogeneous transformation as $$h: \mathbb{R}^n \rightarrow \mathbb{R}^{n+1}:\begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}\mapsto\begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \\ 1 \end{bmatrix},$$

and its inverse $$h^{-1}: \mathbb{R}^{n+1} \rightarrow \mathbb{R}^{n}\cup\{\infty\}: \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \\ x_{n+1} \end{bmatrix} \mapsto \begin{cases} \begin{bmatrix} x_{1} / x_{n+1} \\ x_{2} / x_{n+1} \\ \vdots \\ x_{n} / x_{n+1} \\ \end{bmatrix}, & \text{if}\ x_{n+1} \neq 0 \\ \infty, & \text{if}\ x_{n+1} = 0 \end{cases} $$

We can now generalize the result from above as $$\begin{align}\Omega_0(\vec{p^i})&=h^{-1} \circ K_0*E*h \circ \vec{p^w} \\ &= h^{-1} \circ \left[ \begin{array}{c|c|c} fI & \vec{0} & \vec{0} \\ \vec{0}^T & 1 & 0 \end{array} \right] * \left[ \begin{array}{c|c} R^T & \vec{0} \\ \vec{0}^T & 1 \end{array} \right] * \left[ \begin{array}{c|c} I & \vec{-c^w} \\ \vec{0}^T & 1 \end{array} \right] * h \circ \vec{p^w} ,\end{align}$$

where:

  • intrinsic matrix $K_0 := \left[ \begin{array}{c|c|c} fI & \vec{0} & \vec{0} \\ \vec{0}^T & 1 & 0 \end{array} \right]$ and
  • extrinsic matrix $E := \left[ \begin{array}{c|c} R^T & \vec{0} \\ \vec{0}^T & 1 \end{array} \right] * \left[ \begin{array}{c|c} I & \vec{-c^w} \\ \vec{0}^T & 1 \end{array} \right].$

Note that the final (leftmost) step above, $h^{-1}$, is actually mapping the dimensions from $n$ to $(n-1)$, since the intrinsic matrix "projected" the vector from $(n+1)$-dimensional homogeneous coordinates to $n$-dimensional homogeneous coordinates.

In fact we can even show coordinates from the corner of the image rather than from its intersection with the objective line:

  • Let $\vec{\beta}$ be an $(n-1)$-dimensional vector giving the image dimensions (in the 2-d setting our image was just a 1-dimensional line segment of length $w$).
  • Now $\Omega(\vec{p^w}) = \Omega_0(\vec{p^w})+ \vec{\beta}/2$, that is, we just need to apply a translation $ T_0 = \left[ \begin{array}{c|c} I & \vec{\beta}/2 \\ \vec{0}^T & 1 \end{array} \right],$ so $$ \begin{align} \Omega(\vec{p^w}) &= \Omega_0(\vec{p^w}) + \vec{\beta}/2 \\ &= h^{-1} \circ T_0 * h \circ \Omega_0(\vec{p^w}) \\ &= h^{−1} \circ T_0 * h \circ h^{-1} \circ K_0 * E * h \circ \vec{p^w} \\ &=h^{−1} \circ T_0 * K_0 * E * h \circ \vec{p^w}\\ &=h^{-1} \circ \left[ \begin{array}{c|c} I & \vec{\beta}/2 \\ \vec{0}^T & 1 \end{array} \right] \left[ \begin{array}{c|c|c} fI & \vec{0} & \vec{0} \\ \vec{0}^T & 1 & 0 \end{array} \right] * \left[ \begin{array}{c|c} R^T & \vec{0} \\ \vec{0}^T & 1 \end{array} \right] * \left[ \begin{array}{c|c} I & \vec{-c^w} \\ \vec{0}^T & 1 \end{array} \right] * h \circ \vec{t_j^w}\\ &=h^{-1} \circ \left[ \begin{array}{c|c|c} fI & \vec{\beta}/2 & \vec{0} \\ \vec{0}^T & 1 & 0 \end{array} \right] * \left[ \begin{array}{c|c} R^T & R^T*(\vec{-c^w}) \\ \vec{0}^T & 1 \end{array} \right] * h \circ \vec{t_j^w} \end{align} \hspace{15pt}(*),$$

where $I$ is either the $(n-1)$- or $n$-dimensional identity matrix, depending on the context.

Bonus Question 1

Let known tag boundaries be $\vec{t_j^w} \in \mathbb{R}^n$ for $j \in {1,...,n}$, which are the boundaries of some hypercube which can be expressed as $\vec{t_j^w} = \lambda\vec{u_j^w}+\vec{v_j^w}$ where $\{\vec{u_j^w}\}$ are the boundaries of the unit hypercube of dimension $n$, and $\lambda>0$ and $\vec{v^w}\in \mathbb{R}^n$ is some translation vector away from the origin. Note that all tag boundaries must lie on the same hyperplane:

$$\begin{vmatrix} \vec{t_0^w}^T \\ \vdots \\ \vec{t_n^w}^T \\ \vec{1}^T \notag \end{vmatrix}=0.$$

For each $j \in {1,...,n}$, we have observed that the projected position $\vec{t_j^w} \in \mathbb{R}^n$ on the image plane is $\vec{t_j^i} \in \mathbb{R}^{n-1}$.

So we must solve the following system of equations for $\vec{c^w} \in \mathbb{R}^n$:

$$ \Omega(\vec{t_0^w})=\vec{t_0^i}\\ \vdots \\ \Omega(\vec{t_n^w})=\vec{t_n^i},$$

that is, we must solve $(*)$ for $\vec{c^w} \in \mathbb{R}^n$ in this system of $n \times (n-1)$ equations from $j \in{1,...,n}$.

(Note that this many equations would seem to be an overdetermined system. It's possible this is because the system should be solved for the rotation as well as the translation. See below for how this seems to work for the 3D case. In the 2D case, we must solve for the two translation dimensions $c_x^w, c_y^w$ and one rotation angle $\theta$, but we only have two equations - so I'm not sure if this system will solve the pose estimation problem generally.)

In the 2D case $(*)$ is: $$\begin{align} t_j^i&=h^{-1} \circ \left[ \begin{array}{ccc} f & w/2 & 0 \\ 0 & 1 & 0 \end{array} \right] * \left[ \begin{array}{ccc} \cos{\theta} & \sin{\theta}&-c_x^w\cos{\theta} -c_y^w\sin{\theta} \\ -\sin{\theta} & \cos{\theta}&c_x^w\sin{\theta} -c_y^w\cos{\theta} \\ 0 & 0 & 1 \end{array} \right] * h \circ \vec{t_j^w}\\ &=h^{-1} \circ \left[ \begin{array}{ccc} f\cos{\theta} + (w/2)\sin{\theta} & f\sin{\theta} + (w/2)\cos{\theta} & \alpha \\ -\sin{\theta} & \cos{\theta} & c_x^w\sin{\theta}-c_y^w\cos{\theta} \end{array} \right] * \left[ \begin{array}{c} {t_j^w}_x \\ {t_j^w}_y \\ 1 \end{array} \right]\\ &=h^{-1} \circ \left[ \begin{array}{ccc} f{t_j^w}_x\cos{\theta} + (w/2){t_j^w}_x\sin{\theta} + f{t_j^w}_y\sin{\theta} + (w/2){t_j^w}_y\cos{\theta} + \alpha \\ -{t_j^w}_x\sin{\theta} + {t_j^w}_y\cos{\theta} + c_x^w\sin{\theta}-c_y^w\cos{\theta} \end{array} \right],\end{align}$$

where $\alpha := -fc_x^w\cos{\theta}-fc_y^w\sin{\theta}+(w/2)c_x^w\sin{\theta}-(w/2)c_y^w\sin{\theta}$

So,

$$-{t_j^w}_xt_j^i\sin{\theta} + {t_j^w}_yt_j^i \cos{\theta}+ c_x^wt_j^i\sin{\theta}-c_y^wt_j^i\cos{\theta}\\ = f{t_j^w}_x\cos{\theta} + (w/2){t_j^w}_x\sin{\theta} + f{t_j^w}_y\sin{\theta} + (w/2){t_j^w}_y\cos{\theta} -fc_x^w\cos{\theta}-fc_y^w\sin{\theta}+(w/2)c_x^w\sin{\theta}-(w/2)c_y^w\sin{\theta}$$

Collecting like terms, and assume ${t_j^i}_x=0$,

$$Ac_x^w+Bc_y^w+C=0,$$

where:

  • $A:= t_j^i\sin{\theta} + f\cos{\theta} - (w/2)\sin{\theta}$
  • $B:= -t_j^i\cos{\theta} + f\sin{\theta} + (w/2)\sin{\theta}$
  • $C:= -{t_j^w}_y t_j^i\cos{\theta} - f {t_j^w}_y \sin{\theta} - (w/2){t_j^w}_y\sin{\theta}.$

From here it's straightforward to plug these values into a solver like numpy.linalg.solve in Python.

In the 3D case (*) is: $$\begin{align} \vec{t_j^i}&=h^{-1} \circ \left[ \begin{array}{ccc} f_x & 0 & w/2 & 0\\ 0 & f_y & h/2 & 0 \\ 0 & 0 & 1 & 0 \\ \end{array} \right] * \left[ \begin{array}{ccc} A & B & C & -Ac_x^w -Bc_y^w -Cc_z^w \\ D & E & F & -Dc_x^w -Ec_y^w -Fc_z^w \\ G & H & I & -Gc_x^w -Hc_y^w -Ic_z^w \\ 0 & 0 & 0 & 1 \end{array} \right] * h \circ \vec{t_j^w}\\ &=h^{-1} \circ \left[ \begin{array}{ccc} f_x & 0 & w/2 & 0\\ 0 & f_y & h/2 & 0 \\ 0 & 0 & 1 & 0 \\ \end{array} \right] * \left[ \begin{array}{ccc} A & B & C & -Ac_x^w -Bc_y^w -Cc_z^w \\ D & E & F & -Dc_x^w -Ec_y^w -Fc_z^w \\ G & H & I & -Gc_x^w -Hc_y^w -Ic_z^w \\ 0 & 0 & 0 & 1 \end{array} \right] * \left[ \begin{array}{ccc} {t_j}_x^w \\ {t_j}_y^w \\ {t_j}_z^w \\ 1 \end{array} \right]\\ &=h^{-1} \circ \left[ \begin{array}{ccc} Af_x+Gw/2 & Bf_x+Hw/2 & Cf_x+Iw/2 & J\\ Df_y+Gw/2 & Ef_y+Fw/2 & Hf_y+Iw/2 & L\\ G & H & I & -Gc_x^w -Hc_y^w -Ic_z^w \end{array} \right] * \left[ \begin{array}{ccc} {t_j}_x^w \\ {t_j}_y^w \\ {t_j}_z^w \\ 1 \end{array} \right]\\ &=h^{-1} \circ \left[ \begin{array}{ccc} Af_x{t_j}_x^w+Gw/2{t_j}_x^w + Bf_x{t_j}_y^w+Hw{t_j}_y^w/2 + Cf_x{t_j}_z^w+Iw{t_j}_z^w/2 + J\\ Df_y{t_j}_x^w+Gw{t_j}_x^w/2 + Ef_y{t_j}_y^w+Fw{t_j}_y^w/2 + Hf_y{t_j}_z^w+Iw{t_j}_z^w/2 + L \\ M \end{array} \right],\end{align}$$ where:

  • $J:=-f_xAc_x^w -f_xBc_y^w -f_xCc_z^w -wGc_x^w/2 -wHc_y^w/2 -wIc_z^w/2$
  • $L:=-f_yDc_x^w -f_yEc_y^w -f_yFc_z^w -wGc_x^w/2 -wHc_y^w/2 -wIc_z^w/2$
  • $M:=G{t_j}_x^w + H{t_j}_y^w + I{t_j}_z^w -Gc_x^w -Hc_y^w -Ic_z^w$.

So we have for each tag $t_j$, the following equations:

$$\begin{align} M*{t_j}_x^i &= Af_x{t_j}_x^w+Gw/2{t_j}_x^w + Bf_x{t_j}_y^w+Hw{t_j}_y^w/2 + Cf_x{t_j}_z^w+Iw{t_j}_z^w/2 + J\\ M*{t_j}_y^i &= Df_y{t_j}_x^w+Gw{t_j}_x^w/2 + Ef_y{t_j}_y^w+Fw{t_j}_y^w/2 + Hf_y{t_j}_z^w+Iw{t_j}_z^w/2 + L\\ \end{align}$$

Since in 3-space a tag is a 2-d square, there are three linearly independent points $t_1$, $t_2$, $t_3$, each with two equations (above). If we only solve for $c^w$ this will leave the system overdetermined. So in fact we can solve for the rotation and translation with these equations! Rotation can be described as three Euler angles: $\theta, \phi, \psi$, and each of the elements $A, \ldots, I$ can be written as some combination of sin and cosine of the Euler angles. This means we are solving six equations for six unknowns.

2
On

Let the normal of the image plane be

$\hat{n} =[\cos(\theta) , \sin(\theta), 0]^T $

Given the location of the pinhole $c = [c_x, c_y, 0]^T$ , we can construct a frame attached to the camera, and with this frame

$ P = c + t R p $

where $p = [ p_x, p_y, f]^T$ is the image coordinate vector with respect to this camera frame.

The matrix $R$ is given by

$R = \begin{bmatrix} 0 && -\sin \theta && \cos \theta \\ 0 && \cos \theta && \sin \theta \\ -1 && 0 && 0 \end{bmatrix}$

Following the derivation here

$p_i = \dfrac{1}{k^T R^T (P_i - c ) } f R^T (P_i - c) $

so that

$(k^T R^T (P_i - c)) p_i = f R^T (P_i - c) $

where $k = [0, 0, 1]^T $

We're only concerned with the $y$ coordinates, hence

$ ( k^T R^T (P_i - c) ) j^T p_i = f j^T R^T (P_i - c) $

Hence,

$ R_3^T (P_i - c) (b_i) = f R_2^T (P_i - c) $

Where $R_k$ is the $k$-th column of $R$. These are two linear equations in $c$, and given $\theta$ can be solved.

The linear system is

$ (f R_2^T - b_1 R_3^T) c = f R_2^T P_1 - R_3^T P_1 $

$ (f R_2^T - b_2 R_3^T) c = f R_2^T P_2 - R_3^T P_2 $

Using $P_1 = (0, 1, 0)$ and $P_2 = (0, -1, 0) $ and plugging in the components of $R_2$ and $R_3$, we get the linear system

$\begin{bmatrix} - f \sin(\theta) - b_1 \cos(\theta) && f \sin(\theta) - b_1 \sin(\theta) \\ - f \sin(\theta) - b_2 \cos(\theta ) && f \sin(\theta) - b_2 \sin(\theta) \end{bmatrix} \begin{bmatrix} c_x \\ c_y \end{bmatrix} = \begin{bmatrix} f \cos(\theta) - \sin(\theta) \\ -f\cos(\theta) + \sin(\theta) \end{bmatrix}$

As an explicit numerical example, suppose $\theta = 30^\circ$, and $ f = -1 $ (note that $f$ must be negative, see the figure attached to the question). Also let the image of $P_1 = (0, 1, 0) $ be $p_1 = (0, 0.5, 0)$ and the image of $P_2 = (0, -1, 0) $ be $p_2 = (0, -0.1, 0) $

Then $b_1 = 0.5$ and $b_2 = -0.1$

Using these values in the above linear system, we get

$\begin{bmatrix} \sin(\theta) - 0.5 \cos(\theta) && - \sin(\theta) - 0.5 \sin(\theta) \\ \sin(\theta) +0.1 \cos(\theta ) && - \sin(\theta) + 0.1 \sin(\theta) \end{bmatrix} \begin{bmatrix} c_x \\ c_y \end{bmatrix} = \begin{bmatrix} - \cos(\theta) - \sin(\theta) \\ \cos(\theta) + \sin(\theta) \end{bmatrix}$

Using the values of $\sin(30^\circ) = \dfrac{1}{2} $ and $\cos(\theta) = \dfrac{\sqrt{3}}{2} $, the solution is

$ ( c_x , c_y ) = ( 4 , 2.178633 )$