Note: The superscript notation used refers to the frame of reference. There are three frames of reference:
- $w$, the world frame (in Euclidean 2-space),
- $c$, the camera frame (in Euclidean 2-space), and
- $i$, the image frame (in pixels).
Suppose we have:
- a 1-dimensional "tag", or "fiducial marker", $t$, defined by its boundaries, $t:=((0,-1)^w,(0,1)^w)$, that is, the segment between $(0,-1)^w$ and $(0,1)^w$,
- a "camera" at $c^w∈ℝ^2$, with focal length $f^i>0$ and an image plane of width $w^i>0$ (whose center, $d^i:=w^i/2$, is $f^i$ units from $c$).
The optical axis is the line through the points $c$ and $d$. Let's define $q$ as the point where the optical axis intersects the world-frame $y$ axis.
Define $b^i$ and $c^i$ as the tag's boundaries projected onto the image plane; that is, they are scalars in $[0,w^i]$ giving a distance along the image plane (note that $c^i$ is distinct from the camera position $c^w$), such that:
- $b^i$ is where the line from $(0,1)^w$ to $c^w$ intersects the image plane and
- $c^i$ is where the line from $(0,-1)^w$ to $c^w$ intersects the image plane.
Let's assume the tag is in view of the camera, that is, assume that $b^i,c^i∈[0,w]$.
Finally, let $R$ be a $2 \times 2$ rotation matrix such that the line with points $R[0,1]$, $R[0,-1]$ is perpendicular to the optical axis.
Note: $f^i$, $w^i$, $b^i$, $c^i$, and $d^i$ are expressed in pixels, not necessarily in the same unit scale as the x-y plane of the world frame and camera frame.
Problem: Given $f^i>0$, $w^i>0$, $b^i∈[0,w]$, $c^i∈[0,w]$, and $R∈ℝ^{2 \times 2}$, find:
- Camera position $c^w∈ℝ^2$ and
- A function mapping any point in world space into the camera image plane: $$Ω:ℝ^2 → [0,w] : p^w → p^i$$
Bonus points:
- Generalize to any tag position $t$.
- Generalize to $n$ dimensions.

First, it's convenient to have a one-dimensional frame of reference originating from the center of the image plane $x^c = f$, rather than from the edge of it. So, with $d^i:=w^i/2$ as above, for all $p^i∈ℝ$ define $p_0^i:=p^i-d^i$ and $Ω_0(p^w):=Ω(p^w)-d^i$.
From the proof in "Moving a pinhole camera" we know that for any point $p^w$ in world coordinates (with a nonzero denominator),
$Ω_0(p^w) = \dfrac{(0, 1) f R^T (p^w - c^w)} {(1, 0) R^T (p^w - c^w)}$.
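As a rough illustration of the formula above, here is a minimal Python sketch. The function names `rotation`, `omega0`, and `omega` are made up for this example, and $R$ is stored as nested tuples `((i, j), (k, l))`; it assumes the first column of $R$ points along the optical axis, as the formula implies.

```python
import math

def rotation(theta):
    """Return the 2x2 rotation matrix ((i, j), (k, l)) for angle theta in radians."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))

def omega0(p, c, R, f):
    """Centered projection of world point p for a camera at c with rotation R."""
    (i, j), (k, l) = R
    dx, dy = p[0] - c[0], p[1] - c[1]
    depth = i * dx + k * dy    # (1, 0) R^T (p - c): coordinate along the optical axis
    lateral = j * dx + l * dy  # (0, 1) R^T (p - c): coordinate parallel to the image plane
    if depth == 0:
        raise ValueError("point lies in the plane through the camera center")
    return f * lateral / depth

def omega(p, c, R, f, w):
    """Projection measured from the edge of the image plane: Omega = Omega_0 + w/2."""
    return omega0(p, c, R, f) + w / 2

# Camera 5 units to the left of the tag, looking along +x, f = 100 px, w = 640 px.
R = rotation(0.0)
print(omega((0, 1), (-5, 0), R, 100, 640))   # 340.0
print(omega((0, -1), (-5, 0), R, 100, 640))  # 300.0
```

Note that this sketch does not check that the point is in front of the camera or that the result falls inside $[0,w]$; it only evaluates the formula.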
Now that we have solved the second half of the problem, let's go back and solve the first half, that is, let's find $c^w$.
Let $R = \begin{bmatrix} i & j\\ k & l \end{bmatrix}$. Since $R$ is a rotation matrix, $R^{-1}=R^T= \begin{bmatrix} i & k\\ j & l \end{bmatrix}$.
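Plugging in $Ω_0((0,1)) = b_0^i$ and $Ω_0((0,-1)) = c_0^i$, we can solve for $c^w = (c_x^w, c_y^w)$. Since $(1,0)R^T = (i,k)$ and $(0,1)R^T = (j,l)$,
$b_0^i = \dfrac{f\,(-j\,c_x^w + l\,(1 - c_y^w))}{-i\,c_x^w + k\,(1 - c_y^w)}$ and $c_0^i = \dfrac{f\,(-j\,c_x^w - l\,(1 + c_y^w))}{-i\,c_x^w - k\,(1 + c_y^w)}$.
So, multiplying through by the denominators,
$b_0^i\,(-i\,c_x^w + k\,(1 - c_y^w)) = f\,(-j\,c_x^w + l\,(1 - c_y^w))$ and $c_0^i\,(-i\,c_x^w - k\,(1 + c_y^w)) = f\,(-j\,c_x^w - l\,(1 + c_y^w))$.
So, collecting the terms in $c_x^w$ and $c_y^w$,
$(f j - i\,b_0^i)\,c_x^w + (f l - k\,b_0^i)\,c_y^w = f l - k\,b_0^i$ and
$(f j - i\,c_0^i)\,c_x^w + (f l - k\,c_0^i)\,c_y^w = -(f l - k\,c_0^i)$.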
This is a system of linear equations which can be solved for $c_x^w$ and $c_y^w$ by standard methods.
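To make the last step concrete: multiplying out the projection formula for the two tag endpoints $(0,1)^w$ and $(0,-1)^w$ gives two equations that are linear in $c_x^w$ and $c_y^w$, and the sketch below solves them with Cramer's rule. The function name `solve_camera` and the tuple layout `((i, j), (k, l))` for $R$ are choices made for this example only; `b` and `c_img` stand for $b^i$ and $c^i$.

```python
def solve_camera(b, c_img, f, w, R):
    """Recover the camera position (c_x, c_y) from the projected tag endpoints."""
    (i, j), (k, l) = R
    b0 = b - w / 2      # b_0 = b - d, measured from the image center
    c0 = c_img - w / 2  # c_0 = c - d

    # Row 1 comes from Omega_0((0, 1)) = b_0:
    #   (f j - i b0) c_x + (f l - k b0) c_y = f l - k b0
    a11, a12, r1 = f * j - i * b0, f * l - k * b0, f * l - k * b0
    # Row 2 comes from Omega_0((0, -1)) = c_0:
    #   (f j - i c0) c_x + (f l - k c0) c_y = -(f l - k c0)
    a21, a22, r2 = f * j - i * c0, f * l - k * c0, -(f * l - k * c0)

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("degenerate configuration: the system is singular")
    cx = (r1 * a22 - a12 * r2) / det
    cy = (a11 * r2 - r1 * a21) / det
    return cx, cy

# Identity rotation, f = 100 px, w = 640 px, endpoints seen at 340 px and 300 px.
print(solve_camera(340, 300, 100, 640, ((1, 0), (0, 1))))  # approximately (-5, 0)
```

The system becomes singular when $b^i = c^i$, that is, when both endpoints project to the same pixel, which is why the determinant is checked before dividing.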
QED.
Bonus: