Suppose I am given a point $P(x, y, z)$ where $x, y, z$ are unknown. I take two images (perspective projections) of this point using a simple pinhole camera with an unknown focal length $f$, from two different known locations and orientations. How can I find the coordinates of $P$?
My attempt: Let the positions of the camera's pinhole be $A, B$ and let the camera's orientations be given by the rotation matrices $R_A, R_B$. Then
$P = A + t_A R_A Q_A $ and $P = B + t_B R_B Q_B$
where $t_A, t_B$ are unknown scale factors and $Q_A, Q_B$ are the coordinates of the image point expressed in the camera frames $R_A, R_B$. They have the form
$Q_A = (x_1, y_1, z_0), Q_B = (x_2, y_2, z_0) $
where $(x_1, y_1), (x_2, y_2) $ are the coordinates of the image point on the plane of the image, and are known, and $z_0$ is the unknown focal length of the camera.
Subtracting the second of these equations from the first gives
$ 0 = A - B + t_A R_A Q_A - t_B R_B Q_B $
which is a quadratic system of three equations (one per component of the vector equation) in the three unknowns $t_A, t_B, z_0$; the quadratic terms are the products $t_A z_0$ and $t_B z_0$.
Solving this system numerically gives $t_A, t_B, z_0$, and substituting $t_A$ and $z_0$ into the first equation gives the coordinates of $P$.
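As a rough sketch of how this system might be solved, here is one possible approach in plain Python. It is not a general-purpose nonlinear solver but an elimination: the vector equation says that $B - A$ lies in the plane spanned by $R_A Q_A$ and $R_B Q_B$, so the scalar triple product of those three vectors must vanish, which gives a single quadratic in $z_0$. For each real root, $t_A$ and $t_B$ then follow from a linear least-squares fit. The function names (`focal_candidates`, `triangulate`) and the tolerance `eps` are my own choices for illustration, and matrices are assumed to be stored as lists of rows.

```python
import math

def matvec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))

def focal_candidates(A, B, RA, RB, xyA, xyB, eps=1e-12):
    """Real roots z0 of det[B - A, RA*QA, RB*QB] = 0 (a quadratic in z0)."""
    D = [B[i] - A[i] for i in range(3)]
    a = matvec(RA, [xyA[0], xyA[1], 0.0])   # part of RA*QA independent of z0
    b = matvec(RB, [xyB[0], xyB[1], 0.0])   # part of RB*QB independent of z0
    ra = [RA[i][2] for i in range(3)]       # third column of RA
    rb = [RB[i][2] for i in range(3)]       # third column of RB
    c0 = triple(D, a, b)
    c1 = triple(D, a, rb) + triple(D, ra, b)
    c2 = triple(D, ra, rb)
    if abs(c2) < eps:                       # quadratic degenerates to linear
        return [-c0 / c1] if abs(c1) > eps else []
    disc = c1 * c1 - 4.0 * c2 * c0
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-c1 + s) / (2.0 * c2), (-c1 - s) / (2.0 * c2)]

def triangulate(A, B, RA, RB, xyA, xyB, eps=1e-12):
    """Return candidate solutions as tuples (P, tA, tB, z0)."""
    D = [B[i] - A[i] for i in range(3)]
    out = []
    for z0 in focal_candidates(A, B, RA, RB, xyA, xyB, eps):
        dA = matvec(RA, [xyA[0], xyA[1], z0])
        dB = matvec(RB, [xyB[0], xyB[1], z0])
        aa, ab, bb = dot(dA, dA), dot(dA, dB), dot(dB, dB)
        det = aa * bb - ab * ab
        if abs(det) < eps:                  # rays are parallel, no unique fit
            continue
        # Least-squares solution of tA*dA - tB*dB = B - A
        tA = (dot(dA, D) * bb - ab * dot(dB, D)) / det
        tB = (ab * dot(dA, D) - aa * dot(dB, D)) / det
        P = [A[i] + tA * dA[i] for i in range(3)]
        out.append((P, tA, tB, z0))
    return out
```

The quadratic can have two real roots, so `triangulate` returns every candidate rather than picking one. Choosing among them needs extra information, for example the sign convention of the camera model or a second observed point. With noisy measurements, the two rays for a given root will generally not meet exactly.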
I assume you know the projected locations of the point in the two images; call these $A$ and $B$, and let the camera be at point $C_1$ for the first image and at point $C_2$ for the second. Then the given point lies at the intersection of the two (infinite) lines $C_1A$ and $C_2B$.
If you’re working with floating point arithmetic, it’s quite likely that the two lines won’t intersect exactly, so you should instead compute the mid-point of the shortest segment joining them.
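The mid-point computation can be sketched as follows, assuming each line is given by a point on it and a direction vector (for instance a camera position and the direction towards its image point). The function name `midpoint_between_lines` and the parallel-line tolerance are illustrative choices, not part of the original answer. It minimises the squared distance between the points $P_1 + s\,d_1$ and $P_2 + t\,d_2$ over $s$ and $t$.

```python
def midpoint_between_lines(p1, d1, p2, d2, eps=1e-12):
    """Mid-point of the shortest segment joining lines p1 + s*d1 and p2 + t*d2."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p1[i] - p2[i] for i in range(3)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b          # zero when the lines are parallel
    if abs(denom) < eps:
        raise ValueError("lines are (nearly) parallel")
    s = (b * e - c * d) / denom    # parameter of the closest point on line 1
    t = (a * e - b * d) / denom    # parameter of the closest point on line 2
    x1 = [p1[i] + s * d1[i] for i in range(3)]
    x2 = [p2[i] + t * d2[i] for i in range(3)]
    return [(x1[i] + x2[i]) / 2.0 for i in range(3)]
```

When the two lines do intersect, the two closest points coincide and the function returns the intersection itself, so the same code covers both the exact and the noisy case.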