I take two photographs of the night sky, a few seconds to a few minutes apart. These images shall be denoted by image 1 and image 2. Due to the rotation of the Earth, the stars will have moved from one image to the other. I would like to determine the transformation from image 1 to image 2. Roughly speaking, I follow the pinhole camera model, assuming no lens distortion and that the image plane is perfectly perpendicular to the optical axis.
The idea is: let $x,y$ denote the pixel coordinates in image 1, and let $x',y'$ denote the coordinates in image 2. I want to determine the function $\mathbf{\hat{F}}$ that best approximates $(x',y') = \mathbf{F}(x,y;\mathbf{p})$. $\mathbf{F}$ follows the (geometric) camera and movement models (see Sections 1 and 2 below), with parameters $\mathbf{p}$; the approach to find $\mathbf{\hat{F}}$ is as follows:
Given $N$ points $(x_k,y_k)$ in image 1 where we have the corresponding points $(x'_k, y'_k)$ in image 2 ($1 \leqslant k \leqslant N$), we minimize the "error" function with respect to the parameters $\mathbf{p}$ (details in Section 3 below): $$e(\mathbf{p}) = \displaystyle \sum_{k=1}^N {\left| (x'_k, y'_k) ~-~ \mathbf{F}(x_k,y_k; \mathbf{p}) \right|}^2$$
An implementation in C++ of this method is not working. Even with the same points as source and transformed set (thus, the error function should be already zero and the parameters should be at the position of the minimum), the first gradient gives me near-zero, but by the third iteration, the gradient is all wacko and the procedure is clearly diverging.
My questions are:
- Are my models, formulation, geometry, assumptions, correct?
- Is there any ill-formed aspect in my formulation of the problem?
- Can you see any errors in the formulas, derivatives, etc.?
1. Key detail behind my approach and model
For images of the night sky, everything in the scene is effectively a "point at infinity", so instead of $x,y$ (both in the scene and the image), we can think in terms of angles $\alpha, \beta$, and disregard the scene's $z$-coordinate (or $x_3$-coordinate, in Wikipedia's notation): \begin{eqnarray} \def\fx{f_\mathrm{x}} \def\fy{f_\mathrm{y}} \tan{\alpha} &=& \frac{~x-x_0~}{\fx} \\ \tan{\beta} &=& \frac{~y-y_0~}{\fy} \end{eqnarray}
In these angular coordinates, any transformation (camera moved, or celestial sphere moved because the two images are taken a few seconds or minutes apart) is accurately modeled as a rotation followed by a translation (translation in the angular coordinates): \begin{eqnarray} \alpha_k' &=& \alpha_k \cos{\theta} - \beta_k \sin{\theta} + \delta_1 \label{eq:alpha-prime} \\ \beta_k' &=& \alpha_k \sin{\theta} + \beta_k \cos{\theta} + \delta_2 \label{eq:beta-prime} \end{eqnarray} where $\theta$ is the angle of rotation around the optical axis, and $\delta_1, \delta_2$ represent the translation (in angular coordinates).
2. Camera Model and Transformation
The parameters (denoted as vector $\mathbf{p}$ above) are $x_0, y_0, f_\mathrm{x}, f_\mathrm{y}, \theta, \delta_1, \delta_2$:
- $x_0, y_0$ is the point where the optical axis intercepts the image plane (in principle, it should be the center of the image, but due to imperfect camera geometry, I consider it variable)
- $f_\mathrm{x}, f_\mathrm{y}$ correspond to the focal distance (I allow a separate focal distance for each axis due to imprecise pixel and sensor geometry, possibly producing different scaling factors for each axis)
- $\theta, \delta_1, \delta_2$ represent the rotation and translation in the angular coordinates domain (as explained above, in Section 1)
The transformation, fully expanded, would be something like: \begin{eqnarray} \def\fx{f_\mathrm{x}} \def\fy{f_\mathrm{y}} x' &=& x_0 + \fx \tan{\alpha'} \nonumber \\ &=& x_0 + \fx \tan{\left(\alpha \cos{\theta} - \beta \sin{\theta} + \delta_1 \right)} \nonumber \\ &=& x_0 + \fx \tan{\left(\arctan{\left( \frac{~x-x_0~}{\fx} \right)} \cos{\theta} - \arctan{\left( \frac{~y-y_0~}{\fy} \right)} \sin{\theta} + \delta_1 \right)} \label{eq:x-expanded} \\ y' &=& y_0 + \fy \tan{\left(\arctan{\left( \frac{~x-x_0~}{\fx} \right)} \sin{\theta} + \arctan{\left( \frac{~y-y_0~}{\fy} \right)} \cos{\theta} + \delta_2 \right)} \label{eq:y-expanded} \end{eqnarray}
3. Formulating the Optimization Problem and Applying Gradient Descent
The error function to be minimized (with respect to the parameters) is: \begin{equation} \def\fx{f_\mathrm{x}} \def\fy{f_\mathrm{y}} e(x_0, y_0, \fx, \fy, \theta, \delta_1, \delta_2) ~=~ \sum_{k=1}^N {\left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right)}^2 + {\left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right)}^2 \end{equation} (for simplicity, I don't use the fully expanded transformation as above; instead, I leave it expressed in terms of $\alpha'_k$ and $\beta'_k$, and use chain rule to obtain the derivatives — details in Section 3.1).
The initial estimate for the gradient descent is:
- $(x_0,y_0) ~\leftarrow~$ center of the image
- $f_\mathrm{x}, ~f_\mathrm{y} ~\leftarrow~$ the focal distance of the lens (properly scaled to pixels, knowing that the sensor is approximately 30mm by 20mm)
- $\theta, \delta_1, \delta_2 ~\leftarrow~ 0$ (each of the three is given an initial value of 0; I do this for two images taken with the camera fixed (on a tripod) and taken just a few seconds apart, so the camera's rotation is nearly-zero).
3.1. Obtaining the Gradient
\begin{eqnarray} \def\fx{f_\mathrm{x}} \def\fy{f_\mathrm{y}} \newcommand{\Partial}[2]{\frac{\partial{#1}}{\partial{#2}}} \Partial{e}{x_0} &=& -2 \sum_{k=1}^N \left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right) \, \left( 1 + \fx \sec^2{\alpha_k'} \; \Partial{\alpha_{_k}'}{x_0} \right) \nonumber \\ &~& ~~~~~~~~~ + \left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right) \, \fy \, \sec^2{\beta_k'} \; \Partial{\beta_k'}{x_0} \\ &~& \nonumber \\ \Partial{e}{y_0} &=& -2 \sum_{k=1}^N \left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right) \, \fx \, \sec^2{\alpha_k'} \; \Partial{\alpha_{_k}'}{y_0} \nonumber \\ &~& ~~~~~~~~~ + \left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right) \, \left( 1 + \fy \sec^2{\beta_k'} \; \Partial{\beta_k'}{y_0} \right) \\ &~& \nonumber \\ \Partial{e}{\fx} &=& -2 \sum_{k=1}^N \left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right) \, \left( \tan{\alpha_k'} + \fx \, \sec^2{\alpha_k'} \; \Partial{\alpha_{_k}'}{\fx} \right) \nonumber \\ &~& ~~~~~~~~~ + \left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right) \, \fy \, \sec^2{\beta_k'} \; \Partial{\beta_k'}{\fx} \\ &~& \nonumber \\ \Partial{e}{\fy} &=& -2 \sum_{k=1}^N \left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right) \, \fx \, \sec^2{\alpha_k'} \; \Partial{\alpha_{_k}'}{\fy} \nonumber \\ &~& ~~~~~~~~~ + \left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right) \, \left( \tan{\beta_k'} + \fy \, \sec^2{\beta_k'} \; \Partial{\beta_k'}{\fy} \right) \\ &~& \nonumber \\ \Partial{e}{\theta} &=& -2 \sum_{k=1}^N \left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right) \, \fx \, \sec^2{\alpha_k'} \; \Partial{\alpha_{_k}'}{\theta} \nonumber \\ &~& ~~~~~~~~~ + \left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right) \, \fy \, \sec^2{\beta_k'} \; \Partial{\beta_k'}{\theta} \\ &~& \nonumber \\ \Partial{e}{\delta_1} &=& -2 \fx \sum_{k=1}^N \left({x'_k - \left({x_0 + \fx \tan{\alpha_k'}}\right)}\right) \, \sec^2{\alpha_k'} \\ &~& \nonumber \\ \Partial{e}{\delta_2} &=& -2 \fy \sum_{k=1}^N \left({y'_k - \left({y_0 + \fy \tan{\beta_k'}}\right)}\right) \, \sec^2{\beta_k'} \end{eqnarray} where: \begin{align} \def\fx{f_\mathrm{x}} \def\fy{f_\mathrm{y}} \newcommand{\Partial}[2]{\frac{\partial{#1}}{\partial{#2}}} \def\PartialAlphaXzero{\frac{-1}{\fx \left(1 + \left(\frac{~x_k-x_0~}{\fx}\right)^2\right)}} \def\PartialAlphaFx{\frac{1}{1 + \left(\frac{~x_k-x_0~}{\fx}\right)^2} \,\cdot\, \frac{x_0 - x_k}{\fx^2}} \def\PartialBetaYzero{\frac{-1}{\fy \left(1 + \left(\frac{~y_k-y_0~}{\fy}\right)^2\right)}} \def\PartialBetaFy{\frac{1}{1 + \left(\frac{~y_k-y_0~}{\fy}\right)^2} \,\cdot\, \frac{y_0 - y_k}{\fy^2}} \def\PartialAlphaprimeXzero{\cos{\theta} \,\cdot\, \Partial{\alpha_{_k}}{x_0}} \def\PartialAlphaprimeYzero{-\sin{\theta} \,\cdot\, \Partial{\beta_k}{y_0}} \def\PartialAlphaprimeFx{\cos{\theta} \,\cdot\, \Partial{\alpha_{_k}}{\fx}} \def\PartialAlphaprimeFy{-\sin{\theta} \,\cdot\, \Partial{\beta_k}{\fy}} \def\PartialBetaprimeXzero{\sin{\theta} \,\cdot\, \Partial{\alpha_{_k}}{x_0}} \def\PartialBetaprimeYzero{\cos{\theta} \,\cdot\, \Partial{\beta_k}{y_0}} \def\PartialBetaprimeFx{\sin{\theta} \,\cdot\, \Partial{\alpha_{_k}}{\fx}} \def\PartialBetaprimeFy{\cos{\theta} \,\cdot\, \Partial{\beta_k}{\fy}} \Partial{\alpha_{_k}'}{x_0} &= \PartialAlphaprimeXzero & \Partial{\alpha_{_k}'}{y_0} &= \PartialAlphaprimeYzero \\ \Partial{\alpha_{_k}'}{\fx} &= \PartialAlphaprimeFx & \Partial{\alpha_{_k}'}{\fy} &= \PartialAlphaprimeFy \\ \Partial{\alpha_{_k}'}{\theta} &= -(\alpha_k \sin{\theta} + \beta_k \cos{\theta}) & \Partial{\alpha_{_k}'}{\delta_1} &= 1 ~,~~~~ \Partial{\alpha_{_k}'}{\delta_2} ~=~ 0 \end{align} \begin{align} \Partial{\beta_k'}{x_0} &= \PartialBetaprimeXzero & \Partial{\beta_k'}{y_0} &= \PartialBetaprimeYzero \\ \Partial{\beta_k'}{\fx} &= \PartialBetaprimeFx & \Partial{\beta_k'}{\fy} &= \PartialBetaprimeFy \\ \Partial{\beta_k'}{\theta} &= \alpha_k \cos{\theta} - \beta_k \sin{\theta} & \Partial{\beta_k'}{\delta_1} &= 0 ~,~~~~ \Partial{\beta_k'}{\delta_2} ~=~ 1 \end{align} where: \begin{eqnarray} \Partial{\alpha_{_k}}{x_0} &=& \PartialAlphaXzero \\ \Partial{\alpha_{_k}}{\fx} &=& \PartialAlphaFx ~=~ \Partial{\alpha_{_k}}{x_0} \,\cdot\, \frac{x_k - x_0}{\fx} \\ \Partial{\alpha_{_k}}{y_0} &=& \Partial{\alpha_{_k}}{\fy} ~=~ \Partial{\alpha_{_k}}{\theta} ~=~ \Partial{\alpha_{_k}}{\delta_1} ~=~ \Partial{\alpha_{_k}}{\delta_2} = 0 \end{eqnarray} \begin{eqnarray} \Partial{\beta_k}{y_0} &=& \PartialBetaYzero \\ \Partial{\beta_k}{\fy} &=& \PartialBetaFy ~=~ \Partial{\beta_k}{y_0} \,\cdot\, \frac{y_k - y_0}{\fy} \\ \Partial{\beta_k}{x_0} &=& \Partial{\beta_k}{\fx} ~=~ \Partial{\beta_k}{\theta} ~=~ \Partial{\beta_k}{\delta_1} ~=~ \Partial{\beta_k}{\delta_2} = 0 \end{eqnarray}