Correct non-nadir view for calculation of the ground sampling distance (GSD)


I am working on a UAV project that requires calculating the ground sampling distance (GSD) in order to retrieve the meter/pixel scale. For a nadir view (camera looking straight down at the ground), the GSD formula is

$$\text{GSD} = \frac{\text{flight altitude} \times \text{sensor height}}{\text{focal length} \times \text{image height}}$$

(with the sensor and image widths used in place of the heights for the other image axis). I have read in several articles, like this one, that if the camera is tilted about one axis, the following correction is required:

$$\mbox{GSD}_{\text{Rate}} = \frac{1}{\cos(\theta + \varphi)}$$

where $\theta$ is the tilt angle and $\varphi$ is, as the article puts it:

$\varphi$ describes the angular position of the pixel in the image: it is zero in correspondence of the optical axis of the camera, while it can have positive or negative values for the other pixels

The figure in their article is the following:

With that background, I have two questions:

  1. How exactly do I calculate the angular position of a given pixel with respect to the optical axis (that is, how do I calculate $\varphi$)?

  2. In my case the camera is rotated about two axes, not just one as in their example: it does not look straight along the road but is turned toward one of the sides, more like this one:

So, does the formula need further changes? I am not sure how to derive the right formula geometrically.

Edit 1. The prior information is as follows:

  • I have the camera matrix for the intrinsic parameters
  • The rotations are applied in this order: tilt, then pan
  • Finally, the altitude of the camera is known as well
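
For reference, the nadir formula and the single-axis correction quoted above can be sketched in a few lines. This is only a minimal illustration: the function names and the numeric values are hypothetical, and treating $\text{GSD}_{\text{Rate}}$ as a multiplicative factor on the nadir GSD is an assumption about how the article intends it to be used.

```python
import math

def nadir_gsd(altitude_m, sensor_height_mm, focal_length_mm, image_height_px):
    """Nadir GSD in metres per pixel (sensor size and focal length share one unit)."""
    return (altitude_m * sensor_height_mm) / (focal_length_mm * image_height_px)

def single_axis_gsd_rate(theta, phi):
    """Correction 1 / cos(theta + phi) for a camera tilted about one axis (radians)."""
    return 1.0 / math.cos(theta + phi)

# Hypothetical values: 100 m altitude, 8.8 mm sensor height,
# 8.8 mm focal length, 3648-pixel image height, 30 degree tilt.
gsd = nadir_gsd(100.0, 8.8, 8.8, 3648)
# Assumption: the rate scales the nadir GSD; phi = 0 is the pixel on the optical axis.
corrected = gsd * single_axis_gsd_rate(math.radians(30.0), 0.0)
print(gsd, corrected)
```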

On BEST ANSWER

Question 1:

  1. Assume your screen coordinate system is centered at one of the corners of the screen field and the axes are aligned with the two perpendicular edges of the screen field meeting at that corner.

  2. Assume you know the position of the orthogonal projection $C$ of the focal point $F$ of the camera onto the screen (for example, it may be the center of the camera's rectangular field, as it appears to be in the picture attached to your post). Let $C$ have coordinates $(c_1, \, c_2)$ in pixel units.

  3. Assume each pixel is a square of edge-length $\text{px}$.

  4. Assume you know the focal distance $f$ between the focal point $F$ of the camera and the screen of the camera, i.e. if $F$ is the focal point, then you know $\text{dist}(F, \, C) = f$.

  5. Assume you are given a pixel $P$ on the screen with pixel coordinates $(x_{px}, \, y_{px})$

Then the angle $\phi$ between the ray $FP$ through the pixel $P$ and the camera's optical axis $FC$ satisfies

$$\tan(\phi) \, = \, \text{px} \, \frac{ \sqrt{\,(x_{px} - c_1)^2 + (y_{px} - c_2)^2\,}}{f} $$

$$\phi = \arctan\left(\text{px} \, \frac{ \sqrt{\,(x_{px} - c_1)^2 + (y_{px} - c_2)^2\,}}{f}\right) $$

Also, one probably needs $\cos(\phi)$ and $\sin(\phi)$ rather than $\phi$ itself, so

$$\cos(\phi) \, = \, \frac{f}{ \sqrt{\,\text{px}^2 \left((x_{px} - c_1)^2 + (y_{px} - c_2)^2\right) + f^2\,}} $$

$$\sin(\phi) \, = \, \text{px} \, \frac{\sqrt{\,(x_{px} - c_1)^2 + (y_{px} - c_2)^2\,}}{ \sqrt{\,\text{px}^2 \left((x_{px} - c_1)^2 + (y_{px} - c_2)^2\right) + f^2\,}} $$
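
As a rough sketch of the computation in question 1, the function below evaluates $\phi$ from $\tan(\phi) = \text{px}\,|PC| / f$, with $|PC|$ measured in pixels. The function name and the sample numbers are hypothetical, and the sketch assumes square pixels and no lens distortion. If you work from a camera matrix instead, the ratio $f/\text{px}$ would typically correspond to the focal length in pixels and $(c_1, c_2)$ to the principal point.

```python
import math

def pixel_angle(x_px, y_px, c1, c2, f, px):
    """Angle phi (radians) between the ray through pixel P and the optical axis FC.

    (c1, c2) is the principal point C in pixels; f (focal distance) and
    px (pixel edge length) must be given in the same metric unit.
    """
    r = px * math.hypot(x_px - c1, y_px - c2)  # metric distance |PC| on the screen
    return math.atan2(r, f)                    # tan(phi) = r / f

# Hypothetical camera: C = (960, 540), f = 4 mm, pixel size 0.002 mm.
phi = pixel_angle(1460, 540, 960, 540, 4.0, 0.002)
print(phi, math.cos(phi), math.sin(phi))
```

Working with `atan2` and then taking `cos` and `sin` of the result avoids writing out the two closed-form expressions separately, at the cost of a couple of extra function calls.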

Question 2:

Yes, there will be significant changes. In the diagram with the car and the drone, the vertical axis $H$, the camera's optical axis, and the line connecting the drone with the car are coplanar (all three lie in the same plane). It is then easy to calculate the angle between the vertical axis $H$ and the line to the car: it is the sum of the angle $\theta$ between $H$ and the camera axis and the angle $\phi$ between the camera axis and the line to the car. In general, however, the three lines are not coplanar.

Let $\psi$ be the angle between (i) the plane formed by the vertical axis $H$ and the camera's optical axis $FC$, and (ii) the plane formed by the camera's optical axis $FC$ and the line connecting the drone with the car. Then the angle $\sigma$ between the vertical axis $H$ and the line from the drone to the car is given by the spherical law of cosines

$$\cos(\sigma) = \cos(\theta) \cos(\phi) + \sin(\theta) \sin(\phi)\cos(\psi)$$

and then

$$\text{GSD}_{\text{rate}} = \frac{1}{\cos(\sigma)} = \frac{1}{\cos(\theta) \cos(\phi) + \sin(\theta) \sin(\phi)\cos(\psi)}$$

In the simplified case of the diagram, the three lines are coplanar and the vertical axis and the car lie on opposite sides of the optical axis, which corresponds to $\psi = \pi$. Since $\cos(\pi) = -1$,

$$\cos(\theta) \cos(\phi) + \sin(\theta) \sin(\phi)\cos(\pi) = \cos(\theta) \cos(\phi) - \sin(\theta) \sin(\phi) = \cos(\theta + \phi)$$

Thus, you recover the original simplified formula.

The angle $\psi$ can be calculated from the image on the screen, in a manner very similar to the answer to question 1, as long as we know the pixel coordinates $(x_{\text{vert}}, \, y_{\text{vert}})$ of the point $Q$ at which the vertical axis $H$ intersects the plane of the screen. Then, by the Euclidean law of cosines applied to the triangle $PQC$,

$$|PQ|^2 = |PC|^2 + |QC|^2 - 2\, |PC| |QC| \cos(\psi)$$

so

$$\cos(\psi) = \frac{\,|PC|^2 \, + \, |QC|^2 \, - \, |PQ|^2\,}{2 \, |PC| |QC|}$$

or more explicitly

$$\cos(\psi) = \frac{\,(x_{\text{px}} - c_1)^2 + (y_{\text{px}} - c_2)^2 \, + \, (x_{\text{vert}} - c_1)^2 + (y_{\text{vert}} - c_2)^2\, - \, (x_{\text{px}} - x_{\text{vert}})^2 - (y_{\text{px}} - y_{\text{vert}})^2\,}{2 \, \sqrt{(x_{\text{px}} - c_1)^2 + (y_{\text{px}} - c_2)^2\, } \, \sqrt{(x_{\text{vert}} - c_1)^2 + (y_{\text{vert}} - c_2)^2}}$$

Alternatively, you can use the dot product formula

$$\cos(\psi) = \frac{(x_{\text{px}} - c_1)(x_{\text{vert}} - c_1) + (y_{\text{px}} - c_2)(y_{\text{vert}} - c_2)}{ \sqrt{(x_{\text{px}} - c_1)^2 + (y_{\text{px}} - c_2)^2\, } \, \sqrt{(x_{\text{vert}} - c_1)^2 + (y_{\text{vert}} - c_2)^2}}$$
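
The two ingredients above, the dot product formula for $\cos(\psi)$ and the spherical law of cosines for $\sigma$, can be sketched as follows. The function names and sample points are hypothetical. The sketch assumes $P \neq C$ and $Q \neq C$ (that is, $\theta \neq 0$), since $\psi$ is undefined otherwise.

```python
import math

def cos_psi(p, q, c):
    """Cosine of the angle at C between the screen vectors C->P and C->Q.

    p, q, c are (x, y) pixel coordinates; requires P != C and Q != C.
    """
    px_, py_ = p[0] - c[0], p[1] - c[1]
    qx, qy = q[0] - c[0], q[1] - c[1]
    return (px_ * qx + py_ * qy) / (math.hypot(px_, py_) * math.hypot(qx, qy))

def gsd_rate(theta, phi, cos_psi_value):
    """1 / cos(sigma), with cos(sigma) from the spherical law of cosines."""
    cos_sigma = (math.cos(theta) * math.cos(phi)
                 + math.sin(theta) * math.sin(phi) * cos_psi_value)
    return 1.0 / cos_sigma

# Coplanar check: P and Q on opposite sides of C gives psi = pi,
# so the rate should reduce to 1 / cos(theta + phi).
cp = cos_psi((0.0, 5.0), (0.0, -10.0), (0.0, 0.0))
rate = gsd_rate(0.5, 0.2, cp)
print(cp, rate)
```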

Edit 1. How to calculate the coordinates $(x_{\text{vert}}, \, y_{\text{vert}})$. Assume you can determine two points $(x_1, \, y_1)$ and $(x_2, \, y_2)$ on the screen that lie on the projection of an edge or an axis which is perpendicular to the ground in 3D. In the picture, for example, the grey pole in the middle could be one such object (or it could be the vertical edge of a building or something like that). The projections of all vertical lines pass through $Q$, so the direction of such an edge coincides with the direction from $C$ to $Q$ exactly when its projection passes through $C$, and only approximately otherwise. Construct the unit vector $(u_{\text{vert}}, \, v_{\text{vert}})$, where

\begin{align} u_{\text{vert}} \, &=\, \frac{x_2 \, -\, x_1}{\sqrt{(x_2-x_1)^2 + (y_2 - y_1)^2}}\\ v_{\text{vert}} \, &=\, \frac{y_2 \, -\, y_1}{\sqrt{(x_2-x_1)^2 + (y_2 - y_1)^2}} \end{align}

Then, with the focal distance converted to pixel units by dividing by the pixel size $\text{px}$,

\begin{align} x_{\text{vert}} \, &=\, c_1 \, +\, \frac{f}{\text{px}}\,\tan(\theta)\, u_{\text{vert}}\\ y_{\text{vert}} \, &=\, c_2 \, +\, \frac{f}{\text{px}}\, \tan(\theta)\, v_{\text{vert}}\\ \end{align}
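
A minimal sketch of this construction follows. The function name and the sample values are hypothetical. One assumption is made explicit here: the focal distance $f$ is divided by the pixel size $\text{px}$ so that the offset from $C$ comes out in pixel units, matching the pixel coordinates $(c_1, c_2)$. The order of the two points fixes the sign of the direction, so they should be ordered to point from $C$ toward $Q$.

```python
import math

def vertical_point_from_edge(p1, p2, c, f, px, theta):
    """Estimate Q = (x_vert, y_vert) from two screen points on a vertical object.

    p1, p2, c are (x, y) in pixels; f and px share one metric unit;
    theta is the tilt angle in radians.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    norm = math.hypot(dx, dy)
    u, v = dx / norm, dy / norm            # unit vector (u_vert, v_vert)
    d = (f / px) * math.tan(theta)         # |QC| in pixels (assumed unit conversion)
    return c[0] + d * u, c[1] + d * v

# Hypothetical values: a pole seen as a vertical segment, 45 degree tilt,
# f = 4 mm, pixel size 0.002 mm, C = (960, 540).
x_vert, y_vert = vertical_point_from_edge((100, 500), (100, 100), (960, 540),
                                          4.0, 0.002, math.pi / 4)
print(x_vert, y_vert)
```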

Edit 2.

  1. Assume the camera is initially aligned with the vertical axis $H$.

  2. Assume, in order to describe the motion of the camera more easily, that we translate the camera's coordinate system to the projected focal center $C$, so that the world coordinate axes $x$ and $y$ are exactly aligned with the camera's coordinate axes $x$ and $y$.

  3. Assume that the camera is first tilted by the angle $\theta$ and then rotated (around the vertical axis $H$) by the angle $\lambda$, which looks like the case in the photo. During the $\theta$-tilt, the camera's $y$-axis is rotated in 3D space, but it always intersects the vertical axis $H$. When the $\lambda$-rotation around $H$ then takes place, the camera's $y$-axis is rotated again, but its intersection point with the $H$ axis stays fixed (because every point on the $H$ axis stays fixed during a rotation around $H$). That intersection point is $(x_{\text{vert}}, \, y_{\text{vert}})$, so it lies on the $y$-axis of the camera's coordinate system centered at $C$. Consequently, with $f$ converted to pixel units by dividing by the pixel size $\text{px}$, \begin{align} x_{\text{vert}} \, &=\, c_1 \\ y_{\text{vert}} \, &=\, c_2 \, - \, \frac{f}{\text{px}}\,\tan(\theta)\\ \end{align}

In this case, the formula for $\cos(\psi)$ simplifies to

$$\cos(\psi) = \frac{ c_2 - y_{\text{px}} }{ \sqrt{(x_{\text{px}} - c_1)^2 + (y_{\text{px}} - c_2)^2\, } }$$
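
Putting the pieces together for the tilt-then-pan case, a sketch of the per-pixel rate might look like this. The function name and the sample camera values are hypothetical, and square pixels with no lens distortion are assumed. At the pixel $C$ itself $\psi$ is undefined, but $\sin(\phi) = 0$ there, so the term containing $\cos(\psi)$ vanishes.

```python
import math

def gsd_rate_tilt_pan(x_px, y_px, c1, c2, f, px, theta):
    """GSD rate 1 / cos(sigma) for a pixel, camera tilted by theta and then panned.

    f and px are in the same metric unit; theta is in radians.
    """
    dx, dy = x_px - c1, y_px - c2
    r = math.hypot(dx, dy)                    # |PC| in pixels
    phi = math.atan2(px * r, f)               # angle to the optical axis
    # Simplified cos(psi) = (c2 - y_px) / |PC|; any value works at C since sin(phi) = 0.
    cos_psi = (c2 - y_px) / r if r > 0 else 0.0
    cos_sigma = (math.cos(theta) * math.cos(phi)
                 + math.sin(theta) * math.sin(phi) * cos_psi)
    return 1.0 / cos_sigma

# Hypothetical camera: C = (960, 540), f = 4 mm, pixel size 0.002 mm, 30 degree tilt.
theta = math.radians(30.0)
rate = gsd_rate_tilt_pan(1200, 300, 960, 540, 4.0, 0.002, theta)
print(rate)
```

Note that the pan angle $\lambda$ does not appear: under the assumptions of Edit 2, a rotation around $H$ leaves the angle between any pixel ray and the vertical unchanged.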

Comment. I am not an expert on pixels, to be honest, but I assume each pixel is a small square whose edges are parallel to the screen's coordinate axes. All pixels have the same edge length, which I call the pixel size and denote by $\text{px}$, in centimeters or millimeters, whichever you have. Measurements made on the screen in pixels are converted to metric measurements by multiplying them by the pixel size $\text{px}$. That is why the first formulas, which combine pixel coordinates with the focal distance $f$, require scaling by the pixel size. When only measurements from the screen are involved, there is no need to multiply by the pixel size: everything is a ratio, so the factors cancel.