Converting a pose (a quaternion and a translation vector) from one coordinate system to another

This question concerns a camera defined in a world. I want to transform the camera pose (a rotation and a translation), given in one world coordinate system (Unity's), into a different world coordinate system (COLMAP's). I then want to invert the pose, in COLMAP, so that it becomes a transformation from the world to the camera.

First, the input is expressed in Unity's left-handed coordinate system, with the camera expressed with respect to the world.

In Unity, the world frame (also called the global frame) and the camera frame (also called the view frame or local frame) both use a left-handed coordinate system in which:

  • +X axis points to the right

  • +Y axis points UP

  • +Z axis points forward

In Unity, the origin of the world frame is fixed to a known point, (0,0,0). I am given the pose of the camera frame with respect to the world frame as a quaternion (4 components) and a translation vector (3 components). The quaternion represents the rotation from the camera to the world. The translation vector is also from the camera to the world.
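
To make the input concrete, here is a minimal sketch of how such a camera-to-world pose acts on points. It assumes NumPy, a unit quaternion in (x, y, z, scalar) order, and the standard quaternion-to-matrix formula. The helper names `quat_to_matrix` and `camera_to_world`, and the example pose, are hypothetical and only for illustration.

```python
import numpy as np

def quat_to_matrix(q):
    """Rotation matrix for a quaternion given as (x, y, z, w); q is normalized first."""
    x, y, z, w = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])

def camera_to_world(q, t):
    """4x4 homogeneous camera-to-world transform from quaternion q and translation t."""
    T = np.eye(4)
    T[:3, :3] = quat_to_matrix(q)
    T[:3, 3] = t
    return T

# Hypothetical pose: 90 degree rotation about +Y, camera placed at (1, 2, 3).
q_unity = (0.0, np.sin(np.pi / 4), 0.0, np.cos(np.pi / 4))
t_unity = (1.0, 2.0, 3.0)
T_cam_to_world = camera_to_world(q_unity, t_unity)

# The camera origin, in homogeneous coordinates, maps onto the translation vector.
origin_in_world = T_cam_to_world @ np.array([0.0, 0.0, 0.0, 1.0])
```

With a camera-to-world pose, the translation vector is simply the camera position in world coordinates, which is what the last line illustrates.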

Second, the destination is COLMAP's right-handed coordinate system, with the world expressed with respect to the camera.

In COLMAP:

  • +X axis points to the right

  • +Y axis points DOWN

  • +Z axis points forward
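
Comparing the two axis lists, only the Y axis differs. Assuming, for illustration only, that the two frames share the same origin and scale, the coordinates of a point in the two conventions are related by a single reflection (the symbol $S$ is a name introduced here, not one used by Unity or COLMAP):

$$
p_{\text{COLMAP}} = S \, p_{\text{Unity}}, \qquad
S = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
S^{-1} = S, \qquad \det S = -1 .
$$

The negative determinant is what switches the handedness from left to right.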

In COLMAP, the origin of the world frame is arbitrary (one of the images is chosen by the algorithm to be the world origin). The pose of the camera frame is what I need to transform from Unity. I also need to express it as a projection from the world frame to the camera frame, such that the quaternion can be expressed as a rotation matrix $ R_c^w $ and the translation vector as $ t_c^w $. As stated in the COLMAP documentation:

pose of an image is specified as the projection from world to the camera coordinate system of an image
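
For reference, the relation between a camera-to-world pose and the world-to-camera pose that COLMAP expects can be written out as follows. Here $ R_w^c $ and $ t_w^c $ denote the camera-to-world rotation and translation, and $ x_c $, $ x_w $ are names introduced here for the same point in camera and world coordinates:

$$
x_w = R_w^c \, x_c + t_w^c
\quad \Longrightarrow \quad
x_c = (R_w^c)^{\top} (x_w - t_w^c) = R_c^w \, x_w + t_c^w,
\qquad
R_c^w = (R_w^c)^{\top}, \quad t_c^w = -R_c^w \, t_w^c .
$$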

After some research, the steps I think I need are as follows:

  1. Convert Unity's quaternion (x, y, z, scalar) out of the left-handed coordinate system. In COLMAP's right-handed system, the quaternion (i.e., $ R_w^c $) should be (-x, y, -z, scalar). The logic is to negate the y component of the rotation axis (to account for the opposite direction of the y-axis) and then negate the rotation angle (to account for the conversion from a left-handed to a right-handed system).
  2. Unity's quaternion is expressed as a transformation from the camera to the world, so the (-x, y, -z, scalar) quaternion in COLMAP (i.e., $ R_w^c $) is still from the camera to the world. Therefore, I invert the (-x, y, -z, scalar) quaternion in this step to make it a transformation from the world to the camera: $ R_c^w = (R_w^c)^{-1} $, i.e., the inverse of (-x, y, -z, scalar).
  3. Convert Unity's translation vector (x, y, z). It becomes (x, -y, z) in the COLMAP coordinate system. The logic is that the y-axis in COLMAP points opposite to the y-axis in Unity.
  4. Unity's translation vector is expressed with respect to the world, so the (x, -y, z) translation vector in COLMAP (i.e., $ t_w^c $) is also with respect to the world. To express (x, -y, z) with respect to the camera in COLMAP, I use the following equation: $ t_c^w = -R_c^w \cdot (x, -y, z) = -R_c^w \cdot t_w^c $
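
The four steps above can be sketched as follows. This is only a minimal sketch of the steps as listed, assuming NumPy, unit quaternions kept in (x, y, z, scalar) order throughout, and the standard quaternion-to-matrix formula. The names `quat_to_matrix`, `quat_conjugate`, and `unity_to_colmap` are hypothetical helpers, not part of Unity or COLMAP.

```python
import numpy as np

def quat_to_matrix(q):
    """Rotation matrix for a quaternion given as (x, y, z, w); q is normalized first."""
    x, y, z, w = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])

def quat_conjugate(q):
    """Inverse of a unit quaternion (x, y, z, w): negate the vector part."""
    x, y, z, w = q
    return np.array([-x, -y, -z, w], dtype=float)

def unity_to_colmap(q_unity, t_unity):
    """Apply steps 1-4 to a Unity camera-to-world pose.

    Returns the world-to-camera quaternion in (x, y, z, w) order and the
    world-to-camera translation, both expressed in the Y-down frame.
    """
    x, y, z, w = q_unity
    # Step 1: camera-to-world rotation re-expressed in the Y-down frame.
    q_cam_to_world = np.array([-x, y, -z, w], dtype=float)
    # Step 2: invert to get the world-to-camera rotation.
    q_world_to_cam = quat_conjugate(q_cam_to_world)
    # Step 3: flip the Y component of the camera-to-world translation.
    tx, ty, tz = t_unity
    t_cam_to_world = np.array([tx, -ty, tz], dtype=float)
    # Step 4: t_c^w = -R_c^w * t_w^c.
    t_world_to_cam = -quat_to_matrix(q_world_to_cam) @ t_cam_to_world
    return q_world_to_cam, t_world_to_cam

# Hypothetical input: identity rotation, camera at (1, 2, 3) in Unity.
q_cw, t_cw = unity_to_colmap((0.0, 0.0, 0.0, 1.0), (1.0, 2.0, 3.0))
# With an identity rotation, t_cw is just the negated, Y-flipped position: (-1, 2, -3).
```

Keeping the quaternion in (x, y, z, scalar) order on output is a choice made for this sketch. Any component reordering required by a particular file format would be a separate step that is not shown here.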

The final result, $R_c^w$ and $t_c^w$, is the pose of the camera in COLMAP expressed as a transformation from the world to the camera.

Applying these steps to a sequence of cameras (originating from Unity) results in an incorrect camera trajectory (as displayed in COLMAP). What am I missing? I apologize if my understanding of the topic is not comprehensive.

Thanks.