What's the other way to think about SE2 and SE3? My brain only seems to go one way


I work at a robotics company and we have littered throughout our code transforms of the form t_a_b. I continually think of t_a_b as "How do I get from a to b" where a and b are each a location with a heading.

If I take some global reference frame g, then nalgebra's inv_mul function is how I find the isometry between two points. I can express a as t_g_a and b as t_g_b, and then to get from a to b, it's just t_g_a.inv_mul(t_g_b).

For example, if I know the latitude and longitude of SF and the latitude and longitude of Mountain View, these describe t_(0,0)_SF and t_(0,0)_Mountain View respectively. A really, really long way to get from SF to Mountain View would be to first go from SF to the map's origin point (equator x prime meridian), which is the inverse of going from the origin to SF, so that's t_(0,0)_SF.inv(), and then go from the origin to Mountain View, which is multiplying by t_(0,0)_Mountain View. Put it all together and I get t_(0,0)_SF.inv() * t_(0,0)_Mountain View, but for performance and conciseness reasons I write this as t_(0,0)_SF.inv_mul(t_(0,0)_Mountain View) = t_SF_Mountain View.
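To make the inv_mul reasoning concrete, here is a minimal sketch in plain Python rather than nalgebra. The (x, y, theta) tuple representation and the helper names compose, inverse, inv_mul, and approx_equal are assumptions made for illustration; only the formula t_g_a.inv() * t_g_b is taken from the description above.

```python
import math

def compose(a, b):
    """Return the product a * b of two SE(2) transforms given as (x, y, theta)."""
    ax, ay, ath = a
    bx, by, bth = b
    c, s = math.cos(ath), math.sin(ath)
    # Rotate b's translation by a's rotation, then add a's translation.
    return (ax + c * bx - s * by, ay + s * bx + c * by, ath + bth)

def inverse(a):
    """Return the inverse transform: rotation R^T, translation -R^T t."""
    ax, ay, ath = a
    c, s = math.cos(ath), math.sin(ath)
    return (-(c * ax + s * ay), s * ax - c * ay, -ath)

def inv_mul(a, b):
    """Hypothetical stand-in for the inv_mul idea: a^-1 * b."""
    return compose(inverse(a), b)

def approx_equal(p, q, tol=1e-9):
    """Compare two tuples element by element with an absolute tolerance."""
    return all(math.isclose(u, v, abs_tol=tol) for u, v in zip(p, q))

# a sits at (1, 2) in the global frame g, heading along +y; b is 3 units further along +y.
t_g_a = (1.0, 2.0, math.pi / 2)
t_g_b = (1.0, 5.0, math.pi / 2)

# "Go from a back to the origin, then from the origin to b."
t_a_b = inv_mul(t_g_a, t_g_b)
print(t_a_b)
```

Seen from a, b is 3 units straight ahead with the same heading, so t_a_b comes out as approximately (3, 0, 0), and composing t_g_a with t_a_b recovers t_g_b.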

Any time I see t_a_b * t_b_c * t_c_d in the code, this translates to me as "First go from a to b, then go from b to c, then go from c to d" so this is equivalent to the "rigid body transform" which takes you from a to d.

This has worked great for me and not caused any issues at all. It agrees with the way that everybody else writes code... until I found out that one of my coworkers insists that I'm thinking about everything backwards. According to him, t_a_b definitely means the exact opposite of what I think: when you want to get something something reference frame something a, you start at b and then apply the function, which is multiplying t_a_b by pt_b.
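One way to put the two readings side by side is a small numeric sketch. It is plain Python, not nalgebra, and the (x, y, theta) tuples, the helper names compose, apply, and approx_equal, and the sample numbers are all assumptions for illustration. The same t_a_b is used both as "the pose of b as seen from a" and as "the function that turns coordinates in b's frame into coordinates in a's frame".

```python
import math

def compose(a, b):
    """Return the product a * b of two SE(2) transforms given as (x, y, theta)."""
    ax, ay, ath = a
    bx, by, bth = b
    c, s = math.cos(ath), math.sin(ath)
    return (ax + c * bx - s * by, ay + s * bx + c * by, ath + bth)

def apply(t, p):
    """Map point p = (px, py), given in the child frame, into the parent frame of t."""
    x, y, th = t
    px, py = p
    c, s = math.cos(th), math.sin(th)
    return (x + c * px - s * py, y + s * px + c * py)

def approx_equal(p, q, tol=1e-9):
    """Compare two tuples element by element with an absolute tolerance."""
    return all(math.isclose(u, v, abs_tol=tol) for u, v in zip(p, q))

# Reading 1 (turtle): from a, drive 3 forward and turn left 90 degrees to reach b;
# from b, drive 2 forward to reach c.
t_a_b = (3.0, 0.0, math.pi / 2)
t_b_c = (2.0, 0.0, 0.0)
t_a_c = compose(t_a_b, t_b_c)  # read left to right: a -> b -> c

# Reading 2 (perception): a point known in c's frame, re-expressed in a's frame.
pt_c = (1.0, 0.0)
pt_a_direct = apply(t_a_c, pt_c)
# The same thing step by step: the rightmost factor acts on the point first.
pt_b = apply(t_b_c, pt_c)
pt_a_stepwise = apply(t_a_b, pt_b)
print(t_a_c, pt_a_direct, pt_a_stepwise)
```

Both readings use the same product t_a_b * t_b_c. The turtle reads it left to right as a sequence of moves, while the point-mapping view reads it right to left as a sequence of coordinate changes, which may be why the naming conventions agree even though the mental pictures differ.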

No matter how hard I try, this makes absolutely no geometric sense to me, but for all my Googling and ChatGPTing, 30% of the time I'm right and my coworker's wrong and 70% of the time he's right and I'm wrong (but there definitely are a few sources that see it my way too). Somehow our code is able to interact with each other and not fall over and somehow our naming conventions agree with each other, but I just can't see what he's talking about. It sounds like utter gibberish. And the 70% of academic sources that agree with him also sound like gibberish.

I refuse to accept that I'm totally incapable of understanding this other conception of how SE3 transforms work. Can somebody please help me bridge this conceptual gap? (It may be telling that I've mostly worked on motion while he's mostly worked on perception. Also, my introduction to programming was with turtle graphics when I was 9 years old. Multiplying isometries just seems like sequencing instructions in turtle graphics.)


Without knowing the details of your code, let me take a guess at what your issue probably is. Misunderstandings of this sort are usually caused by disagreement on one of the following two issues:

(1) Are transformations active (objects move) or passive (coordinate systems / cameras move)?

(2) In a product of transformations $AB$, which is performed first, $A$ or $B$?

If two people disagree about exactly one of (1), (2), then their code will be incompatible, but if they disagree about both, then their code will be compatible again. This is because an active transformation described by a matrix $A$ is equivalent to a passive transformation described by $A^{-1}$. Since $(AB)^{-1} = B^{-1}A^{-1}$, taking inverses and reversing the order makes things compatible again.
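The inverse-and-reverse identity can be checked numerically. The following is only a sketch in plain Python under assumed conventions: SE(2) elements stored as (x, y, theta) tuples, with hypothetical helpers compose, inverse, apply, and approx_equal. None of it is taken from your code.

```python
import math

def compose(a, b):
    """Return the product a * b of two SE(2) transforms given as (x, y, theta)."""
    ax, ay, ath = a
    bx, by, bth = b
    c, s = math.cos(ath), math.sin(ath)
    return (ax + c * bx - s * by, ay + s * bx + c * by, ath + bth)

def inverse(a):
    """Return the inverse transform: rotation R^T, translation -R^T t."""
    ax, ay, ath = a
    c, s = math.cos(ath), math.sin(ath)
    return (-(c * ax + s * ay), s * ax - c * ay, -ath)

def apply(t, p):
    """Apply transform t to the point p = (px, py)."""
    x, y, th = t
    px, py = p
    c, s = math.cos(th), math.sin(th)
    return (x + c * px - s * py, y + s * px + c * py)

def approx_equal(p, q, tol=1e-9):
    """Compare two tuples element by element with an absolute tolerance."""
    return all(math.isclose(u, v, abs_tol=tol) for u, v in zip(p, q))

A = (1.0, 0.0, math.pi / 2)
B = (0.0, 2.0, 0.0)

# (AB)^-1 equals B^-1 A^-1: inverting reverses the order of the factors.
lhs = inverse(compose(A, B))
rhs = compose(inverse(B), inverse(A))

# Active: move the point by A and keep the frame fixed.
p = (1.0, 1.0)
active = apply(A, p)
# Passive: keep the point, move the frame by A^-1, and read off the point's
# coordinates in the moved frame (that is, apply the inverse of the frame's pose).
frame = inverse(A)
passive = apply(inverse(frame), p)
print(lhs, rhs, active, passive)
```

The active and passive results agree, and so do the two sides of the inverse identity, which is the sense in which disagreeing on both conventions at once cancels out.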

Instead of active and passive, some sources use the terms alibi and alias. An active transformation moves the object to another place (alibi), while a passive transformation moves the coordinate system, giving the object new coordinates, i.e., another name (alias).

Does this help?