What's the meaning of this statement regarding dual affine coordinate systems?


I'm reading An Elementary Introduction to Information Geometry by Frank Nielsen.

On page 18, he says:

Thus the information manifold is both $^F \nabla$-flat and $^F \nabla^\ast$-flat: This structure is called a dually flat manifold (DFM). In a DFM, we have two global affine coordinate systems $\theta(\cdot)$ and $\eta (\cdot)$ related by the Legendre–Fenchel transformation of a pair of potential functions $F$ and $F^\ast$. That is, $( M, F ) \equiv ( M, F^\ast )$, and the dual atlases are $\mathcal A = \{( M, \theta ) \}$ and $\mathcal A^\ast = \{( M, \eta ) \}$.

...

On a Bregman manifold, the primal parallel transport of a vector does not change the contravariant vector components, and the dual parallel transport does not change the covariant vector components. Because the dual connections are flat, the dual parallel transports are path-independent.

I'm trying to figure out what this means. Here are what I think are the relevant pieces I understand up to my point of confusion:

  • A manifold $M$ is $\nabla$-flat, for some connection $\nabla$, if the curvature tensor $R$ calculated from $\nabla$ has $R = 0$.
  • For a connection $\nabla$, we can find its conjugate connection $\nabla^\ast$ with respect to the metric $g$.
  • A "dually flat manifold" is therefore one where the curvature for both $\nabla$ and $\nabla^\ast$ is zero.
  • If we have a divergence $D: M \times M \rightarrow [0, \infty)$, then we can calculate a metric $^D g$ and connection $^D \nabla$ (and its dual $^{D^\ast} \nabla$) by differentiating $D$ in a specific way, giving us an information manifold $(M, ^D g, ^D \nabla, ^{D^\ast} \nabla)$ (which generally may not be flat).
  • Then, if we specifically use the Bregman divergence $B_F$ derived from a convex function $F(\theta)$ on $M$, and use this to calculate the metric $^F g$ and connection $^F \nabla$, we find that the connection coefficients of $^F \nabla$ vanish in the $\theta$-coordinates, so $^F \nabla$ is flat. The coefficients of the conjugate connection $^F \nabla^\ast$ likewise vanish in the dual $\eta$-coordinates, so it is flat too, which makes the manifold "dually flat".
  • Given the convex function $F(\theta)$, we can use a Legendre transformation to calculate the "dual potential function" $F^\ast (\eta) = \sup\limits_{\theta \in \Theta} \{ \theta^\top \eta - F(\theta) \}$.
  • We can express the coordinates $\theta$ and $\eta$ in terms of each other via $\eta = \nabla F(\theta)$ and $\theta = \nabla F^\ast (\eta)$.
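
As a sanity check on the last two bullets, here is a minimal numerical sketch. The potential $F(\theta) = \sum_i e^{\theta_i}$ is my own choice of example (it is not taken from the paper), picked because its Legendre dual $F^\ast(\eta) = \sum_i (\eta_i \log \eta_i - \eta_i)$ has a closed form. The helper names (`grad_F`, `F_star`, `bregman`, ...) are hypothetical. The script checks that $\theta \mapsto \eta \mapsto \theta$ round-trips, that $F(\theta) + F^\ast(\eta) = \theta^\top \eta$ at the matched pair, and that the Bregman divergence of $F$ equals the Bregman divergence of $F^\ast$ with the arguments swapped.

```python
import numpy as np

# Assumed example potential (not from the paper): F(theta) = sum_i exp(theta_i),
# which is strictly convex on R^n.
def F(theta):
    return np.sum(np.exp(theta))

def grad_F(theta):
    # eta = grad F(theta)
    return np.exp(theta)

def F_star(eta):
    # Closed form of sup_theta { theta . eta - F(theta) }, valid for eta > 0.
    return np.sum(eta * np.log(eta) - eta)

def grad_F_star(eta):
    # theta = grad F*(eta)
    return np.log(eta)

def bregman(potential, grad, p, q):
    # B(p : q) = potential(p) - potential(q) - <p - q, grad(q)>
    return potential(p) - potential(q) - np.dot(p - q, grad(q))

theta1 = np.array([0.3, -1.2, 0.7])
theta2 = np.array([-0.5, 0.4, 1.1])
eta1, eta2 = grad_F(theta1), grad_F(theta2)

# theta -> eta -> theta should be the identity.
roundtrip_ok = np.allclose(grad_F_star(eta1), theta1)

# At a matched pair (theta, eta), F(theta) + F*(eta) - theta . eta vanishes.
fenchel_young_gap = F(theta1) + F_star(eta1) - np.dot(theta1, eta1)

# B_F(theta1 : theta2) should equal B_{F*}(eta2 : eta1).
primal = bregman(F, grad_F, theta1, theta2)
dual = bregman(F_star, grad_F_star, eta2, eta1)

print(roundtrip_ok, fenchel_young_gap, primal, dual)
```

If I have the definitions right, the gap should be zero up to floating-point error and the last two numbers should agree, which is what I take "$( M, F ) \equiv ( M, F^\ast )$" to be saying at the level of coordinates.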

That brings us to the quote above, which is where I'm confused.

As I understand it, a connection $\nabla$ simply tells us how a tangent vector $v$ changes as we move in a direction $Y$, i.e., $\nabla_{Y} v$. So if a connection is flat, conceptually that means that if we move $v$ around, it "looks the same", i.e., it's not rotating or stretching (is that correct?).

However, even if $v$ stays the same under a given flat connection, how this looks depends on the coordinate system/vector basis in which we express $v = v^i \partial_i$. In one coordinate system, neither the components $v^i$ nor the basis vectors $\partial_i$ change, while in another the basis vectors $\partial_i$ change as we move around, and therefore the components $v^i$ must change too so that $v$ stays the same.
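
To make this concrete for the two coordinate systems in the quote, here is my attempt at the change of basis. The notation is my own assumption: $\partial_i = \partial / \partial \theta^i$, $\partial^j = \partial / \partial \eta_j$, and $g_{ij}$ denotes the Hessian of $F$ (which, if I read the paper correctly, is also the metric $^F g$ in $\theta$-coordinates). Since $\eta_j = \partial_j F(\theta)$, the chain rule gives

$$
\partial_i = \frac{\partial \eta_j}{\partial \theta^i} \, \partial^j = g_{ij}(\theta) \, \partial^j, \qquad g_{ij}(\theta) = \partial_i \partial_j F(\theta),
$$

$$
v = v^i \, \partial_i = \big( g_{ij}(\theta) \, v^i \big) \, \partial^j .
$$

So if the components $v^i$ stay constant along a path, the components $g_{ij}(\theta) \, v^i$ of the same vector in the $\eta$-basis vary from point to point, unless the Hessian of $F$ happens to be constant.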

That's what I understand from the DFM with the two connections $\nabla, \nabla^\ast$. What I'm stuck on (assuming the above is correct) is this: if both connections are flat, does that mean they do the same thing to a vector, even if they change the components/basis vectors differently?