In the Preface of 'Methods of Information Geometry':
We consider the set, $S$, of normal distributions $P(x; \mu, \sigma) = \dfrac{1}{\sqrt{2\pi }\sigma}\exp \left \{ - \dfrac{(x-\mu)^2}{2\sigma^2}\right \}$ with mean $\mu$ and variance $\sigma^2$.
So this means that if we specify $(\mu, \sigma)$, we determine a particular normal distribution. Hence the set of all possible $(\mu, \sigma)$ can be viewed as a 2-dimensional manifold, with $(\mu, \sigma)$ as coordinates.
My question is about the statement that follows this: "However, this is not a Euclidean space, but rather a Riemannian space with a metric that naturally follows from the underlying properties of probability distributions."
What does this mean? Can someone please explain it more gently? (And can some illustrations be made available for this?)
Thank you
I cannot, but David MacKay might be able to give you some insight into the first half: https://m.youtube.com/watch?v=HrRNqb5C-b0 (skip to 13:54). David explains how to test multiple theories to explain data, where each theory is a Gaussian with mean and variance specified by $\mu$ and $\sigma$. He explains why one might want to do that and gives a really nice visual walkthrough. I'm assuming that being able to reinterpret the parameter space as a Riemannian manifold allows for some mathematical shortcuts, but I'm no expert in that part!
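For a more concrete handle on the second half: the metric Amari is referring to is the Fisher information metric, $g_{ij}(\theta) = E\left[\partial_i \log p \cdot \partial_j \log p\right]$ with $\theta = (\mu, \sigma)$, which for the normal family works out to $g = \operatorname{diag}(1/\sigma^2,\, 2/\sigma^2)$. Here is a minimal sketch (the function name and parameter choices are mine, just for illustration) that estimates this metric by Monte Carlo and compares it to the closed form:

```python
import numpy as np

def fisher_mc(mu, sigma, n=500_000, seed=0):
    """Monte Carlo estimate of the Fisher information matrix of N(mu, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n)
    # Score functions: partial derivatives of log p(x; mu, sigma)
    # with log p = -log(sqrt(2*pi)*sigma) - (x - mu)^2 / (2*sigma^2).
    s_mu = (x - mu) / sigma**2
    s_sigma = -1.0 / sigma + (x - mu) ** 2 / sigma**3
    scores = np.stack([s_mu, s_sigma])          # shape (2, n)
    return scores @ scores.T / n                # empirical E[score scoreᵀ]

mu, sigma = 0.7, 1.5
g_hat = fisher_mc(mu, sigma)
g_exact = np.diag([1 / sigma**2, 2 / sigma**2])  # known Fisher metric of N(mu, sigma^2)
print(g_hat)
print(g_exact)
```

Note what makes this non-Euclidean: the metric components depend on where you are in the $(\mu, \sigma)$ plane. Two distributions separated by the same $\Delta\mu$ are "farther apart" when $\sigma$ is small (narrow, easily distinguished bells) than when $\sigma$ is large.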