I came across this post Distances defined in manifold of symmetric positive definite matrices because I had the same questions. I did not understand the answers provided, so I wanted to try to answer it myself. Originally, I posted my justifications and intuitions as an answer, but given I do not have a background in differential geometry and was unsure about my answer, I thought it would be better to post it as a question. The two main points I was trying to address were
"I always pictured it as a "curved sheet" like the shell of a sphere but not the sphere itself; therefore, the idea of calling the PSD cone a manifold was confusing"
The main question being asked: "if I understand correctly, SPD matrices lie inside a convex set i.e. linear combinations of SPD matrices will be SPD. Intuitively, I don't see the need of defining a distance that goes along the curvature of SPD matrices since the set is dense"
Do these justifications/intuitions correctly address those questions:
- "I always pictured it as a "curved sheet" like the shell of a sphere but not the sphere itself; therefore, the idea of calling the PSD cone a manifold was confusing"
Although it is easier to imagine a manifold as a hyper-surface, a manifold does not need to sit in some ambient space of higher dimension to be a manifold. So take the cone formed by the set of 2x2 SPD matrices. This is a 3-manifold, since locally it is diffeomorphic to $\mathbb{R}^3$. Whether that cone is drawn in $\mathbb{R}^3$, $\mathbb{R}^4$, or $\mathbb{R}^n$ makes no difference; i.e., manifolds are independent of the ambient space they live in/are portrayed in.
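To make those 3 degrees of freedom concrete, here is a minimal NumPy sketch (the helper `is_spd` is my own, not from any library): by Sylvester's criterion, $\begin{pmatrix}a & b\\ b & c\end{pmatrix}$ is positive definite exactly when $a > 0$ and $ac - b^2 > 0$, so the 2x2 SPD matrices correspond to an open region of $(a, b, c)$-space, i.e. of $\mathbb{R}^3$.

```python
import numpy as np

# A symmetric 2x2 matrix [[a, b], [b, c]] is determined by 3 numbers,
# so the space of such matrices is identified with R^3 via (a, b, c).
# The positive definite ones form the open region a > 0, a*c - b^2 > 0.

def is_spd(a, b, c):
    """Sylvester's criterion: positivity of the leading principal minors."""
    return a > 0 and a * c - b * b > 0

rng = np.random.default_rng(0)
for _ in range(1000):
    a, b, c = rng.normal(size=3)
    M = np.array([[a, b], [b, c]])
    # Sylvester's criterion agrees with the eigenvalue test
    assert is_spd(a, b, c) == bool(np.all(np.linalg.eigvalsh(M) > 0))
```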
- The main question being asked: "if I understand correctly, SPD matrices lie inside a convex set i.e. linear combinations of SPD matrices will be SPD. Intuitively, I don't see the need of defining a distance that goes along the curvature of SPD matrices since the set is dense"
Two pictures that help me conceptualize this idea. 
Picture 2 is from Riemannian geometry for EEG-based brain-computer interfaces; a primer and a review. It shows that the boundary of the cone is the set of PSD matrices with determinant 0, and that the interior of the cone contains the positive definite matrices, whose determinants are nonzero.
Picture 3 is from A pictorial representation of the positive semidefinite cone and is how I view the PSD cone in practice. I am not sure this is correct, but my interpretation is that the PSD cone can be viewed as a solid cone made up of infinitely many layers (level sets), where each layer consists of the matrices with a given determinant. (In this picture each layer looks like it includes the origin, but in the situation I am describing only the boundary layer, with determinant 0, would contain the origin.) Every matrix on the outer boundary has determinant 0, every matrix on the next layer into the interior has a determinant slightly greater than 0, and so on. As I explain below, I believe that in practice a set of sample covariance matrices usually comes from one layer, or a group of layers, near the boundary. This idea is touched on in the posts Is the set of symmetric positive semi-definite matrices a smooth manifold with boundary and Positive semidefinite cone is generated by all rank-1 matrices.
I think that, theoretically, if your sample covariance matrices were positive definite and had widely varying determinants, then using the Euclidean distance would make sense, since the set of positive definite matrices is an open convex set with non-empty interior and positive volume. However, in practice sample covariance matrices are usually ill conditioned (the eigenvalues lower in the spectrum are close to zero) due to redundancies in the data. The way I see it, this means two things. One, the data we are modeling is very close to the surface of the PSD cone (at the surface the determinant is 0, and ill-conditioned matrices have determinants approaching 0), which means they resemble something closer to the subset of PSD matrices with determinant equal to 0. Two, the covariance matrices we are modeling have similar determinants, which suggests they live on some curved sub-manifold/layer of the PSD cone. For these reasons we should use the Affine Invariant Riemannian Metric. Since the set of positive semidefinite (and positive definite) matrices is convex, as was mentioned in the original post, the straight-line path between two matrices always passes further into the interior of the cone, and I believe the further you move into the interior, the higher the determinant. Assuming your data all have similar determinants, taking the arithmetic mean would leave you with an average matrix whose determinant is larger than those of your sample covariance matrices. By using the Affine Invariant Riemannian Metric we follow a curved path and so avoid passing through covariance matrices of higher determinant. This idea of the increasing determinant, and the determinant identity, is discussed in footnote 13 of Riemannian geometry for EEG-based brain-computer interfaces; a primer and a review and in Figure 4.1 of Geometric Means In A Novel Vector Space Structure On Symmetric Positive-Definite Matrices.
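The determinant claim can be checked numerically. Under the affine-invariant metric the geodesic from $A$ to $B$ is $\gamma(t) = A^{1/2}(A^{-1/2}BA^{-1/2})^t A^{1/2}$, and its midpoint has determinant exactly $\sqrt{\det A \cdot \det B}$, while the arithmetic midpoint $(A+B)/2$ has determinant at least that large (Minkowski's determinant inequality). A minimal NumPy sketch, with function names of my own choosing:

```python
import numpy as np

def spd_power(M, t):
    """M**t for a symmetric positive definite M, via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**t) @ V.T

def airm_midpoint(A, B):
    """Geodesic midpoint under the affine-invariant metric:
    A^(1/2) (A^(-1/2) B A^(-1/2))^(1/2) A^(1/2)."""
    Ah = spd_power(A, 0.5)
    Aih = spd_power(A, -0.5)
    return Ah @ spd_power(Aih @ B @ Aih, 0.5) @ Ah

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 4)); A = X @ X.T + 1e-3 * np.eye(4)
Y = rng.normal(size=(4, 4)); B = Y @ Y.T + 1e-3 * np.eye(4)

geo = np.sqrt(np.linalg.det(A) * np.linalg.det(B))  # geometric mean of dets
det_airm = np.linalg.det(airm_midpoint(A, B))       # equals geo, up to fp error
det_arith = np.linalg.det((A + B) / 2)              # >= geo (Minkowski inequality)
print(det_airm, geo, det_arith)
```

The Riemannian midpoint stays on the "layer" of geometric-mean determinant, while the Euclidean midpoint drifts toward higher determinant, which is exactly the behavior argued above.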
The paper Riemannian Metric and Geometric Mean for Positive Semidefinite Matrices of Fixed Rank also discusses, and references some good sources on, the differences between the various means/metrics on the PSD cone.
I visualized this idea with real data below in image 10. Here I plotted the log determinant of 100 covariance matrices from the same class. I used a log scale on the x-axis since all the determinants were numerically close to 0 and different orders of magnitude away from 0. I then took the determinant of the Riemannian mean (the mean induced by the Affine Invariant Riemannian Metric) of the covariance matrices, shown in green. This seems to correctly predict the typical determinant of the covariance matrices. I also took the determinant of the arithmetic mean of the covariance matrices, shown in red. The arithmetic average covariance matrix has a determinant that is significantly higher than those of the class the average came from.
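For readers without the original data, the experiment can be sketched with synthetic ill-conditioned covariances (all names below are my own; the mean is computed with a damped version of the standard fixed-point iteration for the Karcher mean). The sketch also exposes the determinant identity: taking the trace of the Karcher-mean equation gives $\log\det G = \frac{1}{n}\sum_i \log\det A_i$, while Jensen's inequality ($\log\det$ is concave on SPD matrices) forces the arithmetic mean's log-determinant to be larger.

```python
import numpy as np

def _eigh_fun(M, f):
    """Apply a scalar function f to a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def airm_mean(mats, n_iter=50, step=0.5):
    """Karcher (Frechet) mean under the affine-invariant metric,
    via the usual fixed-point iteration with a damping step for stability."""
    G = sum(mats) / len(mats)  # initialize at the arithmetic mean
    for _ in range(n_iter):
        Gh = _eigh_fun(G, np.sqrt)
        Gih = _eigh_fun(G, lambda w: 1.0 / np.sqrt(w))
        # Riemannian logs of the samples at G, averaged in the tangent space
        T = sum(_eigh_fun(Gih @ A @ Gih, np.log) for A in mats) / len(mats)
        G = Gh @ _eigh_fun(step * T, np.exp) @ Gh
    return G

# Synthetic stand-in for the sample covariances: ill-conditioned SPD matrices
rng = np.random.default_rng(7)
d, n = 6, 100
scales = np.array([1, 1, 1, 0.1, 0.01, 0.001])[:, None]
mats = []
for _ in range(n):
    X = rng.normal(size=(d, 3 * d)) * scales
    mats.append(X @ X.T / (3 * d))

logdets = [np.linalg.slogdet(A)[1] for A in mats]
ld_airm = np.linalg.slogdet(airm_mean(mats))[1]
ld_arith = np.linalg.slogdet(sum(mats) / n)[1]
# The AIRM mean's log-det matches the average log-det of the samples;
# the arithmetic mean's log-det is noticeably larger.
print(np.mean(logdets), ld_airm, ld_arith)
```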


The validity of the first statement depends on its interpretation. In modern topology and differential geometry, manifolds are indeed abstract and are not considered as subsets of some "ambient space." Nevertheless, every manifold $M$ is naturally embedded as a hypersurface in $M\times \mathbb R$. But, most likely, the original question had in mind a hypersurface in some Euclidean space. Then the answer is that there are, for instance, surfaces that cannot be embedded in $\mathbb R^3$, e.g. the projective plane and the Klein bottle.
This is correct, but in fact the cone is not only locally diffeomorphic to $\mathbb{R}^3$; it is globally diffeomorphic to $\mathbb{R}^3$. A cleaner statement would be:
The cone $S_2^+$ of positive definite matrices is an open subset of the 3-dimensional real vector space $S_2$ of symmetric $2\times 2$ matrices. Hence, $S_2^+$ carries a natural structure of a differentiable 3-dimensional manifold. As such, it is diffeomorphic to $\mathbb R^3$.
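One way to make the global diffeomorphism explicit: the matrix exponential and logarithm are smooth, mutually inverse maps between the symmetric $2\times 2$ matrices (a copy of $\mathbb R^3$) and the positive definite ones. A small numerical illustration of the round trip (helper names are my own):

```python
import numpy as np

def sym_to_vec(S):
    """Identify a symmetric 2x2 matrix with a point of R^3."""
    return np.array([S[0, 0], S[0, 1], S[1, 1]])

def logm_spd(M):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.log(w)) @ V.T

def expm_sym(S):
    """Matrix exponential of a symmetric matrix (always SPD)."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

# Round trip: SPD -> symmetric (= R^3) -> SPD recovers the matrix.
rng = np.random.default_rng(3)
X = rng.normal(size=(2, 2)); M = X @ X.T + 0.1 * np.eye(2)
v = sym_to_vec(logm_spd(M))  # the R^3 coordinates of M
M_back = expm_sym(np.array([[v[0], v[1]], [v[1], v[2]]]))
assert np.allclose(M, M_back)
```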
One cannot say "manifolds are independent." One can say that "being a topological manifold is an intrinsic property of a topological space and has nothing to do with its embedding, if it happens to be embedded in an ambient topological space." One can make a similar statement about differentiable manifolds, for instance:
A differentiable manifold is a topological space equipped with an atlas of charts satisfying certain properties. It is a theorem that every $n$-dimensional manifold can be embedded in $\mathbb R^{2n}$ but such an embedding does not change the intrinsic properties of the manifold.
Unstated in all this is the fact that the author of the original question probably did not even mean a differentiable manifold but had Riemannian manifolds in mind. Only once you equip a manifold with a Riemannian metric (or a semi-Riemannian metric) can you talk about its curvature.