This is driving me crazy: mathematicians and physicists use different notations for spherical polar coordinates.
During a vector calculus problem I ran into the following issue: I had to find $d\underline{S}$ for the surface of a sphere of radius $a$ centred at the origin. In all the books I find that for a parametrised surface $\underline{r}(s,t)$ we have $d\underline{S} = \left(\frac{\partial \underline{r}}{\partial s}\times\frac{\partial \underline{r}}{\partial t}\right)\,ds\,dt$, with the partials in this order.
For the sphere I have $\underline{r}(\theta,\phi) = a\cos(\theta)\sin(\phi)\underline{i}+a\sin(\theta)\sin(\phi)\underline{j}+a\cos(\phi)\underline{k}$ for $0\leq \theta\leq 2\pi$ and $0\leq \phi\leq \pi$. Hence I get $\frac{\partial \underline{r}}{\partial \theta}\times\frac{\partial \underline{r}}{\partial \phi} = -\underline{r}\,a\sin{\phi}$, so $d\underline{S} = -\underline{r}\,a\sin{\phi}\,d\theta\,d\phi$, which points inwards, so I take the opposite of it.
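For reference, a sketch of the intermediate steps behind this result, with $\theta$ as the azimuthal angle and $\phi$ as the polar angle exactly as in the parametrisation above:

$$
\begin{aligned}
\frac{\partial \underline{r}}{\partial \theta} &= -a\sin\theta\sin\phi\,\underline{i} + a\cos\theta\sin\phi\,\underline{j},\\
\frac{\partial \underline{r}}{\partial \phi} &= a\cos\theta\cos\phi\,\underline{i} + a\sin\theta\cos\phi\,\underline{j} - a\sin\phi\,\underline{k},\\
\frac{\partial \underline{r}}{\partial \theta}\times\frac{\partial \underline{r}}{\partial \phi} &= -a^2\cos\theta\sin^2\phi\,\underline{i} - a^2\sin\theta\sin^2\phi\,\underline{j} - a^2\sin\phi\cos\phi\,\underline{k}\\
&= -a\sin\phi\,\underline{r}.
\end{aligned}
$$

Since $\sin\phi \geq 0$ on $0\leq\phi\leq\pi$, this vector is antiparallel to $\underline{r}$, i.e. it points towards the origin.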
In my notes, they always preserve the order I used here (i.e. the first partial on the left, $\frac{\partial}{\partial \theta}$, is taken with respect to the first argument of $\underline{r}(\theta,\phi)$). Preserving the order, I should always get the correct normal vector. However, for some weird reason, when my notes, the books and people online have to calculate $d\underline{S}$ for a sphere (like here), they always swap the coordinates and write the spherical coordinates as $(r,\theta,\phi)$ with $0\leq \theta \leq \pi$ and $0\leq \phi \leq 2\pi$, and for $\underline{r}(\theta,\phi)$ they get $\frac{\partial \underline{r}}{\partial \theta}\times\frac{\partial \underline{r}}{\partial \phi} = \underline{r}\,a\sin{\theta}$, i.e. $d\underline{S} = \underline{r}\,a\sin{\theta}\,d\theta\,d\phi$.
Why does this happen? It's just a notation convention, yet the order of the partials should give the correct normal. In my example it clearly gives the opposite one, while the other notation gives the correct one.
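You seem to have stumbled onto an example of the right-hand rule. Consider $\frac{\partial\underline{r}}{\partial\theta} \times \frac{\partial\underline{r}}{\partial\phi}$ vs. $\frac{\partial\underline{r}}{\partial\phi} \times \frac{\partial\underline{r}}{\partial\theta}$. They have the same magnitude and differ only in sign. That is, $\frac{\partial\underline{r}}{\partial\phi} \times \frac{\partial\underline{r}}{\partial\theta} = -\left(\frac{\partial\underline{r}}{\partial\theta} \times \frac{\partial\underline{r}}{\partial\phi}\right)$, because the cross product is anticommutative. Note that your two parametrisations swap the roles of the angles: in yours $\theta$ is the azimuthal angle and $\phi$ the polar angle, while in the other convention $\theta$ is the polar angle and $\phi$ the azimuthal one. Writing $\theta$ first therefore means "azimuthal first" in one case and "polar first" in the other, which is exactly a swap of the two factors in the cross product, hence the sign flip. Neither order is intrinsically correct: the parametrisation only fixes a normal up to sign, and which normal you want is dictated by the problem or its application. So, if you are asked for the outward-pointing normal, that is the correct one. The order of the cross product doesn't really matter as long as you are careful, because it is relatively easy to check whether the result points the way the problem calls for, and to flip the sign if it does not.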
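The sign flip can also be checked numerically. The following is a minimal sketch, not part of the original problem: the helper names (`cross`, `dot`, `sphere`, `d_polar`, `d_azimuth`) and the sample values for the radius and angles are assumptions chosen for illustration. It labels the angles by their geometric role (polar, azimuthal) rather than by letter, so that both conventions are covered by the same code.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))

def sphere(a, polar, azimuth):
    """Position vector on a sphere of radius a."""
    return (a * math.sin(polar) * math.cos(azimuth),
            a * math.sin(polar) * math.sin(azimuth),
            a * math.cos(polar))

def d_polar(a, polar, azimuth):
    """Partial derivative of the position with respect to the polar angle."""
    return (a * math.cos(polar) * math.cos(azimuth),
            a * math.cos(polar) * math.sin(azimuth),
            -a * math.sin(polar))

def d_azimuth(a, polar, azimuth):
    """Partial derivative of the position with respect to the azimuthal angle."""
    return (-a * math.sin(polar) * math.sin(azimuth),
            a * math.sin(polar) * math.cos(azimuth),
            0.0)

# Arbitrary sample point (assumed values, away from the poles).
a, polar, azimuth = 2.0, 0.7, 1.9
r = sphere(a, polar, azimuth)
r_p = d_polar(a, polar, azimuth)
r_z = d_azimuth(a, polar, azimuth)

# "Polar first" matches the (r, theta, phi) convention with theta polar.
n_polar_first = cross(r_p, r_z)
# "Azimuth first" matches the parametrisation used in the question.
n_azimuth_first = cross(r_z, r_p)

# A positive dot product with r means the normal points outward.
outward_polar_first = dot(n_polar_first, r)
outward_azimuth_first = dot(n_azimuth_first, r)

print("polar first:  ", "outward" if outward_polar_first > 0 else "inward")
print("azimuth first:", "outward" if outward_azimuth_first > 0 else "inward")
```

With the polar angle first the normal comes out parallel to $\underline{r}$, and with the azimuthal angle first it comes out antiparallel, regardless of which letter is attached to which angle.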