What is the right way of making sense of generalized functions over manifolds? For concreteness, let me restrict my question to the dirac delta function. The article on Wikipedia on Dirac delta function gives a definition of the composition of $\delta$ with continuously differentiable functions $g\colon \mathbb{R}\to \mathbb{R}$ as follows: $$\delta(g(x)):=\sum_i\frac{\delta(x-x_i)}{|g'(x_i)|}$$ where the sum extends over all roots of $g$ which are assumed to be simple. A similar definition is also given for the case where $g$ is a multivariable function.
My question is whether such definitions can be extended to functions $g$ defined over manifolds, say $S^1$ or $S^2$. Over the circle, it seems to me that the only reasonable way of making sense of something like $$\int_{S^1}\delta(|x-i|)f(x)\, dx$$ is by assigning the value $f(i)$ to it which agrees with the naive approach of parameterizing the circle and reducing the problem to the real case: $$\int_{S^1}\delta(|x-i|)f(x)\, dx"="\int_0^{2\pi}\delta(\cos\theta, \sin\theta-1)f(\cos\theta, \sin\theta)\, d\theta.$$
However, for $S^2$ I think a similar computation with, say the north pole $n$ in place of $i$, is ambiguous: $$\int_{S^2}\delta(|x-n|)f(x)\, dx"="\int_{0}^{2\pi}\int_{0}^{\pi}\delta(\text{blah})f(\text{blah})\sin\phi\, d\phi\, d\theta.$$
It seems to me that the values of the right-hand integral depends on the parameterization and is affected by the value of the sine function.
I think the first step in answering such a question is perhaps to make sense of a composition like $\delta(|x|)$ for $x\in \mathbb{R}$.
Question Is there anywhere in math/physics literature that I can find the definition of $\delta(|x|)$ (or a similar composition)?
Any thoughts, comments, insights, or references are highly appreciated.
If $\phi:\mathbb R^n\to\mathbb R$ is $C^1$ near the level set $L=\{x\in\mathbb R^n;\phi(x)=0\}$ and $\nabla\phi\neq0$ on $L$, then it is natural to interpret $\delta(\phi(x))$ so that $$ \int_{\mathbb R^n}\delta(\phi(x))f(x)dx = \int_L|\nabla\phi(x)|^{-1}f(x)d\Sigma(x), $$ where $\Sigma$ is the surface measure of $L$. In one dimension this is just the summation formula you wrote.
Now $\phi(x)=|x|$ is not differentiable at the origin, so it is not obvious how to define $\delta(|x|)$. This function does satisfy $|\nabla\phi|=1$ everywhere outside the origin, so we could take $|\nabla\phi(0)|=1$ as well (without defining $\nabla\phi(0)$!) and and try to use the formula above. If $n\geq2$, $L=\{0\}$ has measure zero with respect to the $n-1$ dimensional measure $\Sigma$. This suggests the interpretation that $\delta(|x|)=0$ if $n\geq2$.
One option is to consider $\phi(x)=|x|-r$ for $r>0$ and take the limit $r\to0$. Now $\phi$ is smooth near $L=S_r$ (sphere or radius $r$), so $$ \int_{\mathbb R^n}\delta(|x|-r)f(x)dx = \int_{S_r}f(x)d\Sigma(x). $$ If $f$ is continuous at the origin and $n\geq2$, in the limit $r\to0$ the above integral tends to zero. This supports the interpretation above. If $f$ has a suitable singularity at the origin, the limit may be nonzero, but let me assume that the test function $f$ is at least continuous. If $n=1$, this interpretation leads to $\delta(|x|)=\delta(x)$.
If $n=1$, the limit is $$ \lim_{r\to0}\sum_{\pm}f(\pm r) = 2f(0). $$ This suggests the interpretation $\delta(|x|)=2\delta(x)$ in dimension one. On the other hand, if we consider the limit when $r$ approaches zero from the left, the limit is zero.
Unless more context is given, I would say that the distribution $\delta(|x|)$ is zero for $n\geq2$ and doesn't make sense for $n=1$.