In the supplemental material of a machine learning paper, the authors evaluate an integral over $m$ dimensions by reducing it to an integral over two dimensions. The original integral to be evaluated is:
$$ k_n(\mathbf{x}, \mathbf{y}) = 2 \int_{\mathbb{R}^m} \mathbf{dw} \frac{e^{- \Vert \mathbf{w} \Vert^2/2}}{(2\pi)^{m/2}} \Theta(\mathbf{w} \cdot \mathbf{x})\Theta(\mathbf{w}\cdot\mathbf{y})(\mathbf{w}\cdot\mathbf{x})^n (\mathbf{w}\cdot\mathbf{y})^n, \ \ \ \ \mathbf{x}, \mathbf{y}, \mathbf{w} \in \mathbb{R}^m$$
and $\Theta$ is the Heaviside (step) function. They state the result in the main text:
$$ k_n(\mathbf{x}, \mathbf{y}) = \frac{1}{\pi} \Vert \mathbf{x} \Vert^n \Vert \mathbf{y} \Vert^n J_n(\theta),$$ with $$ \theta = \cos^{-1}\Big( \frac{\mathbf{x} \cdot \mathbf{y}}{ \Vert \mathbf{x} \Vert \Vert \mathbf{y} \Vert} \Big),\\ J_n(\theta)=(-1)^n(\sin\theta)^{2n+1}\Big( \frac{1}{\sin\theta} \frac{\partial}{\partial \theta}\Big)^n\Big( \frac{\pi - \theta}{\sin \theta }\Big).$$
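For concreteness, carrying out the derivative in the formula for the first two orders (a quick check by hand) gives

$$ J_0(\theta) = \pi - \theta, \qquad J_1(\theta) = \sin\theta + (\pi - \theta)\cos\theta, $$

so in particular $k_0(\mathbf{x},\mathbf{y}) = \frac{1}{\pi}(\pi - \theta)$.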
In the supplemental material they explain how the integral was evaluated:
In this appendix, we show how to evaluate the multidimensional integral in eq. (1) for the arc-cosine kernel. Let $\theta$ denote the angle between the inputs $\mathbf{x}$ and $\mathbf{y}$. Without loss of generality, we can take $\mathbf{x}$ to lie along the $w_1$ axis and $\mathbf{y}$ to lie in the $w_1 w_2$-plane. Integrating out the orthogonal coordinates of the weight vector $\mathbf{w}$, we obtain the result in eq. (3), where $J_n(\theta)$ is the remaining integral:
$$ J_n(\theta)=\int dw_1\, dw_2\, e^{-(w_1^2+w_2^2)/2}\,\Theta(w_1)\,\Theta(w_1\cos\theta+w_2\sin\theta)\,w_1^n(w_1\cos\theta+w_2\sin\theta)^n.$$
How does one interpret the quoted text geometrically? I'm having trouble understanding how $m$ dimensions can be reduced down to $2$ without loss of generality. Furthermore, what are $w_1$ and $w_2$? The authors do not explicitly describe their meaning. Can the integral simplification be explained in terms of a substitution (e.g. something like $\mathbf{x}\cdot\mathbf{w}=\Vert\mathbf{w}\Vert\Vert\mathbf{x}\Vert\cos(\theta_{wx})=w_1$)?
First of all, $w_1$ and $w_2$ are simply the first two components of the vector $\mathbf{w}$ over which you are integrating. What the authors refer to as the orthogonal coordinates are the last $m-2$ components of $\mathbf{w}$.
Now, since the only fixed inputs are the two vectors $\mathbf{x}$ and $\mathbf{y}$, we can always, without loss of generality, rotate our coordinates so that $\mathbf{x}$ lies along the first coordinate axis, that is,
$\mathbf{x} = (\Vert\mathbf{x}\Vert,0,\ldots,0).$
Furthermore, we can then perform a second rotation that leaves the first axis fixed, so that $\mathbf{y}$ lies in the plane spanned by the first two coordinate axes, that is,
$\mathbf{y} = (\Vert \mathbf{y} \Vert \cos\theta ,\Vert \mathbf{y} \Vert \sin\theta,0,\ldots,0),$
where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$.
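This change of basis can be sketched numerically (plain Python, with arbitrary example vectors of my own choosing): build the rotated frame by Gram–Schmidt, with $e_1$ along $\mathbf{x}$ and $e_2$ in the $\mathbf{x},\mathbf{y}$-plane orthogonal to $e_1$; the coordinates of $\mathbf{x}$ and $\mathbf{y}$ in this frame come out exactly as above.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

# Example (arbitrary) vectors in m = 5 dimensions.
random.seed(1)
m = 5
x = [random.gauss(0, 1) for _ in range(m)]
y = [random.gauss(0, 1) for _ in range(m)]

# Gram-Schmidt: e1 along x, e2 = component of y orthogonal to e1, normalized.
e1 = [a / norm(x) for a in x]
y_perp = [b - dot(y, e1) * a for a, b in zip(e1, y)]
e2 = [b / norm(y_perp) for b in y_perp]

theta = math.acos(dot(x, y) / (norm(x) * norm(y)))

# Coordinates in the rotated frame:
print(dot(x, e1), norm(x))                    # x_1 = ||x||
print(dot(x, e2))                             # x_2 = 0
print(dot(y, e1), norm(y) * math.cos(theta))  # y_1 = ||y|| cos(theta)
print(dot(y, e2), norm(y) * math.sin(theta))  # y_2 = ||y|| sin(theta)
```

(Components of $\mathbf{x}$ and $\mathbf{y}$ along any direction orthogonal to both $e_1$ and $e_2$ are zero by construction.)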
Now, in our integral we integrate over an $m$-dimensional vector $\mathbf{w}$. However, since the integrand only depends on $\mathbf{x}$ and $\mathbf{y}$ through the scalar products $\mathbf{w\cdot x}$ and $\mathbf{w\cdot y}$, the last $m-2$ components of $\mathbf{w}$ are irrelevant; the scalar products do not care about the last $m-2$ components, since they are all zero for $\mathbf{x}$ and $\mathbf{y}$.
In other words, the reduction in dimensions follows from the fact that the integrals over the last $m-2$ components factor out of the rest of the integrand: each is a full Gaussian integral, $\int_{\mathbb{R}} e^{-w_i^2/2}\, dw_i = \sqrt{2\pi}$, and the resulting factor $(2\pi)^{(m-2)/2}$ cancels all but $1/(2\pi)$ of the normalization $(2\pi)^{m/2}$. Combined with the overall factor of $2$, this yields the prefactor $1/\pi$ in the final result.
We have thus reduced the integral to one over only the first two components of $\mathbf{w}$; as far as the scalar products are concerned, $\mathbf{w}$ behaves as
$\mathbf{w} = (w_1,w_2,0,\ldots,0).$
We can now easily calculate the scalar products and find
$\mathbf{w\cdot x} = \Vert \mathbf{x} \Vert w_1,$
$\mathbf{w\cdot y} = \Vert \mathbf{y} \Vert (w_1 \cos\theta + w_2 \sin\theta),$
which, substituted into the integrand, lead to the two-dimensional integral $J_n(\theta)$ presented in the paper.
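As a sanity check on the whole reduction (my own sketch, not from the paper), one can Monte-Carlo the original $m$-dimensional integral for $n = 0$ and compare it with the reduced closed form $k_0 = (\pi - \theta)/\pi$. The Gaussian weight in the integral is exactly the standard normal density, so $k_0 = 2\,\mathbb{P}(\mathbf{w}\cdot\mathbf{x} > 0,\ \mathbf{w}\cdot\mathbf{y} > 0)$ for $\mathbf{w} \sim \mathcal{N}(0, I_m)$. With orthogonal inputs ($\theta = \pi/2$) in $m = 5$ dimensions, both should give $1/2$:

```python
import math
import random

random.seed(0)
m = 5
# Orthogonal unit vectors: theta = pi/2, so k_0 = (pi - theta)/pi = 1/2.
x = [1.0, 0.0, 0.0, 0.0, 0.0]
y = [0.0, 1.0, 0.0, 0.0, 0.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Monte Carlo estimate of k_0 = 2 * E[ Theta(w.x) Theta(w.y) ], w ~ N(0, I_m).
n_samples = 100_000
hits = 0
for _ in range(n_samples):
    w = [random.gauss(0.0, 1.0) for _ in range(m)]
    if dot(w, x) > 0 and dot(w, y) > 0:
        hits += 1
k0_mc = 2 * hits / n_samples

theta = math.acos(dot(x, y))
k0_exact = (math.pi - theta) / math.pi
print(k0_mc, k0_exact)  # estimate should be close to 0.5
```

The same comparison works for any pair of inputs, since only the angle between them matters.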