I am working through the book Geometric Deep Learning (https://arxiv.org/abs/2104.13478) and have hit the following formula (Chapter 3.5, page 27).
If $f$ is linear and $\mathfrak G$-invariant and $\mu(\mathfrak g)$ is the Haar measure of the group $\mathfrak G$, then for all $x\in \mathcal{X}(\Omega)$, $$f(x) = \frac{1}{\mu(\mathfrak G)}\int_{\mathfrak G} f(\mathfrak g.x)\,d\mu(\mathfrak g) = f \left(\frac{1}{\mu(\mathfrak G)}\int_{\mathfrak G}(\mathfrak g.x)\,d\mu(\mathfrak g)\right)$$ which indicates that $f$ only depends on $x$ through the $\mathfrak G$-average $Ax= \frac{1}{\mu(\mathfrak G)}\int_{\mathfrak G} (\mathfrak g.x)\,d\mu(\mathfrak g)$.
I have two quick questions.
It doesn't seem obvious to me that the function $f$ being linear and $\mathfrak G$-invariant allows the integral $\int_{\mathfrak G}\cdot \, d\mu(\mathfrak g)$ to just go inside the function $f$. The authors gave no explanation of how this is done. How is the formula derived?
I don't know what this operation of the integral $\int_{\mathfrak G}\cdot \, d\mu(\mathfrak g)$ 'going inside' the function $f$ is called. Would 'commute' be an appropriate term? (For example: the function $f$ and the integral $\int_{\mathfrak G}\cdot \, d\mu(\mathfrak g)$ commute with each other.)
$$\int_G F(\phi(g))\, d\mu(g) = F\left(\int_G \phi(g)\,d\mu(g)\right).$$
The claim is true for $\phi=\mathbf{1}_A$ the indicator function of a measurable subset $A\subseteq G$ (by definition $\mathbf{1}_A(g)=1$ if $g\in A$ and $0$ otherwise); indeed we have
\begin{align*} \int_G F(\mathbf{1}_A(g))\, d\mu(g) =\int_A F(1) \,d\mu(g) = F(1) \mu(A) \stackrel{(\ast)}{=} F(\mu(A)) =F\left(\int_G \mathbf{1}_A(g)\,d\mu(g)\right), \end{align*}
where we used the linearity of $F$ for the equality marked with $(\ast)$. Next, again by the linearity of $F$ the claim is true for simple measurable functions (functions that are finite linear combinations of indicator functions, i.e. functions of the form $\sum_{i=1}^nc_i\mathbf{1}_{A_i}$). Finally one can approximate any $\phi\in L^2(G,\mu;\mathcal{H})$ by an increasing sequence of simple measurable functions, and the general result follows from the continuity of $F$.
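As a sanity check of the simple-function step, here is a minimal numerical sketch. Everything in it is an illustrative assumption rather than something from the book: a finite weighted set of 8 points stands in for $(G,\mu)$, $\mathcal{H}=\mathbb{R}^3$, the partition `labels`, the values `c`, and the functional `F(v) = <w, v>` are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical finite measure space standing in for (G, mu):
# 8 points, each carrying a positive weight mu({g}).
weights = rng.uniform(0.1, 1.0, size=8)

# A simple function phi = sum_i c_i 1_{A_i} with values in H = R^3.
# labels[g] = i means the point g lies in the set A_i; c[i] is the vector c_i.
labels = np.array([0, 0, 1, 2, 1, 0, 2, 2])
c = rng.normal(size=(3, 3))
phi = c[labels]  # shape (8, 3), phi[g] = c_i for g in A_i

# An (assumed) continuous linear functional on R^3: F(v) = <w, v>.
w = rng.normal(size=3)

def F(v):
    return v @ w

# Left side: integrate F(phi(g)) against mu.
integral_of_F_phi = np.sum(weights * F(phi))
# Right side: integrate phi against mu first, then apply F.
F_of_integral = F(weights @ phi)

print(np.isclose(integral_of_F_phi, F_of_integral))
```

On a finite space both integrals are finite weighted sums, so the equality is exactly the linearity of `F`; the limiting argument in the text is what extends this to general $\phi$.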
In our case the Hilbert space $\mathcal{H}=\mathcal{X}(\Omega;\mathcal{C})$ is the space of square-integrable signals (according to p.11, where $\mu$ is not the same as the Haar measure on $G$), and $\phi:G\to \mathcal{H}$, $g\mapsto g. x$ for some arbitrary but fixed $x\in\mathcal{X}(\Omega;\mathcal{C})$. Consequently there are some subtleties regarding the simple approximation and the continuity of $F$. The $G$-invariance of $F$ comes into play in the equality
$$F(x)=\frac{1}{\mu(G)}\int_G F(g.x) \,d\mu(g).$$
(Another subtlety is that it seems to be assumed that the group $G$ admits a finite Haar measure, or else there are again further subtleties involved with interpreting $\dfrac{1}{\mu(G)}$ properly.)
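For a finite group none of these subtleties arise, and the whole chain of equalities can be checked numerically. The following is only a sketch under assumed choices that are not in the book: $G$ is the cyclic group of shifts acting on $\mathbb{R}^6$, the Haar measure is the counting measure (so $\mu(G)=6$), and `f` is a hypothetical shift-invariant linear functional, namely a multiple of the sum of the entries.

```python
import numpy as np

def orbit(x):
    """All cyclic shifts g.x of x, one row per element of the cyclic group."""
    return np.stack([np.roll(x, k) for k in range(len(x))])

def group_average(x):
    """A x = (1/|G|) * sum_g g.x, the finite analogue of the Haar average."""
    return orbit(x).mean(axis=0)

def f(x):
    """An assumed linear, shift-invariant functional: a multiple of the sum."""
    return 2.5 * np.sum(x)

rng = np.random.default_rng(0)
x = rng.normal(size=6)

lhs = f(x)                                  # f(x)
mid = np.mean([f(gx) for gx in orbit(x)])   # (1/|G|) sum_g f(g.x), uses invariance
rhs = f(group_average(x))                   # f((1/|G|) sum_g g.x), uses linearity

print(np.isclose(lhs, mid), np.isclose(mid, rhs))
```

The first equality only needs invariance of `f`, the second only linearity, which mirrors how the two hypotheses enter the argument.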
$$A^\alpha:\mathcal{X}(\Omega;\mathcal{C})\to \mathcal{X}(\Omega;\mathcal{C}), x\mapsto \dfrac{1}{\mu(G)}\int_G K^\alpha_g(x)\,d\mu(g).$$
(Note that for any $x$, $A^\alpha(x)$ is $K^\alpha$-invariant, and if $x$ were $K^\alpha$-invariant to begin with then $A^\alpha(x)=x$; so that $A^\alpha$ is a projection operator onto the subspace of $K^\alpha$-invariant signals.)
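The projection property can be illustrated in the same finite setting. This is a hedged sketch, not the book's construction: the cyclic shift matrix `P` on $\mathbb{R}^5$ is a stand-in for the representation $K^\alpha$, and `A` is the resulting averaging matrix.

```python
import numpy as np

n = 5
# Permutation matrix of a cyclic shift by one position: (P @ x)[i] = x[i - 1].
P = np.roll(np.eye(n), 1, axis=0)

# Averaging operator A = (1/n) * sum_k P^k over the cyclic group of order n.
A = sum(np.linalg.matrix_power(P, k) for k in range(n)) / n

rng = np.random.default_rng(2)
x = rng.normal(size=n)
invariant = np.full(n, 3.0)  # constant signals are exactly the shift-invariant ones

print(np.allclose(P @ (A @ x), A @ x))       # A x is invariant under the action
print(np.allclose(A @ invariant, invariant)) # A fixes invariant signals
print(np.allclose(A @ A, A))                 # hence A is idempotent (a projection)
```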
(Along these lines, there seems to be a typo in the last paragraph on p.27: $A\circ U$ ought to take values in $\mathcal{X}(\Omega,\mathcal{C}'')$, instead of $\mathcal{C}''$.)
With this notation, the above claim becomes, for $F$ (continuous and) linear and $K^\alpha$-invariant:
$$A^\alpha( F(\phi) ) = F( A^\alpha(\phi) ),$$
that is,
$$A^\alpha\circ F=F\circ A^\alpha.$$