I am interested in the approximation speed of the classic 1-hidden-layer unbounded width Neural Network as it is defined in the following paper.
- Kurt Hornik, Maxwell Stinchcombe, Halbert White, Multilayer feedforward networks are universal approximators, Neural Networks, Volume 2, Issue 5, 1989.
The 1-hidden-layer model can approximate any continuous function on a compact set w.r.t. the $\operatorname{sup}$-norm. The authors briefly note that further research is needed on the question of approximation speed (their proof uses the Stone–Weierstrass theorem). However, I am having trouble finding anything on this subject. Can someone help out here?
Note: There are some results on the approximation speed by Andrew R. Barron, but these are w.r.t. the $L^2$-norm, which I think doesn't help here.
The closest reference I could find to what you're looking for is this surprisingly short (2-page) note by Bailey and Telgarsky: General Bounds for 1-Layer ReLU approximation (pdf link), which was published as part of the SampTA 2019 conference.
In it, the authors prove that for any function $f$ belonging to a certain class of functions defined on the Euclidean unit ball, the best 1-layer approximation $\bar f$ satisfies $$\|f-\bar f\|_\infty \le O\left(\frac{1}{\sqrt n}\right),$$ where $n$ is the width of the network. The proof is not very detailed, but it seems that it wouldn't be so hard to put the pieces back together. Interestingly, in the same note, they also show that if $f$ is $c$-strongly convex on a convex subset of its domain, then the following lower bound holds: $$\|f-\bar f\|_\infty = \Omega\left(\frac{1}{n^2}\right).$$
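In one dimension, the $1/n^2$ rate is easy to see empirically: a piecewise-linear interpolant of a smooth convex function at $n+1$ equispaced knots is exactly representable by a width-$n$ one-hidden-layer ReLU network, and its sup-norm error decays like $1/n^2$. Here is a minimal sketch of that check (my own illustration, not code from the note; the choice $f(x)=x^2$ and the interpolation construction are assumptions):

```python
import numpy as np

def f(x):
    return x ** 2  # a strongly convex test function on [-1, 1]

def sup_error_pwl(n, num_eval=100001):
    # Piecewise-linear interpolation of f at n+1 equispaced knots on [-1, 1].
    # In 1D this interpolant is representable by a width-n one-hidden-layer
    # ReLU network, so it upper-bounds the best width-n approximation error.
    knots = np.linspace(-1.0, 1.0, n + 1)
    xs = np.linspace(-1.0, 1.0, num_eval)
    approx = np.interp(xs, knots, f(knots))
    return np.max(np.abs(f(xs) - approx))

# Doubling the width should divide the sup-norm error by roughly 4,
# consistent with a Theta(1/n^2) rate for this f.
for n in (10, 20, 40, 80):
    print(n, sup_error_pwl(n))
```

For $f(x)=x^2$ the interpolation error at the midpoint of each cell is $h^2/4$ with $h = 2/n$, so the printed errors shrink by a factor of $4$ at each doubling, matching the $\Theta(1/n^2)$ behavior the note predicts for strongly convex functions.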
Another paper that will be of interest to you is Uniform approximation rates and metric entropy of shallow neural networks (2022) by Ma, Siegel and Xu, in which the authors provide, in Theorem 5, uniform approximation rates for shallow neural networks with ReLU$^k$ activation.
Lastly, although they do not directly provide approximation rates, you may want to have a look at the papers A closer look at the approximation capabilities of neural networks (2020) by Kai Fong Ernest Chong and Minimum width for universal approximation (2020) by Park et al., which provide estimates on the minimum width needed for a shallow network to uniformly approximate a function $f$ with precision $\varepsilon$.