It is well known that, for any continuous function $f: D \subset \mathbb{R} \to \mathbb{R}$ on a compact domain $D$, there is an ordinary feedforward neural network capable of approximating $f$ with arbitrarily small error. Finding that network is a different story, though.
My question is: can we give an upper bound on the number of neurons a network needs to learn a function like $f(x)=x^2$, $x \in [a,b]$, making no error greater than a fixed $\epsilon > 0$?
For the sake of simplicity, let's assume we stick to networks with a single hidden layer that use the rectifier (ReLU) as their activation function.
Claim 1: For any $N$ points $x_1 < x_2 < \dots < x_N$ in $\mathbb R$ and any $N$ values $y_1, y_2, \dots, y_N$ in $\mathbb R$, there exists a rectifier network with $N-1$ neurons that linearly interpolates these values. (I think you can put the proof of this together yourself.)
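To make Claim 1 concrete, here is a sketch in Python/NumPy (the helper names are mine, not standard): the network $f(x) = c + \sum_{i=1}^{N-1} a_i \max(0, x - x_i)$ puts a hinge at each of the first $N-1$ points, and choosing each $a_i$ as the increment between consecutive segment slopes makes $f$ pass through all $N$ points.

```python
import numpy as np

def relu_interpolator(xs, ys):
    """Weights (t, a, c) of f(x) = c + sum_i a_i * max(0, x - t_i),
    a one-hidden-layer ReLU network with len(xs) - 1 neurons that
    passes through every point (xs[i], ys[i])."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    slopes = np.diff(ys) / np.diff(xs)   # slope of each linear segment
    a = np.diff(slopes, prepend=0.0)     # slope increment added at each hinge
    t = xs[:-1]                          # hinge locations: first N-1 points
    c = ys[0]                            # value at (and left of) the first point
    return t, a, c

def evaluate(t, a, c, x):
    x = np.asarray(x, float)
    return c + np.maximum(0.0, x[:, None] - t[None, :]) @ a

# sanity check: interpolate x^2 at 6 points with 5 neurons
xs = np.linspace(-1.0, 2.0, 6)
t, a, c = relu_interpolator(xs, xs**2)
assert np.allclose(evaluate(t, a, c, xs), xs**2)
```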
For $a = x_1 < x_2 < \dots < x_{N-1} < x_N = b$, the piecewise linear interpolant of the values $a^2, x_2^2, \dots, x_{N-1}^2, b^2$ makes a pointwise error that, for each $x \in [x_i,x_{i+1}]$, is given by $$ \left|x^2 - x_i^2 - (x - x_i) \frac{x_{i+1}^2-x_i^2}{x_{i+1}-x_i}\right| = |(x-x_i)(x+x_i) - (x-x_i)(x_{i+1} + x_{i})| = |x-x_i|\, |x-x_{i+1}|. $$ We can bound this term by $\max_{i=2,\dots, N} |x_{i} - x_{i-1}|^2$ (in fact by a quarter of that, since the product is maximized at the midpoint of $[x_i, x_{i+1}]$).
Choosing equally spaced points $x_i = a + (i-1)(b-a)/(N-1)$, we get that the error of the piecewise linear interpolant is at most $(b-a)^2/(N-1)^2$.
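A quick numerical check of this bound, as a sketch: `np.interp` stands in for the piecewise linear interpolant that Claim 1 realizes as a ReLU network.

```python
import numpy as np

a_, b_, N = -1.0, 2.0, 10                 # interval [a, b] and N grid points
xs = np.linspace(a_, b_, N)               # x_i = a + (i-1)(b-a)/(N-1)
grid = np.linspace(a_, b_, 100001)        # dense evaluation grid
interp = np.interp(grid, xs, xs**2)       # piecewise linear interpolant of x^2
err = np.max(np.abs(grid**2 - interp))    # sup-norm error on the dense grid
h = (b_ - a_) / (N - 1)                   # spacing between interpolation points
assert err <= h**2                        # the bound derived above
# the maximum error is h**2 / 4, attained at the midpoint of each segment
assert abs(err - h**2 / 4) < 1e-6
```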
Combining this with Claim 1: with $N$ neurons the error is at most $(b-a)^2/N^2$, so $N \geq (b-a)/\sqrt{\epsilon}$ neurons suffice for error at most $\epsilon$.
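Putting the pieces together, here is a sanity check of the neuron count for a given $\epsilon$ (again using `np.interp` as a stand-in for the rectifier network with $N$ neurons from Claim 1):

```python
import numpy as np
from math import ceil, sqrt

a_, b_, eps = 0.0, 3.0, 1e-3
N = ceil((b_ - a_) / sqrt(eps))           # neurons required by the bound
xs = np.linspace(a_, b_, N + 1)           # N neurons interpolate N + 1 points
grid = np.linspace(a_, b_, 200001)        # dense evaluation grid
err = np.max(np.abs(grid**2 - np.interp(grid, xs, xs**2)))
assert err <= eps                         # error stays below the target
```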
If you allow more layers, you can improve this approximation rate significantly. For a fixed number of layers $L$, Lemma A.3 of https://arxiv.org/pdf/1709.05289.pdf gives an estimate you could use. If you allow the number of layers to grow indefinitely, then Proposition 2 of https://arxiv.org/abs/1610.01145 can be used.