I know we use ordinary linear regression to fit functions to datasets, but can someone explain how neural networks approximate more complex functions, especially nonlinear ones?
Intuitively, what does each layer add to the overall approximation?
What I am looking for is an explanation of how neural networks approximate functions, not a comparison with biological neurons.
There is some explanation of this in Duda, Hart, and Stork's Pattern Classification. See Section 6.2.2, "Expressive power of multilayer networks". Quoting directly from there:
...
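Separately from the quote, here is a small sketch of my own (not taken from the book) that may help with the intuition. It assumes a one-hidden-layer network with ReLU activations and sets the weights by hand instead of training them; the names `build_relu_net`, `relu`, and the choice of `sin` as the target are purely illustrative. Each hidden unit computes `relu(x - knot)`, which adds one "kink" to the output, and the output weight of a unit is the change in slope at that kink. With more hidden units the piecewise-linear output can follow a curved target more closely, while a single straight line cannot.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def build_relu_net(f, lo, hi, n_hidden):
    """Hand-build a 1-hidden-layer ReLU net that interpolates f at evenly spaced knots.

    Illustrative construction only: the weights are computed directly, not learned.
    """
    knots = np.linspace(lo, hi, n_hidden + 1)
    values = f(knots)
    slopes = np.diff(values) / np.diff(knots)   # slope of the target on each segment
    out_w = np.diff(slopes, prepend=0.0)        # output weight = change in slope at each knot
    hidden_b = -knots[:-1]                      # hidden unit k computes relu(x - knots[k])
    out_b = values[0]                           # output bias = target value at the left end

    def net(x):
        x = np.asarray(x, dtype=float)
        hidden = relu(x[..., None] + hidden_b)  # hidden layer: one column per unit
        return hidden @ out_w + out_b           # output layer: weighted sum plus bias

    return net

xs = np.linspace(0.0, 2.0 * np.pi, 1000)
target = np.sin(xs)

# Baseline: ordinary least-squares straight line.
slope, intercept = np.polyfit(xs, target, 1)
err_linear = np.max(np.abs(slope * xs + intercept - target))

# The same target approximated with 4 and with 20 hidden ReLU units.
err_4 = np.max(np.abs(build_relu_net(np.sin, 0.0, 2.0 * np.pi, 4)(xs) - target))
err_20 = np.max(np.abs(build_relu_net(np.sin, 0.0, 2.0 * np.pi, 20)(xs) - target))

print("max error, straight line:", err_linear)
print("max error, 4 hidden units:", err_4)
print("max error, 20 hidden units:", err_20)
```

In a trained network the knot positions and slope changes are found by gradient descent rather than written down, and deeper layers compose such piecewise-linear maps, but the building block is the same: a linear map followed by a nonlinearity.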