Why Was Backprop Invented?


I'm currently researching artificial neural networks and I keep wondering why we use "backpropagation" to train a neural network.

An ANN is basically just a very large and complex function $f(\mathbf{input};\mathbf{w})$, where $\mathbf{input}$ and $\mathbf{w}$ are vectors in $\mathbb{R}^n$ containing the input sequence and all the weights of the network's hidden and output layers.

Therefore, why don't we just optimize $E(\mathbf{expected}, f(\mathbf{input};\mathbf{w}))$ with respect to $\mathbf{w}$ using almost any function optimization method?
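To make the question concrete, here is a minimal sketch (toy network sizes and names of my own choosing) of exactly that black-box approach: flatten all weights into one vector $\mathbf{w}$ and minimize $E$ by plain gradient descent with finite-difference gradients, never exploiting the layered structure:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, w):
    # Unpack the flat weight vector into a toy 2->3->1 network.
    W1 = w[:6].reshape(2, 3)
    W2 = w[6:].reshape(3, 1)
    return np.tanh(x @ W1) @ W2

def E(w, x, y):
    # Mean squared error between network output and expected output.
    return float(np.mean((f(x, w) - y) ** 2))

def num_grad(w, x, y, eps=1e-6):
    # Central finite differences: two full network evaluations PER WEIGHT,
    # i.e. O(n) forward passes for an n-dimensional w.
    g = np.zeros_like(w)
    for i in range(len(w)):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        g[i] = (E(wp, x, y) - E(wm, x, y)) / (2 * eps)
    return g

x = rng.normal(size=(16, 2))
y = np.sin(x[:, :1])            # toy regression target
w = rng.normal(size=9) * 0.5    # all weights as one flat vector

E0 = E(w, x, y)
for _ in range(300):
    w -= 0.1 * num_grad(w, x, y)

print(E0, E(w, x, y))  # loss before and after optimization
```

This does work on a toy scale, which is part of why I'm asking: the obvious practical difference is that each step above costs $2n$ forward passes, whereas backpropagation computes the same gradient in roughly one forward plus one backward pass.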

Why were the "backpropagation of errors" algorithm and the layer-by-layer training scheme invented and used instead?