So today I was watching a lecture by Ian Goodfellow about adversarial attacks on deep networks, and he made an interesting point I don't really understand. He said that because the activation function we mostly use is ReLU, the model is mostly linear as a function of its input, but it is not linear as a function of its parameters, since weights from different layers get multiplied together. I don't understand why he said that, because during forward propagation we go through nonlinear calculations as well. Even though ReLU is piecewise linear, when we use the network on a task, isn't the model nonlinear overall? The lecture link is right here: https://www.youtube.com/watch?v=CIfsB_EYsVI&list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv&index=16
Does anyone understand this and can clarify it for me? Thanks
For most of the actual activation values at the neurons, each unit is operating in one of ReLU's two linear pieces (either fully on or fully off), so around any given input the network computes a linear function of that input. A small perturbation usually doesn't flip any unit on or off, which is why adversarial perturbations behave so much like attacks on a linear model. That's all it means.
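Here is a minimal sketch of both halves of the claim, using a tiny random ReLU net (hypothetical toy model, no biases, just for illustration): locally the output is linear in the input, but it is clearly nonlinear in the weights.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 4))   # input -> hidden
W2 = rng.normal(size=(16, 16))  # hidden -> hidden
w3 = rng.normal(size=16)        # hidden -> scalar output

def relu(z):
    return np.maximum(0.0, z)

def f(x, s=1.0):
    # s scales every weight matrix, to probe linearity in the PARAMETERS
    return (s * w3) @ relu((s * W2) @ relu((s * W1) @ x))

x = rng.normal(size=4)   # an arbitrary operating point
v = rng.normal(size=4)   # an arbitrary direction
eps = 1e-5               # small enough that no ReLU flips on/off

# Locally linear in the INPUT: equal input steps give equal output steps,
# because the on/off pattern of the ReLUs stays fixed in a small region.
d1 = f(x + eps * v) - f(x)
d2 = f(x + 2 * eps * v) - f(x + eps * v)
print(np.isclose(d1, d2))  # True

# Nonlinear in the PARAMETERS: doubling every weight multiplies the
# output by 2**3 = 8 (one factor of 2 per layer), not by 2, since
# relu(c * z) = c * relu(z) for c > 0.
print(np.isclose(f(x, s=2.0), 8 * f(x)))  # True
```

So "the model is mostly linear" is a statement about the input-output map near a given point; the loss as a function of the weights is still very much nonlinear, which is why training stays hard while linear-style adversarial attacks (like FGSM) work well.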