Suppose I have a neural network with $L$ hidden layers, where layer $l$ has $d^l$ neurons. Which functions can be learned if I use the identity activation function? Is it the set of all linear functions?
And which functions can be learned if I limit the $k$-th hidden layer to have at most $d'$ neurons? Does this restriction change the previous answer?
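
To make the setup concrete, here is a minimal NumPy sketch of such a network. It assumes each layer computes $W x + b$ (the bias term and the specific layer widths are my assumptions, and the helper names `random_linear_net`, `forward`, and `collapse` are made up for illustration). It checks numerically that stacking identity-activation layers gives a single affine map $x \mapsto A x + c$, and that the rank of $A$ cannot exceed the width of the narrowest layer, since the rank of a matrix product is at most the smallest rank among its factors.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_linear_net(widths, rng):
    """Random (W, b) pairs for layer sizes [input, hidden..., output]."""
    return [(rng.standard_normal((n_out, n_in)), rng.standard_normal(n_out))
            for n_in, n_out in zip(widths[:-1], widths[1:])]

def forward(layers, x):
    """Run the network with the identity activation after every layer."""
    for W, b in layers:
        x = W @ x + b
    return x

def collapse(layers):
    """Fold all layers into one affine map x -> A @ x + c."""
    n_in = layers[0][0].shape[1]
    A, c = np.eye(n_in), np.zeros(n_in)
    for W, b in layers:
        A, c = W @ A, W @ c + b
    return A, c

# Illustrative widths: input 6, hidden layers 8, 2, 8, output 5.
# The second hidden layer plays the role of the restricted layer with d' = 2.
d_prime = 2
layers = random_linear_net([6, 8, d_prime, 8, 5], rng)
A, c = collapse(layers)

x = rng.standard_normal(6)
same_output = np.allclose(forward(layers, x), A @ x + c)
rank_bottleneck = np.linalg.matrix_rank(A)

print("network equals one affine map:", same_output)
print("rank of collapsed matrix:", rank_bottleneck, "(bound d' =", d_prime, ")")
```

In this sketch, widening the restricted layer to at least $\min(6, 5)$ neurons removes the extra rank constraint, which is one way to see whether the restriction matters for given input and output dimensions.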