How do you calculate the gradient of nll_loss(log_softmax(A*A*X*W0*W1)) w.r.t A?


I would like to know how I can calculate the gradient of nll_loss(log_softmax(A*A*X*W0*W1)) w.r.t. A.

A, X, W0, and W1 are all 2D matrices. Even just showing how to calculate the gradient of A*A*X*W0*W1 would be helpful.

I'm trying to implement a function in PyTorch, so if you can show how to do it in PyTorch, that would be awesome.

Thanks!

1 Answer

nll_loss(log_softmax()) is the same as cross_entropy, and the description of cross_entropy can be found here: pytorch.org/docs/master/generated/…

But I'm still not sure how to get the gradient w.r.t A
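To make this concrete, here is a minimal sketch in PyTorch. Autograd computes the gradient for you: build the logits as A @ A @ X @ W0 @ W1, apply nll_loss(log_softmax(...)) (equivalently, cross_entropy), and call backward(). The code also checks the result against a hand-derived gradient from the chain rule, where the gradient of cross-entropy w.r.t. the logits Z is (softmax(Z) - onehot(target)) / n under the default mean reduction, and the product rule is applied over the two occurrences of A. The matrix sizes n, d, h, c below are arbitrary small values chosen for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d, h, c = 4, 3, 5, 2          # hypothetical small sizes for illustration
A  = torch.randn(n, n, requires_grad=True)
X  = torch.randn(n, d)
W0 = torch.randn(d, h)
W1 = torch.randn(h, c)
target = torch.randint(0, c, (n,))

# Forward pass: nll_loss(log_softmax(logits)) == cross_entropy(logits, target)
logits = A @ A @ X @ W0 @ W1
loss = F.nll_loss(F.log_softmax(logits, dim=1), target)
loss.backward()                   # autograd fills A.grad

# Manual check via the chain rule.
# Let C = X @ W0 @ W1 and Z = A @ A @ C. The gradient of cross-entropy
# w.r.t. the logits is G = (softmax(Z) - onehot(target)) / n (mean reduction).
# Applying the product rule to the two occurrences of A in Z = A @ A @ C:
#   dL/dA = G @ (A @ C).T  +  A.T @ G @ C.T
with torch.no_grad():
    C = X @ W0 @ W1
    Z = A @ A @ C
    G = (F.softmax(Z, dim=1) - F.one_hot(target, c).float()) / n
    manual_grad = G @ (A @ C).T + A.T @ G @ C.T

print(torch.allclose(A.grad, manual_grad, atol=1e-5))  # True
```

The manual formula is only needed if you want the gradient in closed form; for training, `loss.backward()` and reading `A.grad` is all you need.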