I would like to know how to calculate the gradient of nll_loss(log_softmax(A*A*X*W0*W1)) w.r.t. A.
All A, X, W0, W1 are 2D matrices.
Even just showing how to calculate the gradient of A*A*X*W0*W1 would also be helpful.
I'm trying to implement a function in PyTorch, so if you can show how to do it in PyTorch, that would be awesome.
Thanks!
nll_loss(log_softmax()) is the same as cross_entropy, and the description of cross_entropy can be found here: pytorch.org/docs/master/generated/… But I'm still not sure how to get the gradient w.r.t. A.
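Here is one way this could be sketched in PyTorch. The shapes (n, d, h, c) and the target labels below are made-up placeholders just for illustration. Autograd computes the gradient for you via `backward()`; the manual computation uses the chain rule with B = X·W0·W1 and Y = A·A·B, where dL/dY = (softmax(Y) − onehot(target))/n for the default mean reduction, and the product rule over the two A factors gives dL/dA = G·Bᵀ·Aᵀ + Aᵀ·G·Bᵀ:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical small shapes for illustration: n rows, d features, c classes.
n, d, h, c = 4, 5, 8, 3
A = torch.randn(n, n, requires_grad=True)  # we want dloss/dA
X = torch.randn(n, d)
W0 = torch.randn(d, h)
W1 = torch.randn(h, c)
target = torch.randint(0, c, (n,))  # made-up labels

# Autograd: build the expression and call backward().
logits = A @ A @ X @ W0 @ W1
loss = F.nll_loss(F.log_softmax(logits, dim=1), target)  # == F.cross_entropy(logits, target)
loss.backward()  # A.grad now holds dloss/dA, shape (n, n)

# Manual check via the chain rule. With B = X W0 W1 and Y = A A B:
#   G = dL/dY = (softmax(Y) - onehot(target)) / n   (mean reduction)
#   dL/dA = G B^T A^T + A^T G B^T                   (product rule over the two A factors)
with torch.no_grad():
    B = X @ W0 @ W1
    G = (F.softmax(logits, dim=1) - F.one_hot(target, c).float()) / n
    grad_A = G @ B.T @ A.T + A.T @ G @ B.T

print(torch.allclose(A.grad, grad_A, atol=1e-5))  # True
```

The autograd route is usually all you need; the closed-form expression is mainly useful as a sanity check or if the gradient has to be computed outside of PyTorch.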