I was studying this document. https://j-towns.github.io/papers/svd-derivative.pdf
In one of the sections the author claims...

Why does that equality hold? What is the reasoning behind it?
I am new to this topic, so apologies if the question is badly posed.
The trace is used to construct the Euclidean (Frobenius) scalar product for matrices: $$ \operatorname{tr}(\bar U^T dU)=\sum_{j}(\bar U^T dU)_{jj}=\sum_{ij}\bar U_{ij}\,dU_{ij} $$ The last expression is what is required to connect the gradient $\bar U$ (backward mode) with the differential $dU$ (forward mode). Up to this point, nothing is specific to the SVD; it is just the general AD formalism.
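
If it helps, the trace identity can be checked numerically. The snippet below is only a minimal sketch in plain Python: the helper names (`transpose`, `matmul`, `trace`, `frobenius_inner`) and the random $4\times 3$ matrices standing in for $\bar U$ and $dU$ are my own choices for illustration, not anything taken from the linked paper.

```python
import random

def transpose(m):
    # Swap rows and columns of a matrix stored as a list of rows.
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Plain triple-loop matrix product, written with comprehensions.
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def trace(m):
    # Sum of the diagonal entries of a square matrix.
    return sum(m[i][i] for i in range(len(m)))

def frobenius_inner(a, b):
    # Entrywise sum of products: sum_ij a_ij * b_ij.
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

random.seed(0)
rows, cols = 4, 3
# Hypothetical stand-ins for the adjoint U_bar and the differential dU.
U_bar = [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]
dU = [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

lhs = trace(matmul(transpose(U_bar), dU))  # tr(U_bar^T dU)
rhs = frobenius_inner(U_bar, dU)           # sum_ij U_bar_ij dU_ij

print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding, since the diagonal entry $(\bar U^T dU)_{jj}$ is exactly $\sum_i \bar U_{ij}\,dU_{ij}$.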