I checked with an example using random matrices and noticed that the sums of the resulting diagonals are indeed the same. Also, I have that $\operatorname{trace}(\mathbf{A}^\intercal \mathbf{B}) = \operatorname{trace}(\mathbf{B} \mathbf{A}^\intercal)$ by the cyclic property of the trace. In addition, the diagonal elements of $\mathbf{C} = \mathbf{A}^\intercal \mathbf{B}$ can be written as: $$\mathbf{C}_{jj} = \sum_{i=1}^{n} (\mathbf{A}^\intercal)_{ji} b_{ij} = \sum_{i=1}^{n} a_{ij} b_{ij}.$$ Here is the trick, though: to get the trace we have to sum over all $j$, which gives $$\operatorname{trace}(\mathbf{A}^\intercal \mathbf{B}) = \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ij} b_{ij}.$$
By exchanging indexes we'd be able to get somewhere, but I am not so sure how to go about it. Any help would be really appreciated!
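For reference, the random-matrix check described above can be sketched as follows. This is only a minimal illustration, assuming square $n \times n$ real matrices; the size `n = 4`, the seed, and the variable names are arbitrary choices, not part of the original experiment. It compares $\operatorname{trace}(\mathbf{A}^\intercal \mathbf{B})$ with $\operatorname{trace}(\mathbf{B} \mathbf{A}^\intercal)$ and with the explicit double sum $\sum_{j}\sum_{i} (\mathbf{A}^\intercal)_{ji} b_{ij}$.

```python
import numpy as np

# Assumed setup: square real matrices with an arbitrary size and seed.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
At = A.T

# Traces computed directly from the matrix products.
trace_AtB = np.trace(At @ B)
trace_BAt = np.trace(B @ At)

# The same quantity as an explicit double sum over the entries:
# sum over j of (A^T B)_{jj}, with (A^T B)_{jj} = sum over i of (A^T)_{ji} B_{ij}.
double_sum = sum(At[j, i] * B[i, j] for j in range(n) for i in range(n))

print(np.isclose(trace_AtB, trace_BAt))
print(np.isclose(trace_AtB, double_sum))
```

A numerical match like this does not prove anything, but it is a quick way to catch an index mistake before attempting the algebra.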
The trace is invariant under cyclic permutations and under transposition:
$Tr(AB) = \sum_{ij} A_{ij} B_{ji} = \sum_{ij} B_{ji} A_{ij} = Tr(BA)$
and
$Tr(A^T) = \sum_{i} A^T_{ii} = \sum_{i} A_{ii} = Tr(A) $.
I'll leave the rest to you.
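The two identities above can also be sanity-checked numerically. The sketch below is only an illustration under assumed inputs: the shapes, the seed, and the names `A`, `B`, and `S` are arbitrary. The cyclic identity is tested with rectangular matrices, since $Tr(AB) = Tr(BA)$ holds whenever both products are defined.

```python
import numpy as np

# Assumed test inputs: arbitrary shapes and seed, chosen for illustration only.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))  # 3 x 5
B = rng.standard_normal((5, 3))  # 5 x 3, so A @ B is 3 x 3 and B @ A is 5 x 5
S = rng.standard_normal((4, 4))  # square matrix for the transpose identity

# Cyclic invariance: Tr(AB) == Tr(BA), even though AB and BA differ in size.
cyclic_ok = np.isclose(np.trace(A @ B), np.trace(B @ A))

# Transpose invariance: Tr(S^T) == Tr(S), since the diagonal is unchanged.
transpose_ok = np.isclose(np.trace(S.T), np.trace(S))

print(cyclic_ok, transpose_ok)
```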