Deriving a linear algebra equality in a gradient-based meta-learning update


Context:

I am reading the paper "Gradient-based meta-learning with learned layerwise metric and subspace" (arXiv), and I am having trouble with one of the equalities on page 5.

The authors state the following (where bold font denotes a matrix and non-bold font denotes a vector):

\begin{equation} y = \boldsymbol{TW}x = \boldsymbol{A}x \end{equation}

They then go on to say that their new learning rule update is given as:

\begin{equation} y_{new} = \boldsymbol{T}(\boldsymbol{T}^{-1}\boldsymbol{A} - \alpha \nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T})x \end{equation}

which they say is equal to the following:

\begin{equation} y_{new} = y - \alpha (\boldsymbol{TT}^{\intercal})\nabla_{\boldsymbol{A}}\mathcal{L_T}x \end{equation}

Question:

However, I don't quite understand how this is derived; in particular, I am unsure where the term $(\boldsymbol{TT}^{\intercal})\nabla_{\boldsymbol{A}}\mathcal{L_T}$ comes from. Can someone please give me some guidance on how to resolve this?

My attempt:

Here is my attempt, but I am unsure how to derive the RHS.

\begin{align}
y_{new} &= \boldsymbol{T}(\boldsymbol{T}^{-1}\boldsymbol{A} - \alpha \nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T})x \\
&= \boldsymbol{T}\boldsymbol{T}^{-1}\boldsymbol{A}x - \alpha \boldsymbol{T}\nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T}x \\
&= \boldsymbol{I}y - \alpha \boldsymbol{T}\nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T}x \\
&= y - \alpha \boldsymbol{T}\nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T}x \\
&= \dots \\
\end{align}
\end{align}
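As a sanity check (not a derivation), I also verified the claimed identity numerically. Here is a small sketch assuming $\mathcal{L_T}$ is a simple quadratic loss $\frac{1}{2}\|\boldsymbol{A}x - t\|^2$ with a made-up target $t$ (the paper does not specify the loss; this is my assumption). I compute both gradients by central finite differences and compare my last line above, $\boldsymbol{T}\nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T}$, against the paper's term $(\boldsymbol{TT}^{\intercal})\nabla_{\boldsymbol{A}}\mathcal{L_T}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
T = rng.normal(size=(n, n))
W = rng.normal(size=(n, n))
x = rng.normal(size=n)
t = rng.normal(size=n)  # target for my assumed quadratic loss


def loss_W(M):
    """Loss as a function of W = T^{-1} A, i.e. L(M) = 0.5 * ||T M x - t||^2."""
    return 0.5 * np.sum((T @ M @ x - t) ** 2)


def loss_A(A):
    """The same loss written as a function of A, i.e. L(A) = 0.5 * ||A x - t||^2."""
    return 0.5 * np.sum((A @ x - t) ** 2)


def num_grad(f, M, eps=1e-6):
    """Entrywise central finite-difference gradient of f at the matrix M."""
    g = np.zeros_like(M)
    for i in range(M.shape[0]):
        for j in range(M.shape[1]):
            E = np.zeros_like(M)
            E[i, j] = eps
            g[i, j] = (f(M + E) - f(M - E)) / (2 * eps)
    return g


A = T @ W
gW = num_grad(loss_W, W)  # numeric grad w.r.t. T^{-1} A
gA = num_grad(loss_A, A)  # numeric grad w.r.t. A

# LHS: the T * grad_{T^{-1}A} L term from my last line above.
# RHS: the (T T^T) * grad_A L term the paper states.
print(np.allclose(T @ gW, T @ T.T @ gA, atol=1e-5))
```

For this loss the two sides agree to finite-difference precision, so the missing step seems to be purely about how $\nabla_{\boldsymbol{T}^{-1}\boldsymbol{A}}\mathcal{L_T}$ relates to $\nabla_{\boldsymbol{A}}\mathcal{L_T}$, but I still don't see how to derive it in general.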

Thank you very much for your time :)