Let $F(A)$ be a matrix-valued function, operating on a real matrix $A \in \mathbb{R}^{m \times n}$, that applies a scalar function $f(\lambda)$ to the singular values of $A$. That is, suppose $A$ has the singular value decomposition $$ A = U \Sigma V^\top, $$ with $U, V$ orthogonal and $\Sigma$ diagonal; then $$ B = F(A) = U F(\Sigma) V^\top, $$ where $F(\Sigma)$ is computed by applying $f$ entry-wise to the diagonal elements of $\Sigma$. Let $g$ be a scalar-valued function that depends on the matrix $B$.
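For concreteness, $F$ might be implemented like this (a numpy sketch; the helper name `F` and the choice $f=\sqrt{\cdot}$ are my own):

```python
import numpy as np

def F(A, f):
    # F(A) = U f(Sigma) V^T: apply the scalar function f to each singular value of A
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(f(sigma)) @ Vt

A = np.array([[3.0, 0.0],
              [0.0, 2.0]])
print(F(A, np.sqrt))   # diag(sqrt(3), sqrt(2)): f acts on the singular values
```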
Question: How do we find $\dfrac{\partial g(B)}{\partial A}$? Here $\dfrac{\partial g(B)}{\partial A} \in \mathbb{R}^{m \times n}$ is the matrix whose $(i,j)$-entry is $\dfrac{\partial g(B)}{\partial A_{i,j}}$. Also, I'm looking for a closed-form expression for this (if one exists), not just a procedure for computing the partial derivatives.
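To make the target concrete, the brute-force procedure I want to replace with a closed form is easy to write down (a numpy sketch; the helper names are my own):

```python
import numpy as np

def F(A, f):
    # B = F(A) = U f(Sigma) V^T
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(f(sigma)) @ Vt

def grad_fd(g, f, A, h=1e-6):
    # (i,j)-entry is dg(F(A))/dA_ij, approximated by central differences
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = h
            grad[i, j] = (g(F(A + E, f)) - g(F(A - E, f))) / (2 * h)
    return grad

# sanity check: with f the identity and g(B) = 0.5*||B||_F^2, the gradient is A itself
A = np.diag([3.0, 2.0])
print(grad_fd(lambda B: 0.5 * np.sum(B**2), lambda s: s, A))  # ≈ A
```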
$ \def\bbR#1{{\mathbb R}^{#1}} \def\b{\beta}\def\g{\gamma} \def\s{\sigma}\def\S{\Sigma}\def\e{\varepsilon} \def\l{\lambda}\def\p{\partial} \def\L{\left}\def\R{\right} \def\LR#1{\L(#1\R)} \def\vecc#1{\operatorname{vec}\LR{#1}} \def\diag#1{\operatorname{diag}\LR{#1}} \def\Diag#1{\operatorname{Diag}\LR{#1}} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\rank#1{\operatorname{rank}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} \def\bx{\boxtimes} $Assume that $A$ has distinct singular values $\{\s_k\}$ and write its (reduced) SVD as $$\eqalign{ A &= USV^T = \sum_{k=1}^r \s_k u_kv_k^T \\ A &\in\bbR{m\times n} \qquad U \in\bbR{m\times r},\; S \in\bbR{r\times r},\; V \in\bbR{n\times r} \\ r &= \rank{A} \\ }$$ Let's rename the functions $(f,g)\to(\b,\g),\,$ so that we can write the mnemonic equations $$\eqalign{ B &= \b(A) = U\b(S)V^T \quad &\{{\rm matrix\;function}\} \\ \g &= \g(B) \quad &\{{\rm scalar\;function}\} \\ }$$ and for typing convenience, define the variables $$\eqalign{ s &= \diag S \quad &\{{\rm vector\;of\;singular\;values}\} \\ p &= \b(s) \qquad &\{{\rm function\;applied\;elementwise}\} \\ q &= \b'(s) \qquad &\{{\rm derivative\;applied\;elementwise}\} \\ P &= \b(S) \,= \Diag p \;& \\ Q &= \b'(S)\!= \Diag q \\ \\ u_k &= U\e_k \\ v_k &= V\e_k \quad &\{\e_k\,{\rm are\;the\;standard\;basis\;vectors}\} \\ G &= \grad{\g}{B} \quad &\{{\rm gradient\;of\;}\g\;{\rm is\;\c{known}}\} \\ g &= \vecc G \\ b &= \vecc B \\ K &= {V\bx U} \quad &\{{\rm Khatri-Rao\;product}\} \\ \l_k &= g^TKQ\e_k \\ }$$ Use the column-wise Khatri-Rao product to expand $\vecc B$ and calculate its differential.
$$\eqalign{ B &= U\,\Diag{p}\;V^T \\ b &= Kp \\ db &= K\,\c{dp} \\ &= K\c{Q\,ds} \\ }$$ Substitute this into the differential of $\g$ $$\eqalign{ d\g &= G:dB \\ &= g^T\c{db} \\ &= g^T\c{KQ\,ds} \\ }$$ This post provides a formula for the gradient of the singular values
$$\eqalign{ d\s_k &= u_k v_k^T:dA \\ s &= \sum_{k=1}^r \e_k\star \s_k \\ ds &= \sum_{k=1}^r \e_k\star d\s_k \;=\; \LR{\sum_{k=1}^r \e_k\star u_k v_k^T}:dA \\ }$$ which yields the desired gradient $$\eqalign{ d\g &= g^TKQ\,ds \\ &= \LR{\sum_{k=1}^r\CLR{g^TKQ\e_k}\LR{u_k v_k^T}}:dA \\ &= \LR{\sum_{k=1}^r\c{\l_k} u_k v_k^T}:dA \\ \grad{\g}{A} &= \sum_{k=1}^r {\l_k u_k v_k^T} \;=\; ULV^T \\ }$$ where $L$ is the $r\times r$ diagonal matrix with diagonal entries $\l_1,\dots,\l_r$.
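As a sanity check, both the identity $b=Kp$ and the final formula can be verified numerically. In this numpy sketch, $\b(\s)=\s^2$ and $\g(B)=\tfrac12\|B\|_F^2$ are illustrative choices of my own (for this $\g$ the gradient is simply $G=B$), and the closed form is compared against central finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))          # generic A: distinct singular values a.s.

beta  = lambda s: s**2                   # example beta (my choice)
dbeta = lambda s: 2 * s                  # its elementwise derivative

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T
p, q = beta(s), dbeta(s)
B = U @ np.diag(p) @ V.T                 # B = U Diag(p) V^T

# Column-wise Khatri-Rao product K = V boxtimes U: k-th column is kron(v_k, u_k)
K = np.column_stack([np.kron(V[:, k], U[:, k]) for k in range(len(s))])
assert np.allclose(B.flatten('F'), K @ p)   # checks b = K p (column-major vec)

# gamma(B) = 0.5*||B||_F^2, so G = dgamma/dB = B and g = vec(G)
gamma = lambda B: 0.5 * np.sum(B**2)
g = B.flatten('F')

lam = (g @ K) * q                        # lambda_k = g^T K Q e_k
grad_closed = U @ np.diag(lam) @ V.T     # gradient = U L V^T

# compare against central finite differences of gamma(beta(A))
def F(M):
    Um, sm, Vmt = np.linalg.svd(M, full_matrices=False)
    return Um @ np.diag(beta(sm)) @ Vmt

h = 1e-6
grad_fd = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        E = np.zeros_like(A); E[i, j] = h
        grad_fd[i, j] = (gamma(F(A + E)) - gamma(F(A - E))) / (2 * h)

print(np.max(np.abs(grad_closed - grad_fd)))  # small (finite-difference noise level)
```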