Let $X$ and $Y$ be Banach spaces, and let $\mathcal B(X;Y)$ denote the set of bounded linear operators $X\to Y$. Consider the inversion map $I:U\subset\mathcal B(Y;X)\to \mathcal B(X;Y)$ defined by $I(T) = T^{-1}$, where $U$ is the set of invertible operators, i.e. the set where this makes sense. It is known, e.g. here, that $I$ is (Fréchet) differentiable and $$ I'(T)[A] = -T^{-1}AT^{-1}, $$ where $I'(T)$ is viewed as an element of $\mathcal B(\mathcal B(Y;X);\mathcal B(X;Y))$.
How do we prove that the $k^{\text{th}}$-derivative of $I$ is the $k$-multilinear map $$ (A_1,\dots,A_k) \mapsto (-1)^{k} \sum_{\sigma\in S_k} T^{-1}A_{\sigma(1)}T^{-1}\dots T^{-1}A_{\sigma(k)} T^{-1}, $$ where the sum is over all permutations $\sigma$ of $\{1,\dots,k\}$?
This formula is given in a book by Hörmander without proof (as usual). It looks like a symmetrization of the higher order terms in the Taylor expansion of $I$ (some details can be seen in this thread).
To obtain higher order derivatives, I tried to differentiate $I'$ by writing $I' = -M\circ I$, where $M(T)[A] = TAT$, and repeatedly applying the chain rule. However, the higher derivatives of $M$ get ugly really fast (or perhaps I just don't know a clean way to write them down). Is there a nice way to prove this result?
Write $f = I$ for the inversion map. For small enough $t$ (namely $|t|\,\|T^{-1}S\| < 1$) we have the power series expansion
$$ f(T + tS) = (T + tS)^{-1} = \big(T(1 - (-tT^{-1}S))\big)^{-1} = \big(1 - (-tT^{-1}S)\big)^{-1} T^{-1} = \sum_{k=0}^{\infty} (-1)^k (T^{-1}S)^k T^{-1}\, t^k, $$
where the factor $T^{-1}$ lands on the right because $(AB)^{-1} = B^{-1}A^{-1}$. Using it we see that
$$ (D^k f)|_{T}(S, \dots, S) = \left( \frac{d}{dt} \right)^k f(T + tS)\Big|_{t=0} = k! \, (-1)^k (T^{-1} S)^k T^{-1}. $$
Now we know that $D^k f|_{T}(S_1,\dots,S_k)$ is symmetric and uniquely determined by its diagonal values $D^k f|_{T}(S,\dots,S)$ via the polarization identity, so we can just guess that $$ D^k f|_{T}(S_1,\dots,S_k) = (-1)^k \sum_{\sigma \in S_k} T^{-1} S_{\sigma(1)} \cdots T^{-1} S_{\sigma(k)} T^{-1}, $$
and since this is a bounded $k$-multilinear map that is symmetric in $S_1,\dots,S_k$ and coincides with our expression when $S_1 = \dots = S_k = S$ (all $k!$ terms of the sum are then equal), our guess must hold.
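As a sanity check, here is the case $k=2$ worked out directly. This is only a sketch, assuming nothing beyond the first-derivative formula $Df|_T(S) = -T^{-1}ST^{-1}$ from the question and the product rule. Differentiating $T\mapsto -T^{-1}S_1T^{-1}$ in the direction $S_2$, each factor $T^{-1}$ contributes $-T^{-1}S_2T^{-1}$ in turn:
$$ D^2 f|_{T}(S_1,S_2) = -\big(-T^{-1}S_2T^{-1}\big)S_1T^{-1} - T^{-1}S_1\big(-T^{-1}S_2T^{-1}\big) = T^{-1}S_2T^{-1}S_1T^{-1} + T^{-1}S_1T^{-1}S_2T^{-1}, $$
which is $(-1)^2\sum_{\sigma\in S_2} T^{-1}S_{\sigma(1)}T^{-1}S_{\sigma(2)}T^{-1}$, as the general formula predicts. The same bookkeeping suggests an induction on $k$: each of the $k!$ terms has $k+1$ factors $T^{-1}$, and differentiating one of them in the direction $S_{k+1}$ inserts $S_{k+1}$ at that position with an extra sign, giving $(k+1)!$ terms with sign $(-1)^{k+1}$.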