I want to solve the following optimization problem. If $R$ is a positive semi-definite matrix and $D$ is a diagonal matrix, both of size $T\times T$, then solve
$\min_{D}\;\operatorname{trace}\!\left[(R + D)^{\dagger}\right]$
such that trace(D) $\leq$ P
Here $A^{\dagger}$ denotes the pseudo-inverse of $A$.
Any approach using matrix calculus, Lagrange multipliers, or any other method is welcome.
$ \def\l{\left} \def\r{\right} \def\lr#1{\l(#1\r)} \def\fracLR#1#2{\lr{\frac{#1}{#2}}} \def\o{{\tt1}} \def\a{\alpha} \def\b{\beta} \def\d{\delta} \def\p{\partial} \def\trace#1{\operatorname{Tr}\lr{#1}} \def\diag#1{\operatorname{diag}\lr{#1}} \def\Diag#1{\operatorname{Diag}\lr{#1}} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} $To avoid confusion with the differential operator, rename the variable $D\to A$.
Consider the case when $R=R^T$ and $A_{kk}>0$; then $(R+A)$ is invertible, so we can dispense with the pseudoinverse and write the objective function as $$\eqalign{ \phi &= \trace{(R+A)^{-1}} \\ d\phi &= -(R+A)^{-2}:dA \\ }$$ where $(:)$ denotes the Frobenius product, which is a convenient notation for the trace $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{AB^T} \\ A:A &= \big\|A\big\|^2_F \\ }$$
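As a quick sanity check (not part of the derivation), the differential identity $d\phi = -(R+A)^{-2}:dA$ can be verified numerically on a randomly generated symmetric PSD matrix $R$ and a positive diagonal $A$:

```python
import numpy as np

# Numerical check of d(phi) = -(R+A)^{-2} : dA for phi(A) = Tr[(R+A)^{-1}],
# using a randomly generated symmetric PSD matrix R (illustrative example).
rng = np.random.default_rng(0)
T = 5
M = rng.standard_normal((T, T))
R = M @ M.T                          # symmetric positive semi-definite
A = np.diag(rng.uniform(1.0, 2.0, T))

phi = lambda X: np.trace(np.linalg.inv(R + X))
G = -np.linalg.matrix_power(np.linalg.inv(R + A), 2)  # gradient: -(R+A)^{-2}

dA = np.diag(rng.standard_normal(T)) * 1e-6  # small diagonal perturbation
lhs = phi(A + dA) - phi(A)                   # actual change in phi
rhs = np.sum(G * dA)                         # Frobenius product G : dA
print(abs(lhs - rhs))                        # tiny, O(|dA|^2)
```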
Introduce the unconstrained vector $(b)$ and its associated diagonal matrix $(B)$ and scalar magnitude $(\b)$. Then use it to construct a matrix which satisfies the constraints (rename the scalar $P\to\a$). $$\eqalign{ B &= \Diag{b} \quad&\implies\quad \trace{B}=\b \\ \b &= \o^Tb \quad&\implies\quad d\b = \o^Tdb = \trace{dB} \\ A &= \fracLR{\a}{\b} B \quad&\implies\quad \trace{A}=\a \\ a &= \fracLR{\a}{\b} b \quad&\implies\quad \o^Ta=\a \\ }$$ Calculate the differential of the constructed matrix $$\eqalign{ da &= \fracLR{\a\b\,db-\a b\,d\b}{\b^2} \\ &= \frac{\a}{\b^2}\lr{\b I-b\o^T} db \\ }$$ and substitute it into the diagonalized differential of the function to obtain the unconstrained gradient $$\eqalign{ d\phi &= -\diag{(R+A)^{-2}}:da \\ &= -g:da \\ &= -\frac{\a}{\b^2}g:\lr{\b I-b\o^T} db \\ &= +\frac{\a}{\b^2}\lr{\o b^Tg-\b g}: db \\ \grad{\phi}{b} &= \frac{\a}{\b^2}\lr{\o b^Tg-\b g} \\ }$$ Setting the gradient to zero yields $$\eqalign{ \lr{\o b^T}g & = \b g \\ }$$ This is an eigenvalue equation for a rank-$\o$ matrix, which has a single non-zero eigenvalue corresponding to the eigenvector $\o$ (the all-ones vector). Therefore $g$ is equal to this eigenvector (or is a scalar multiple of it). $$\eqalign{ \o &= \lambda g = \diag{\lr{R+A}^{-2}} \\ I &= \lambda \Diag{\diag{\lr{R+A}^{-2}}} \\ &= \lambda I\odot\lr{R+A}^{-2} \\ }$$ where $(\odot)$ denotes the elementwise/Hadamard product.
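The unconstrained gradient above can likewise be checked against central finite differences (again on a randomly generated $R$; symbol names in the code mirror the derivation):

```python
import numpy as np

# Check grad_b phi = (alpha/beta^2) * ((b.g) 1 - beta g), g = diag((R+A)^{-2}),
# where A = (alpha/beta) Diag(b), against central finite differences.
rng = np.random.default_rng(1)
T, alpha = 4, 3.0
M = rng.standard_normal((T, T))
R = M @ M.T                                   # random symmetric PSD test matrix
b = rng.uniform(0.5, 1.5, T)

def phi(b):
    A = np.diag(alpha / b.sum() * b)          # construction enforces Tr(A) = alpha
    return np.trace(np.linalg.inv(R + A))

beta = b.sum()
A = np.diag(alpha / beta * b)
g = np.diag(np.linalg.matrix_power(np.linalg.inv(R + A), 2))
grad = alpha / beta**2 * ((b @ g) * np.ones(T) - beta * g)

eps = 1e-6                                    # central finite differences
fd = np.array([(phi(b + eps * np.eye(T)[i]) - phi(b - eps * np.eye(T)[i]))
               / (2 * eps) for i in range(T)])
print(np.max(np.abs(grad - fd)))              # small discrepancy
```

Note that $\grad{\phi}{b}\cdot b = 0$ automatically, since $\phi$ depends on $b$ only through the scale-invariant ratio $a=\a b/\b$.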
This is a nonlinear equation, but it can be rearranged into an iterative relationship (which hopefully converges) $$\eqalign{ A_0 &= I \\ Z_+ &= A\odot\lr{R+A}^{-2} \\ A_+ &= \frac{P\,Z_+}{\trace{Z_+}} \qquad&\big({\rm enforce\,the\,constraint}\big) \\ }$$ Initializing $A$ to a random diagonal matrix might improve its convergence prospects.
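A minimal sketch of that fixed-point iteration (convergence is not guaranteed; $R$ below is a randomly generated test matrix):

```python
import numpy as np

# Fixed-point iteration A_+ = P Z_+ / Tr(Z_+), with Z_+ = A * (R+A)^{-2}
# taken elementwise (Hadamard product). Illustrative test case only.
rng = np.random.default_rng(2)
T, P = 5, 2.0
M = rng.standard_normal((T, T))
R = M @ M.T                                  # symmetric positive semi-definite

A = np.eye(T)                                # A_0 = I, as above
for _ in range(500):
    Z = A * np.linalg.matrix_power(np.linalg.inv(R + A), 2)  # A (Hadamard) (R+A)^{-2}
    A_new = P * Z / np.trace(Z)              # rescale to enforce Tr(A) = P
    if np.max(np.abs(A_new - A)) < 1e-12:    # stop when the iterate stabilizes
        A = A_new
        break
    A = A_new

print(np.trace(A))                           # equals P by the rescaling step
# If a true fixed point was reached, diag((R+A)^{-2}) should be nearly constant:
print(np.diag(np.linalg.matrix_power(np.linalg.inv(R + A), 2)))
```

Since $A$ is diagonal with positive entries, the Hadamard product keeps every iterate diagonal and positive, so the inverse always exists along the way.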
The zero eigenvectors of $(\o b^T)$ are also possible extremal solutions. You can explore them by choosing $g$ to be perpendicular to $b$ (and therefore perpendicular to $a$).
This leads to a very different system of nonlinear equations $$\eqalign{ f(a) &= a^T\diag{\big[R+\Diag{a}\big]^{-2}} \;\doteq\; 0 \\ }$$ which could be solved using Newton's method or a quasi-Newton method.
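As a rough numerical sketch (an assumption on my part, not a full Newton implementation), one can explore this condition by minimizing $f(a)^2$ subject to the budget constraint $\o^Ta=P$ with SciPy's SLSQP solver; a proper Newton method would instead supply the Jacobian of the residual:

```python
import numpy as np
from scipy.optimize import minimize

# Explore roots of f(a) = a^T diag[(R + Diag(a))^{-2}] by minimizing f(a)^2
# under the constraint sum(a) = P. R is a randomly generated PSD test matrix;
# note a must have mixed signs for f(a) = 0, since the diag term is positive.
rng = np.random.default_rng(3)
T, P = 4, 2.0
M = rng.standard_normal((T, T))
R = M @ M.T

def f(a):
    g = np.diag(np.linalg.matrix_power(np.linalg.inv(R + np.diag(a)), 2))
    return a @ g                      # scalar residual to drive toward zero

res = minimize(lambda a: f(a)**2,
               x0=np.full(T, P / T),  # feasible, all-positive starting point
               method='SLSQP',
               constraints=[{'type': 'eq', 'fun': lambda a: a.sum() - P}])
print(res.x, f(res.x))
```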