Let $x_{1}, \dots, x_{n} \sim N(0, \Sigma_{d \times d})$ be $n$ i.i.d. Gaussian vectors, and let $X$ be the $n \times d$ matrix whose $i$-th row is $x_{i}^{\mathsf{T}}$. It is then known that, for $n > d + 1$, $$ \mathbb{E}[(X^{\mathsf{T}}X)^{-1}] = \frac{1}{n - d - 1}\Sigma^{-1} $$ since $(X^{\mathsf{T}}X)^{-1}$ follows an inverse-Wishart distribution.
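As a sanity check on the Gaussian formula, here is a minimal Monte Carlo sketch in plain Python. Everything in it is an illustrative assumption rather than part of the question: the choice $d = 2$, $n = 10$, the particular Cholesky factor `chol` (which gives $\Sigma = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{pmatrix}$ and $\Sigma^{-1} = \begin{pmatrix} 1.25 & -0.5 \\ -0.5 & 1 \end{pmatrix}$), and the helper names `sample_inverse_gram` and `monte_carlo_mean`.

```python
import random


def sample_inverse_gram(n, chol):
    """Draw X with n i.i.d. N(0, Sigma) rows (d = 2, Sigma = L L^T) and
    return the distinct entries of the symmetric matrix (X^T X)^{-1}."""
    (l11, _), (l21, l22) = chol  # lower-triangular Cholesky factor L
    a = b = c = 0.0  # X^T X = [[a, b], [b, c]]
    for _ in range(n):
        z1 = random.gauss(0.0, 1.0)
        z2 = random.gauss(0.0, 1.0)
        x1 = l11 * z1              # x = L z has covariance L L^T
        x2 = l21 * z1 + l22 * z2
        a += x1 * x1
        b += x1 * x2
        c += x2 * x2
    det = a * c - b * b
    # Closed-form inverse of a 2x2 symmetric matrix: (top-left, off-diagonal, bottom-right)
    return (c / det, -b / det, a / det)


def monte_carlo_mean(n, chol, trials, seed=0):
    """Average (X^T X)^{-1} over independent draws of X."""
    random.seed(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(trials):
        inv = sample_inverse_gram(n, chol)
        for k in range(3):
            sums[k] += inv[k]
    return [s / trials for s in sums]


# Hypothetical example values (not from the question).
n, d = 10, 2
chol = ((1.0, 0.0), (0.5, 1.0))   # Sigma = [[1, 0.5], [0.5, 1.25]]
sigma_inv = (1.25, -0.5, 1.0)     # entries of Sigma^{-1}, since det(Sigma) = 1

estimate = monte_carlo_mean(n, chol, trials=20000)
predicted = [v / (n - d - 1) for v in sigma_inv]

print("Monte Carlo estimate:", estimate)
print("Sigma^{-1}/(n-d-1):  ", predicted)
```

The two printed triples should agree up to Monte Carlo error. Replacing the `random.gauss` draws with another zero-mean, unit-variance sampler would give a quick way to probe numerically how far the non-Gaussian case deviates.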
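Now, if we replace the Gaussian distribution with any other zero-mean distribution with the same covariance matrix, we would still get $$ \mathbb{E}[X^{\mathsf{T}}X] = n\Sigma $$ and so, by Jensen's inequality (matrix inversion is operator convex on positive definite matrices), $$ \mathbb{E}[(X^{\mathsf{T}}X)^{-1}] \succeq \frac{1}{n} \Sigma^{-1} $$ in the Loewner (positive semidefinite) order, provided $X^{\mathsf{T}}X$ is almost surely invertible.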
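For completeness, here is a short sketch of why this lower bound holds in the positive semidefinite order (written $\succeq$), assuming $A = X^{\mathsf{T}}X$ is almost surely positive definite and $\Sigma$ is invertible. The block matrix on the left is positive semidefinite because its Schur complement $A^{-1} - A^{-1}$ vanishes; taking expectations preserves positive semidefiniteness, and a second Schur complement gives the bound: $$ \begin{pmatrix} A & I \\ I & A^{-1} \end{pmatrix} \succeq 0 \;\Longrightarrow\; \begin{pmatrix} \mathbb{E}[A] & I \\ I & \mathbb{E}[A^{-1}] \end{pmatrix} \succeq 0 \;\Longrightarrow\; \mathbb{E}[A^{-1}] \succeq (\mathbb{E}[A])^{-1} = \frac{1}{n}\Sigma^{-1}. $$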
My question is: in what cases can $\mathbb{E}[(X^{\mathsf{T}}X)^{-1}]$ be computed in closed form, and does it have to be proportional to $\Sigma^{-1}$? If it cannot be computed, what are known ways to upper bound it? Any references, even tangentially related, would be much appreciated.