Suppose we have a system that can receive different kinds of input (the index $k$ labels the kind of input $I_k$) and performs a calculation on that input based on the internal parameters of the system, denoted $x_i$, where the index $i$ ranges from $1$ to $n$.

This system can be anything, but in my problem it is a neural network. We want to calculate a lower bound on the estimation error of the parameters for each kind of input we can stimulate the system with, and then determine which kind of input is most suitable for parameter estimation. To be specific, the parameters for which I want to calculate the error bound are the weights of the neural network, but the question is more general. Here is the approach I took:
1-Calculate the Fisher information matrix (FIM) using:
$[\mathcal{I}(\bar{x})]_{i,j}=E\left[\left(\frac{\partial}{\partial x_i}\log f(\bar{x},I_k)\right)\left(\frac{\partial}{\partial x_j}\log f(\bar{x},I_k)\right)\right]$
in which $\bar{x}$ is the vector of the parameters $x_1$ to $x_n$.
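As a sketch of step 1, the expectation above can be estimated by Monte Carlo: draw observations from the model, evaluate the score (the gradient of the log-likelihood with respect to the parameters), and average its outer products. The helpers `score_fn` and `sample_fn` below are hypothetical placeholders for whatever your network's likelihood provides:

```python
import numpy as np

def fisher_information_mc(score_fn, x, sample_fn, n_samples=10_000, rng=None):
    """Monte Carlo estimate of the FIM at parameter vector x.

    score_fn(x, data): gradient of log f(x, data) w.r.t. x, shape (n,)
    sample_fn(x, rng): one observation drawn from the model at x
    """
    rng = np.random.default_rng(rng)
    n = len(x)
    fim = np.zeros((n, n))
    for _ in range(n_samples):
        data = sample_fn(x, rng)
        s = score_fn(x, data)
        fim += np.outer(s, s)  # accumulate s s^T; divided by n_samples below
    return fim / n_samples
```

For a unit-variance Gaussian observation model, whose score is just the residual, this estimator should recover the identity matrix, which gives a quick sanity check.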
2-Invert the FIM.
3-Use the statement of the Cramer-Rao bound for the multivariate case:
$\mathrm{cov}_{\bar{x}}(I_k) \geq [\mathcal{I}(\bar{x})]^{-1}$
This basically means that $\mathrm{cov}_{\bar{x}}(I_k) - [\mathcal{I}(\bar{x})]^{-1}$ is positive semi-definite. This has a geometric interpretation which is explained here. I rephrase the content of the file below.
We can construct an ellipse (ellipsoid) from the inverse of the Fisher information matrix instead of from the covariance matrix of the parameters. This is the smallest error ellipsoid that an unbiased estimator can achieve.
4-Calculate the volume of the error ellipsoid using the determinant of the inverse FIM and the confidence region that I want to consider. More information can be found here. The ellipsoid volume is given by:
$V = \frac{(K\pi)^{n/2}}{\Gamma(\frac{n}{2}+1)}\sqrt{\det([\mathcal{I}(\bar{x})]^{-1})}$
in which $K$ is the scale factor fixed by the chosen confidence region.
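Steps 2–4 can be sketched as follows. The assumption here is that $K$ is the scale set by the chosen confidence region (under a Gaussian approximation, the chi-squared quantile with $n$ degrees of freedom):

```python
import numpy as np
from math import pi, gamma

def error_ellipsoid_volume(fim, K):
    """Volume of the minimal error ellipsoid implied by the inverse FIM.

    K sets the confidence region (e.g. a chi-squared quantile with
    n degrees of freedom under a Gaussian approximation).
    """
    n = fim.shape[0]
    det_inv = 1.0 / np.linalg.det(fim)  # det(FIM^{-1}) = 1 / det(FIM)
    return (K * pi) ** (n / 2) / gamma(n / 2 + 1) * np.sqrt(det_inv)
```

Note that for fixed $n$ and $K$ the volume is proportional to $1/\sqrt{\det\mathcal{I}(\bar{x})}$, so ranking inputs by smallest volume is equivalent to ranking them by largest FIM determinant.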
After doing these calculations for the different kinds of input $I_1, I_2, I_3$ and so on, I claim that the input for which we achieve the smallest error ellipsoid volume is the most suitable one for our metrological task.
This criterion, which amounts to maximizing the determinant of the Fisher information matrix, is known as the D-optimality criterion in optimal experimental design. When I take the case $n=1$, the problem reduces to a single-parameter estimation problem, for which the CRLB is given by:
$\mathrm{var}(\hat{x}) \geq \frac{1}{\mathcal{I}(x)}$
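As a quick numerical check of the scalar bound, consider an assumed Gaussian observation model (a stand-in, not the network itself): for $y \sim N(x,\sigma^2)$ the per-sample Fisher information is $1/\sigma^2$, and the sample mean of $N$ observations attains the bound $\sigma^2/N$:

```python
import numpy as np

sigma, N = 2.0, 500
fisher_per_sample = 1.0 / sigma**2      # I(x) for one Gaussian observation
crlb = 1.0 / (N * fisher_per_sample)    # = sigma^2 / N for N observations

rng = np.random.default_rng(0)
estimates = [rng.normal(1.0, sigma, N).mean() for _ in range(2000)]
print(np.var(estimates), crlb)  # empirical variance should hover near the bound
```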
For this case my results are consistent, and I conclude that for a certain type of input $I_{k'}$ a smaller error bound is obtained. Keep in mind that my inputs are probability distributions. For $n>1$ we need to calculate the ellipsoid volume.
Basically, if one kind of input gives a tighter error bound for $n=1$, then, given that for $n>1$ the system is composed of the same units, only repeated in a network, we would expect the same kind of input $I_{k'}$ to be optimal. This happens for some parameters but not for others, and the results vary depending on the efficiency of the system and the intensity of the input (the mean of the probability distribution describing the input).
So the question is: can we characterize the total error using the volume of the error ellipsoids, or should we do more? I am not sure whether D-optimality is enough or whether I should consider other criteria as well. The reason I'm not sure is that I have not seen any mathematical proofs of which types of criteria are needed for which types of systems. Any guidance is highly appreciated.
