I am reading the paper "Detection of Signals by Information Theoretic Criteria" by Wax and Kailath.
In this paper, a vector $\Theta$ is defined according to:
$\Theta = (\lambda_1, \ldots, \lambda_k, \sigma^2, V_1^T, \ldots, V_k^T)$ where $\{\lambda_i\}$ are the eigenvalues of a covariance matrix (which is Hermitian, so the eigenvalues are real), $\{V_i\}$ are complex eigenvectors of length $p$, and $\sigma^2$ is a real number.
It is said that there are $k+1+2pk$ parameters in this vector, which makes sense because there are $k$ eigenvalues; $1$ parameter $\sigma^2$; and there are $k$ vectors of length $p$ with $2$ parameters in each component (accounting for the real and the imaginary part). So far, so good.
Now I get confused. The authors constrain the eigenvectors to have unit norm and to be mutually orthogonal, and say that the reduction of the degrees of freedom is $2k$ due to normalization and $2 \cdot \frac{1}{2}k(k-1)$ due to mutual orthogonalization. I don't understand why.
I would have said that each normalization equation removes $1$ degree of freedom, so $k$ of them (one for each eigenvector) would remove $k$ degrees, instead of $2k$. Also, I see that the mutual orthogonalization of $k$ vectors yields $\frac{1}{2}k(k-1)$ equations, and the same argument as before makes me think that this is also the number of degrees removed by this condition. In both cases I am missing a factor of $2$. Do you know where it comes from?
I think "normalization" means something more here. If $v$ is an eigenvector of $A$, then for any non-zero $c\in\mathbb C$, $cv$ is also an eigenvector. So even after $v$ has been scaled to unit norm, it can still be multiplied by a phase factor $e^{i\theta}$ without changing its norm, and this freedom can be used to make one chosen non-zero entry of $v$ real. Unit norm removes one real parameter and fixing the phase removes another, so only $2p-2 = 2(p-1)$ free parameters remain per eigenvector. The normalization is therefore not division by $\|v\|_2$ alone but division by $\|v\|_2e^{i\varphi}$, where $re^{i\varphi}$ is a non-zero entry of $v$.
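As a quick numerical illustration of this norm-and-phase normalization, here is a minimal Python sketch. The function name `fix_norm_and_phase` and the sample vector are made up for illustration, and the choice of the first non-zero entry as the one made real is an assumption; any non-zero entry would do.

```python
import cmath
import math


def fix_norm_and_phase(v, tol=1e-12):
    """Rescale v to unit 2-norm and rotate its phase so that the
    first non-zero entry becomes real and positive."""
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    if norm < tol:
        raise ValueError("cannot normalize the zero vector")
    # Entry r*exp(i*phi) whose phase is removed (assumption: first non-zero one).
    pivot = next(z for z in v if abs(z) > tol)
    phi = cmath.phase(pivot)
    # Divide by ||v|| * exp(i*phi): one real constraint from the norm,
    # one more from the phase.
    scale = norm * cmath.exp(1j * phi)
    return [z / scale for z in v]


v = [1 + 2j, 3 - 1j, -2j]  # arbitrary sample vector, p = 3
w = fix_norm_and_phase(v)

print(sum(abs(z) ** 2 for z in w))  # squared norm, approximately 1
print(w[0])                         # imaginary part approximately 0
```

Since `w` is just `v` times a non-zero complex scalar, it spans the same eigen-direction; the two constraints only pick one representative from it.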
For the orthogonal condition, this is much simpler to see. Just consider the following example:
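$$ (a+ib)(c+id)^*=(ac+bd)+i(bc-ad). $$ If the above product is zero, with $a,b,c,d$ real, we lose two degrees of freedom: one for the real part and another for the imaginary part. The inner product of two complex vectors is a sum of such products, so each orthogonality condition is one complex equation, that is, two real equations, and takes $2$ degrees of freedom.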
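As a sanity check, subtracting both reductions from the original count is plain arithmetic on the numbers quoted in the question, and gives the number of free real parameters:

$$ (k+1+2pk) - 2k - 2\cdot\tfrac{1}{2}k(k-1) = 2pk - k^2 + 1 = k(2p-k)+1. $$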