How could I rewrite a formula that looks too condensed to me?


A book I am reading presents a transformation required during a Principal Component Analysis (PCA), with:

$\mathbf{I}$ : the set of individuals
$\mathbf{i}$ : an index ranging over the individuals
$\mathbf{l}$ : a second index ranging over the individuals
$\mathbf{k}$ : an index ranging over the variables [taken from a set $K$ not shown in the formula below]

While explaining how values are standardized into centered/reduced variables, the book states that $d^2(i, l)$, the squared Euclidean distance between the individuals $i$ and $l$, shows that the variance of a variable $k$ is related to the formula:

$$ s^{2}_{k} = \frac{1}{2I^2}\sum\limits_{i,l}\big(x_{ik} - x_{lk}\big)^2$$

but I do not find this formula pleasant to read.

  • $2I^2$ uses $I$, a set, when what is actually meant is its cardinality.
    So I would have expected $\mathbf{n_{I}}$ (if acceptable?) or Card(I) here instead, even if it is longer, I guess.

  • $\sum\limits_{i,l}$ abbreviates a double summation.

How could I write this formula more correctly?

Best answer

The double summation is fine.
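
If it helps, the compact index can be spelled out. This is only a sketch, assuming that both indices range over all of $I$ (ordered pairs, including $i = l$), that $n$ denotes the number of individuals, and that $s^2_k$ is the variance with a $\frac{1}{n}$ factor; none of this is stated explicitly in the question, but it is what the $\frac{1}{2I^2}$ factor suggests:

$$ \sum_{i,l}\big(x_{ik} - x_{lk}\big)^2 = \sum_{i \in I}\sum_{l \in I}\big(x_{ik} - x_{lk}\big)^2 = 2n\sum_{i \in I} x_{ik}^2 - 2\Big(\sum_{i \in I} x_{ik}\Big)^2 = 2n^2 s^2_k $$

Under those assumptions, dividing by $2n^2$ gives back $s^2_k$, so the two ways of writing the sum say the same thing.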

When $I$ is finite, it is common practice to write $|I|$ for the number of elements of $I$.

Second point: I would prefer to separate the indices by a comma, writing $x_{i,k}$ rather than $x_{ik}$.

Third (minor) point: the letter $l$ is easily confused with $1$, or even with $i$ in a small font. To avoid this, use $\ell$ instead (\ell in $\LaTeX$). Altogether, you could write your formula as

$$ s^{2}_{k} = \frac{1}{2|I|^2}\sum_{i,\ell}\big(x_{i,k} - x_{\ell,k}\big)^2 $$
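
To convince yourself of how the sum is meant to be read, here is a small numerical check in Python. It is only a sketch under assumptions not stated in the book excerpt: the sum runs over all ordered pairs $(i, \ell)$ of individuals, and $s^2_k$ is the population variance (divided by $|I|$, not $|I| - 1$). The function name `pairwise_variance` and the sample values are made up for illustration.

```python
from statistics import pvariance


def pairwise_variance(values):
    """Return 1/(2|I|^2) times the sum over all ordered pairs (i, l)
    of the squared differences between the values."""
    n = len(values)  # plays the role of |I|
    total = sum((x_i - x_l) ** 2 for x_i in values for x_l in values)
    return total / (2 * n ** 2)


# Hypothetical values of a single variable k for four individuals.
x_k = [2.0, 4.0, 4.0, 6.0]

print(pairwise_variance(x_k))  # pairwise formula
print(pvariance(x_k))          # population variance from the standard library
```

With these assumptions the two printed values coincide, which supports reading $\sum_{i,\ell}$ as a full double sum over $I \times I$.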