I am starting with a 4×3 data matrix (4 observations in rows, 3 variables in columns):
5 4 -1
2 3 -3
3 4 -4
1 3 -2
I have to perform four manipulations on it, which I did by hand. I would like to know whether my reasoning and calculations look correct, and whether there is a less laborious way to double-check my final answers (easy-to-use software, etc.), because I think there is an error at the end, which I show below:
1) Calculate the sample variance-covariance matrix:
Here, for the off-diagonal entries, I calculate the covariances using the formula:
cov(x,y) = (sum over i=1..n of (xi - xmean)(yi - ymean)) / (n-1)
and for the diagonal entries, I calculate the variances using the formula:
var(x) = (sum over i=1..n of (xi - xmean)^2) / (n-1)
which led me to an "answer" of:
2.916 4.042 -1.458
4.042 0.333 -2.917
-1.458 -2.917 1.666
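One way this step could be double-checked in software is with NumPy (the choice of tool is an assumption on my part; any statistics package should do). A minimal sketch, where the names X and S are purely illustrative:

```python
import numpy as np

# Rows are observations, columns are variables (the 4x3 matrix above).
X = np.array([[5, 4, -1],
              [2, 3, -3],
              [3, 4, -4],
              [1, 3, -2]], dtype=float)

# rowvar=False tells np.cov that each COLUMN is a variable.
# By default np.cov divides by n-1, i.e. it returns the sample covariance.
S = np.cov(X, rowvar=False)

# Variances sit on the diagonal, covariances off the diagonal.
print(np.round(S, 3))
```

The printed 3×3 matrix can then be compared entry by entry with the hand calculation.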
2) Calculate the sample correlation matrix:
I used the formula:
r(x,y) = 1/(n-1) * sum over i=1..n of ((xi - xmean)/Sx)((yi - ymean)/Sy), where Sx is the sample standard deviation of x (the square root of its variance)
which led me to an "answer" of:
1.000 0.845 0.378
0.845 1.000 0.000
0.378 0.000 1.000
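The correlation matrix could be checked in the same way. This is a sketch assuming NumPy is available; X and R are illustrative names:

```python
import numpy as np

# Same data: rows are observations, columns are variables.
X = np.array([[5, 4, -1],
              [2, 3, -3],
              [3, 4, -4],
              [1, 3, -2]], dtype=float)

# Pearson correlations between the columns of X.
R = np.corrcoef(X, rowvar=False)

# The diagonal should be exactly 1 and the matrix should be symmetric.
print(np.round(R, 3))
```

Since r(x,y) = cov(x,y) / (Sx * Sy), each entry of this matrix should also be reproducible from the covariance matrix in step 1, which gives a cross-check between the two steps.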
3) Calculate the standardized scores for these data:
I used the formula:
Zx = (x-xmean)/Sx
which led me to an "answer" of:
1.318 0.866 1.162
-0.439 -0.866 -0.387
0.146 0.866 -1.162
-1.025 -0.866 -0.387
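A possible software check for the standardized scores, again assuming NumPy (the names X and Z are illustrative). It also prints a quick sanity check: each standardized column should have mean 0 and sample standard deviation 1.

```python
import numpy as np

# Same data: rows are observations, columns are variables.
X = np.array([[5, 4, -1],
              [2, 3, -3],
              [3, 4, -4],
              [1, 3, -2]], dtype=float)

# Column means and sample standard deviations (ddof=1 divides by n-1).
col_mean = X.mean(axis=0)
col_sd = X.std(axis=0, ddof=1)

# Z = (x - xmean) / Sx, applied column by column via broadcasting.
Z = (X - col_mean) / col_sd
print(np.round(Z, 3))

# Sanity check: every column of Z should have mean 0 and sample sd 1.
print(np.round(Z.mean(axis=0), 6))
print(np.round(Z.std(axis=0, ddof=1), 6))
```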
4) Calculate the Euclidean distance between all observations, and write the output as a matrix, for both the raw data and standardized data.
I used the formula:
d(x,y) = sqrt(sum over i of (xi - yi)^2)
and got the following Euclidean distance matrix for the raw data:
0.000 3.742 3.606 4.243
3.742 0.000 1.732 1.414
3.606 1.732 0.000 3.000
4.243 1.414 3.000 0.000
and the following Euclidean distance matrix for the standardized data:
0.000 2.913 2.603 3.299
2.913 0.000 1.986 0.586
2.603 1.986 0.000 2.229
3.299 0.586 2.229 0.000
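Both distance matrices could be checked with a short script. This is a sketch assuming NumPy; pairwise_euclidean is a hypothetical helper name I made up for illustration, not a library function:

```python
import numpy as np

# Same data: rows are observations, columns are variables.
X = np.array([[5, 4, -1],
              [2, 3, -3],
              [3, 4, -4],
              [1, 3, -2]], dtype=float)

def pairwise_euclidean(A):
    """Return the matrix of Euclidean distances between the rows of A."""
    # diff[i, j, :] holds row i minus row j, for every pair of rows.
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Distances between observations in the raw data.
D_raw = pairwise_euclidean(X)

# Standardize each column (sample sd, n-1), then repeat.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
D_std = pairwise_euclidean(Z)

print(np.round(D_raw, 3))
print(np.round(D_std, 3))
```

Both outputs should be symmetric 4×4 matrices with zeros on the diagonal, so they can be compared directly with the two matrices above.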
The reason I suspect a problem in my reasoning or calculations is that the distance matrix for the standardized data does not seem consistent with the one for the raw data: the value in row 2, column 3 is the only one that is smaller in the raw-data matrix than in the standardized matrix. Is this a problem? On the other hand, the rank order of the distances is the same in both matrices (if the cells are sorted from smallest to largest, they come out in the same order).
Thanks!