I know that a set of orthogonal vectors is a set in which the dot product of any two distinct vectors is zero, regardless of their norms, and that a set of orthonormal vectors is a set of orthogonal vectors in which every vector has norm 1.
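To make sure my understanding is right, here is a quick numerical sanity check (the specific vectors are just an illustration I picked, not from any source):

```python
import numpy as np

# v1 and v2 are orthogonal (dot product 0) but not orthonormal,
# since v2 has norm 2.
v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 2.0])

print(np.dot(v1, v2))      # 0.0 -> the set {v1, v2} is orthogonal
print(np.linalg.norm(v2))  # 2.0 -> not orthonormal

# Normalizing each vector turns the orthogonal set into an orthonormal one.
u1 = v1 / np.linalg.norm(v1)
u2 = v2 / np.linalg.norm(v2)
print(np.dot(u1, u2), np.linalg.norm(u1), np.linalg.norm(u2))  # 0.0 1.0 1.0
```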
Now I'm trying to dig into the SVD $\small X=U \Sigma V^T$ as introduced by San José University, and I read about $U$ and $V$:
> there exist two orthogonal matrices
So I'm trying to understand what orthogonal matrices are.
Wikipedia says:
> In linear algebra, an orthogonal matrix, or orthonormal matrix, is a real square matrix whose columns and rows are orthonormal vectors
So Wikipedia implies that "orthogonal matrix" is synonymous with "orthonormal matrix", and that a matrix cannot be orthogonal if one of its rows or columns has a norm different from 1.
This site says:
> For matrices, an orthogonal matrix has orthogonal rows and columns. This means that the dot product of any two rows or columns is zero
>
> An orthonormal matrix, on the other hand, not only has orthogonal rows and columns but also has orthonormal rows and columns. This means that each row and column is a unit vector
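The two definitions really do disagree on concrete matrices. Here is an example I made up to see the disagreement, using $\operatorname{diag}(1,2)$:

```python
import numpy as np

# diag(1, 2): the columns are pairwise orthogonal, but the second has norm 2.
A = np.diag([1.0, 2.0])

# "Orthogonal" in the second site's sense: every pair of columns has dot product 0.
print(A[:, 0] @ A[:, 1])  # 0.0

# Orthogonal in the Wikipedia sense requires A^T A = I, which fails here.
print(np.allclose(A.T @ A, np.eye(2)))  # False
```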
On the other hand, this answer here states:
> There is no thing as an "orthonormal" matrix
I'm confused about the difference between orthogonal and orthonormal matrices. Can this be clarified:
- Is there a definition for orthogonal matrix and/or orthonormal matrix?
- Can a matrix be orthogonal without being orthonormal?
- Does that depend on whether the matrix is square or not?
Please focus on these three questions, without adding further confusion.
Related, but not helpful: Difference between orthogonal and orthonormal matrices
Edit: in the definition of the SVD, "orthogonal matrices" is meant in the usual sense, i.e., as on the Wikipedia page.
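For concreteness, the factors returned by a standard SVD routine are orthogonal in that usual (Wikipedia) sense, i.e., $Q^T Q = I$; a quick check with numpy on an arbitrary matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))  # arbitrary example matrix

U, s, Vt = np.linalg.svd(X)      # X = U @ Sigma @ Vt

# U and V are orthogonal in the Wikipedia sense: Q^T Q = I.
print(np.allclose(U.T @ U, np.eye(4)))    # True
print(np.allclose(Vt @ Vt.T, np.eye(3)))  # True
```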
I think that what the Wikipedia page writes is the terminology most people use, and it is somewhat standard. Of course, the definition of "orthonormal matrix" given by the website collimator.ai coincides with the standard definition of orthogonal matrix, but I have never encountered the term "orthonormal matrix" in a mathematical context before, and it is definitely not universally accepted terminology. Nevertheless, people in different areas (geometry, computer science, algebra, …) may use different terminology, so in general it is best to be careful and refer to the definitions stated in the book/article you are reading. It is unavoidable that people with different backgrounds use different languages.
I wanted to add that the notion of orthogonal matrix given on the website collimator.ai is not invariant under changes of orthonormal basis. For instance, the matrix $\operatorname{diag}(1,2)$ is equivalent to $$ \left[\begin{array}{rr} 3/2&-1/2\\-1/2&3/2\end{array}\right] $$ via a 45 degree rotation of the basis, if I am not wrong. In the first matrix the columns are orthogonal, while in the second matrix the scalar product of the columns is $-3/2$. This of course does not mean that the definition is "wrong", but it suggests that the definition of orthogonal matrix given by the website collimator.ai is not as natural as a mathematical definition, and is perhaps useful only in some restricted or more applied settings (e.g., I can imagine it being a reasonable definition in computer science).
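You can verify this change-of-basis computation numerically (a rotation by $-45°$ reproduces the signs written above; rotating by $+45°$ flips the sign of the off-diagonal entries, but the columns fail to be orthogonal either way):

```python
import numpy as np

D = np.diag([1.0, 2.0])

theta = -np.pi / 4               # 45-degree change of orthonormal basis
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s], [s, c]])  # rotation matrix (orthogonal)

B = R.T @ D @ R                  # D expressed in the rotated basis
print(B)                         # [[ 1.5 -0.5] [-0.5  1.5]]
print(B[:, 0] @ B[:, 1])         # -1.5: the columns are no longer orthogonal
```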
On the other hand, the notion of orthogonal matrix as on the Wikipedia page is invariant under an orthonormal change of basis (a consequence of the fact that the set of orthogonal matrices, equipped with the matrix product, is a group). That definition is more natural and universal in a sense; in fact, you can generalize it to define orthogonal linear transformations on general vector spaces equipped with a scalar product. So it is not surprising that this definition is important in mathematics and that there is general agreement on the terminology.
Final note: the word "orthogonal" comes from "right angle". Matrices that merely have pairwise orthogonal (but not orthonormal) columns do not leave all right angles invariant: take again $A:=\operatorname{diag}(1,2)$ and consider the vectors $(1,1)$ and $(1,-1)$. They are orthogonal, but their images under $A$ are not. In fact, any linear transformation that leaves all right angles invariant is an orthogonal transformation (in the Wikipedia sense) multiplied by a non-zero scalar.

So perhaps a more natural term for orthogonal matrices (as on the Wikipedia page) would be "unitary orthogonal" matrices/transformations, that is, transformations that leave right angles invariant and also map unit vectors to unit vectors (so there is no longer a free choice of multiplication by a non-zero scalar). The latter condition actually implies the first one for linear transformations, making the "orthogonal" part redundant (indeed, in the complex case such matrices/transformations are simply called "unitary"). After all, maybe "orthonormal" would be a more accurate word for orthogonal matrices, but the terms "orthogonal matrix" and "orthogonal group" have become standard and widely accepted by now (or at least, in my experience, this is the case in pure mathematics).
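The right-angle counterexample is easy to check numerically:

```python
import numpy as np

A = np.diag([1.0, 2.0])  # orthogonal columns, but not an orthogonal matrix

v, w = np.array([1.0, 1.0]), np.array([1.0, -1.0])
print(v @ w)             # 0.0: v and w are orthogonal

# Their images under A are no longer orthogonal: the right angle is not preserved.
print((A @ v) @ (A @ w))  # -3.0
```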