Understanding the solution to the varimax rotation problem

102 Views Asked by Bumbble Comm At 26 Mar 2026 - 9:37

I'd like to preface this post by saying that this is my first post on Stack Exchange, so if there is anything to improve, be it the wording or the structure of the post, I'm more than willing to learn.

I've been playing around with implementations of varimax rotation (as described in [1], [2]) in R and Python ([3]), and trying to understand the mathematics behind it. In particular, it seems the implementations in R and scikit-learn ([3]) use the algorithm described in [4]. The optimization problem to be solved is $$\arg\max_{R} \operatorname{tr}\left(R^{T}Q(R)\right), \qquad Q(R) = \Phi^T\left[(\Phi R)\circ(\Phi R)\circ(\Phi R)-\frac1p (\Phi R)\operatorname{diag}\left((\Phi R)^T(\Phi R)\right)\right],$$ where $\Phi$ is a constant matrix of dimensions $p \times k$, $\circ$ denotes the elementwise (Hadamard) product, and $R$ is an orthogonal $k \times k$ rotation matrix (so this is a constrained optimization problem). The (iterative) solution proposed is:

 1. Start with R = I, where I is the k×k identity matrix.
 2. Evaluate Q(R) at the current R (initially R = I) and, holding it fixed, solve argmax tr(R^T Q(R)) over orthogonal R.
 3. Calculate the SVD of Q(R): [U,S,V] = svd(Q(R)).
 4. Update R as R = UV^T; the optimum of the subproblem is tr(S).
 5. Repeat steps 2-4 until the change in tr(S) falls below the specified tolerance.
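To make the steps above concrete, here is a minimal NumPy sketch of the iteration as I understand it. It is not the scikit-learn or R code: the function name `varimax_sketch`, the relative-change stopping rule, and the default values of `tol` and `max_iter` are my own assumptions for illustration.

```python
import numpy as np

def varimax_sketch(Phi, tol=1e-6, max_iter=100):
    """Iterate R <- U V^T from the SVD of Q(R).

    Hypothetical helper for illustration only; returns the final R and
    the list of tr(S) values, one per iteration.
    """
    p, k = Phi.shape
    R = np.eye(k)                          # step 1: start from the identity
    history = []
    for _ in range(max_iter):
        L = Phi @ R                        # rotated loadings, p x k
        # L @ diag(L^T L) scales column j of L by its sum of squares
        col_ss = (L ** 2).sum(axis=0)
        Q = Phi.T @ (L ** 3 - L * col_ss / p)
        U, s, Vt = np.linalg.svd(Q)        # NumPy returns V^T, not V
        R = U @ Vt                         # maximizes tr(R^T Q) for this fixed Q
        history.append(s.sum())            # tr(S)
        # assumed stopping rule: relative change in tr(S) below tol
        if len(history) > 1 and abs(history[-1] - history[-2]) <= tol * history[-2]:
            break
    return R, history

rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 3))         # made-up loadings, p = 20, k = 3
R, history = varimax_sketch(Phi)
```

Inspecting `history` for a few random `Phi` is how I have been checking the behaviour of tr(S) empirically; it is of course not a proof.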

I understand how to optimize each individual trace (the optimum is just tr(S)), but I don't understand why updating R like this guarantees an increasing sequence of traces until the maximum of the problem is reached. In other words, writing $R_{i}$ for each iterate and $S_{i}$ for its corresponding matrix of singular values, why can it be said that $\operatorname{tr}(S_{i + 1}) \geq \operatorname{tr}(S_{i})$?
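
For reference, the single-step fact I am relying on is the standard orthogonal Procrustes argument for a fixed matrix $Q = USV^T$ (that is, ignoring the dependence of $Q$ on $R$, which is exactly the part I am unsure about). The symbol $Z$ is introduced only for this sketch:

$$\operatorname{tr}(R^{T}Q) = \operatorname{tr}(R^{T}USV^{T}) = \operatorname{tr}(V^{T}R^{T}US) = \operatorname{tr}(ZS) = \sum_{j=1}^{k} z_{jj}s_{j} \leq \sum_{j=1}^{k} s_{j} = \operatorname{tr}(S), \qquad Z = V^{T}R^{T}U.$$

Since $Z$ is orthogonal, $|z_{jj}| \leq 1$, and equality holds when $Z = I$, i.e. when $R = UV^{T}$.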

Also, I am not sure whether I can ask more than one question per post or whether I should split them into separate threads, but if it's OK to do so here: I am also unsure whether this procedure gives a global maximum of the problem. My intuition tells me that it doesn't, but I can't be sure. If it really doesn't, finding one sounds like an interesting problem (that is, adding some algorithm to try to solve it globally).

Thanks in advance!

Citations:

[1] Kaiser, Henry F., The varimax criterion for analytic rotation in factor analysis, Psychometrika 23, 187-200 (1958). ZBL0095.33603.

[2] Sherin, R. J., A matrix formulation of Kaiser’s varimax criterion, Psychometrika 31, 535-538 (1966). ZBL0152.18705.

[3] https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/decomposition/_factor_analysis.py

[4] https://www.researchgate.net/publication/250026428_Sparse_Modeling_of_landmark_and_texture_variability_using_the_orthomax_criterion_-_art_no_61441G

P.S.: English is not my first language, so if anything is unclear, I can provide clarification wherever needed.
