Projection onto the set of orthogonal matrices

519 Views Asked by At

Let $\mathcal{O}^{n}$ be the set of $n \times n$ orthogonal matrices

$$ \mathcal{O}^{n} := \left\{ X \in \mathbb{R}^{n \times n} \mid {X}^{T} X = I_n \right\} $$

I would like to solve the following optimization problem. Given the matrix $Y \in \mathbb{R}^{3 \times 3}$,

$$\begin{align*} \arg\min_{X} \quad & d \left( X, Y \right) \\ \text{subject to} \quad & X \in \mathcal{O}^{3} \end{align*}$$

For my purposes, the type of distance $d$ in the objective does not matter, as long as it indeed a distance-like function (e.g., Bregman distance, norm-squared, etc). Is there a known algorithm for solving such problem for some function $d$?

1

There are 1 best solutions below

0
On BEST ANSWER

I will solve the problem for the Frobenius Norm:

$$ \begin{align*} \arg \min_{X} \quad & {\left\| X - Y \right\|}_{F}^{2} \\ \text{subject to} \quad & X \in \mathcal{O}^{n} \end{align*} $$

By definition of the Frobenius Norm:

$$\begin{align*} {\left\| X - Y \right\|}_{F}^{2} & = \operatorname{tr} \left( {\left( X - Y \right)}^{T} \left( X - Y \right) \right) & \text{By definition of $ {\left\| \cdot \right\|}_{F}^{2} $} \\ & = \operatorname{tr} \left( {X}^{T} X \right) - 2 \operatorname{tr} \left( {X}^{T} Y\right) + \operatorname{tr} \left( {Y}^{T} Y \right) && \text{} \\ & \Rightarrow \arg \min_{X \in \mathcal{O}^{n}} {\left\| X - Y \right\|}_{F}^{2} = \arg \max_{X \in \mathcal{O}^{n}} \operatorname{tr} \left( {X}^{T} Y \right) & \text{} \\ & = \arg \max_{X \in \mathcal{O}^{n}} \operatorname{tr} \left( {X}^{T} U S {V}^{T} \right) & \text{The SVD of $ Y = U S {V}^{T} $} \\ & = \arg \max_{X \in \mathcal{O}^{n}} \operatorname{tr} \left( {V}^{T} {X}^{T} U S \right) & \text{Cyclic property of the Trace} \end{align*}$$

Defining $ Z = {V}^{T} {X}^{T} U $ we get (Pay attention that $ {Z}^{T} Z = I $, Namely it is an Orthogonal Matrix):

$$ \operatorname{tr} \left( Z S \right) = \sum_{i = 1}^{n} {z}_{ii} {\sigma}_{ii} \leq \sum_{i = 1}^{n} {\sigma}_{i} $$

Where $ {\sigma}_{i} = {s}_{ii} $ the Singular Values of $ Y $ (Diagonal Elements of the Diagonal Matrix $ S $).
The inequality is a result of $ Z $ being an Orthogonal Matrix hence its elements obey $ \left| {z}_{ij} \right| \leq 1 $ and equality is achieved when $ Z = I $.
If $ Z = I \Rightarrow {X}^{T} = V {U}^{T} \Rightarrow X = U {V}^{T} $.

Hence the answer is given by $ X = U {V}^{T} $.