Matrix derivative in images matching problem

94 Views Asked by At

Problem

Suppose zero-centered matrices $\mathbf{X}$ and $\mathbf{Y}$ of shape $\mathbb{R}^{n\times 2}$. Each row of $\mathbf{X}$ and $\mathbf{Y}$ represents a point on 2-D plane. Therefore, they each represent a shape on the 2-D plane.

If we want to match $\mathbf{X}$ by transforming $\mathbf{Y}$ with scaling factor $\beta$ and rotation $\mathbf{O}\in \mathbb{R}^{2\times 2}$, i.e. $$ \min_{\beta, \mathbf{O}} \Vert \mathbf{X}-\beta \mathbf{YO}\Vert_F^2 $$ Derive optimal $\beta$ and $O$.

What I Have Done

This seems to be a pretty easy question, since $$ \begin{aligned} f(\beta,\mathbf{O}) &:= \Vert \mathbf{X}-\beta \mathbf{YO}\Vert_F^2\\ &=tr(\mathbf{XX}^T)-2\beta tr(\mathbf{Y}^T\mathbf{XO}^T)+\beta^2 tr(\mathbf{Y}^T\mathbf{Y}\mathbf{OO}^T) \end{aligned} $$ Then by following formula $$ \nabla_\mathbf{X}tr(\mathbf{AX}^T)=\mathbf{A}\\ \nabla_\mathbf{X}tr(\mathbf{BXX}^T)=\mathbf{BX}+\mathbf{B}^T\mathbf{X} $$ I have $$ \begin{aligned} \nabla_\mathbf{O}f(\beta,\mathbf{O}) &=-2\beta\mathbf{Y}^T\mathbf{X}+2\beta^2\mathbf{Y}^T\mathbf{YO}\\ \nabla_\beta f(\beta,\mathbf{O}) &=-2tr(\mathbf{Y}^T\mathbf{XO}^T)+2\beta tr(\mathbf{Y}^T\mathbf{Y}\mathbf{OO}^T) \end{aligned} $$ However, setting two equations to 0 does not seem to result in something solvable.

Could someone help me, thank you in advance!

2

There are 2 best solutions below

0
On BEST ANSWER

Optimal $\beta$ and $O$ are $$\beta_* = \pm\frac{\operatorname{tr}(\Sigma)}{\|Y\|_F^2}\,,\quad O_*=\operatorname{sign}(\beta)UV^T\,,$$ where $Y^TX=U\Sigma V^T$ is the SVD of the matrix $Y^TX$ and $$\operatorname{sign}(x)=\begin{cases}\hphantom{-}1\,,\quad x\geq 0\\-1\,,\quad x<0\end{cases}\,.$$

To show that, let $\beta$ and $O$ be arbitrary. We have $$f(\beta,O) \geq \min_\hat{O} \|X-(\beta Y)\hat{O}\|_F^2\,.$$ The right hand side is Orthogonal Procrustes problem with solution $$\hat{O}=\operatorname{sign}(\beta)UV^T = \operatorname{sign}(\beta)Q_*\,,$$ because $(\beta Y)^TX = (\operatorname{sign}(\beta)U)(|\beta|\Sigma)V^T$ is the SVD of matrix $(\beta Y)^TX$. Therefore, \begin{align} f(\beta,O)&\geq f(\beta, \operatorname{sign}(\beta)O_*)\\ &= \|X\|_F^2-2\beta\operatorname{tr}(Y^TX\operatorname{sign}(\beta)O_*^T)+\beta^2\|Y\|_F^2\\ &= \|X\|_F^2-2|\beta|\operatorname{tr}(\Sigma) + \beta^2\|Y\|_F^2\\ &\geq f(\beta_*, O_*) \,. \end{align} The last inequality holds because quadratic polynomial in $|\beta|$ attains its minimum for $|\beta|=|\beta_*|$ with value of minimum being $f(\beta_*, O_*) = f(-\beta_*,-O_*)$.

1
On

Let's assume that we know the optimal $\beta$ and use it to define the matrix $C=\beta Y.\,\,$ Let's also assume that we have calculated the SVD of $Y^TX$ $$\eqalign{ &Y^TX &= US_yV^T \,\,\,\implies C^TX &= U(\beta S_y)V^T = US_cV^T \\ }$$ The problem can now be re-cast in the form of the standard Orthogonal Procrustes problem, whose solution can be expressed in terms of this singular value decompostion. $$\eqalign{ &\min_O \|X-&CO\|_F \,\,\,\implies O = UV^T \\ }$$ NB:   The optimal $O$ is independent of $\beta$.
Given this optimal $O$, the optimal $\beta$ is obtained by solving a much easier problem. $$ \min_\beta\|X-\beta\,YO\|_F^2 \,\,\,\implies \beta=\frac{{\rm tr}\big(X^TYO\big)}{{\rm tr}\big((YO)^TYO\big)} =\frac{{\rm tr}(S_y)}{{\rm tr}(Y^TY)} $$