I want to find the optimal solution of the following optimization problem
$$\mathop{\rm{min}}_ {\| \boldsymbol{v}_i \| = 1 \atop i= 1,2,3} \| \boldsymbol{w} - \boldsymbol{v}_3 \otimes \boldsymbol{v}_2 \otimes \boldsymbol{v}_1 \|^2$$
where $\boldsymbol{v}_i \in \mathbb{R}^{ p_i\times 1}$, $\boldsymbol{w} \in \mathbb{R}^{ p_1 p_2 p_3\times 1}$ is given and $\| \boldsymbol{w}\| = 1$.
I want to get the best estimate of $\boldsymbol{v}_i.$
From the entries of $\boldsymbol{w}$ I can form $p_{i+1}p_{i+2}$ estimates of each ratio $v_1^i/v_j^i$, $j = 2,\ldots,p_i$, where $v_j^i$ denotes the $j$-th entry of $\boldsymbol{v}_i$ and the indices $i+1$, $i+2$ are taken modulo 3. Averaging them, let $$\widehat{\left(\dfrac{v_1^i}{v_j^i}\right)} = \dfrac{1}{p_{i+1}p_{i+2}} \sum \dfrac{v_1^i}{v_j^i}, \qquad j = 2,\ldots,p_i,$$ from which we can obtain an estimator $\hat{\boldsymbol{v}}_i$ under the constraint $\|\boldsymbol{v}_i\|=1.$
But I don't know whether this estimator is optimal or even appropriate.
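A minimal NumPy sketch of this ratio-averaging estimator, shown for $i=1$ (the innermost Kronecker factor); the function name is mine, and it assumes $\boldsymbol{w}$ is exactly rank-1 with no zero entries, so every ratio is well defined:

```python
import numpy as np

def ratio_estimate_v1(w, p1, p2, p3):
    """Ratio-averaging estimate of the innermost factor v1 of w = v3 (x) v2 (x) v1.

    Reshaping w to shape (p3, p2, p1) gives W[a, b, c] = v3[a] * v2[b] * v1[c],
    so W[a, b, 0] / W[a, b, j] = v1[0] / v1[j] for every (a, b) pair.
    """
    W = w.reshape(p3, p2, p1)
    v = np.empty(p1)
    v[0] = 1.0
    for j in range(1, p1):
        # average the p2 * p3 available estimates of the ratio v1[0] / v1[j]
        r_hat = np.mean(W[:, :, 0] / W[:, :, j])
        v[j] = 1.0 / r_hat           # v[j] is proportional to v1[j] / v1[0]
    return v / np.linalg.norm(v)     # impose the unit-norm constraint
```

On exact rank-1 data this recovers $\boldsymbol{v}_1$ up to sign; with noisy $\boldsymbol{w}$ the ratios disagree and the average can be dominated by entries near zero, which is part of why the optimality question is not obvious.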
$ \def\bbR#1{{\mathbb R}^{#1}} \def\BR#1{\left[#1\right]} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\div#1{\op{div}\LR{#1}} \def\mod#1{\op{mod}\LR{#1}} \def\vecc#1{\op{vec}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\t{\otimes} \def\l{\lambda} \def\x{\hat x} \def\y{\hat y} \def\z{\hat z} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\fracBR#1#2{\BR{\frac{#1}{#2}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} \def\Sijk{\sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K}\:} \def\w{w_{ijk}} $Define the standard basis vectors using an index which acts as a mnemonic for their dimensionality, i.e. $$ e_j\in\bbR{J},\quad e_k\in\bbR{K},\quad etc $$ A basis vector from a higher dimensional space can be written as a triple Kronecker product $(\t)$ of lower dimensional basis vectors, i.e. $$\eqalign{ &e_m = e_i\t e_j\t e_k \qiq M = IJK \\ &m = k + (j-1)K + (i-1)JK \\ &\qquad\; i = 1 \:+ \;\:\div{m-1,\,JK} \\ &\qquad\; j = 1 \:+ \;\:\div{\mod{m-1,\,JK},\,K} \\ &\qquad\; k = 1 \:+ \mod{\mod{m-1,\,JK},\,K} \\ }$$ Thus an arbitrary vector $a\in\bbR{M}$ can be expanded in two different ways $$\eqalign{ a &= \sum_{m=1}^{M}\: a_m\,e_m &= \Sijk a_{ijk}\;\:{e_i\t e_j\t e_k}\\ }$$ using the index mappings shown above.
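The index mapping above can be verified numerically; here is a short check (0-based, so the formula reads $m = k + jK + iJK$), with the dimensions chosen arbitrarily:

```python
import numpy as np

I, J, K = 2, 3, 4
for i in range(I):
    for j in range(J):
        for k in range(K):
            # e_m = e_i (x) e_j (x) e_k as a triple Kronecker product
            em = np.kron(np.kron(np.eye(I)[i], np.eye(J)[j]), np.eye(K)[k])
            m = k + j * K + i * J * K        # forward mapping (0-based)
            assert em[m] == 1 and em.sum() == 1
            # inverse mapping via integer division and modulo
            assert i == m // (J * K)
            assert j == (m % (J * K)) // K
            assert k == (m % (J * K)) % K
```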
The scalar coefficients are numerically identical $\:\LR{a_m = a_{ijk}}$.
The Frobenius product $(:)$ is extremely useful in Matrix Calculus $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \frob{A}^2 \qquad \{ {\rm Frobenius\;norm} \}\\ A:B &= B:A \;=\; B^T:A^T \\ C:\LR{AB} &= \LR{CB^T}:A \;=\; \LR{A^TC}:B \\ }$$ and it interacts nicely with the Kronecker product $$\eqalign{ \l &= \LR{A\otimes B\otimes C}:\LR{X\otimes Y\otimes Z} \\ &= \LR{A:X}\otimes\LR{B:Y}\otimes\LR{C:Z} \\ &= \LR{A:X}\,\LR{B:Y}\,\LR{C:Z} \\ }$$ The derivative of a unit vector is well known $$\eqalign{ \x &= \frac{x}{\|x\|} \qiq d\x = \fracLR{I-\x\x^T}{\|x\|}dx \\ }$$
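The Kronecker/Frobenius interaction can be sanity-checked numerically; a quick sketch with arbitrary random matrices (the identity requires the two Kronecker products to have matching factor shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
A, X = rng.standard_normal((2, 3)), rng.standard_normal((2, 3))
B, Y = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
C, Z = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))

frob = lambda P, Q: np.sum(P * Q)    # A:B = Tr(A^T B), the Frobenius product

lhs = frob(np.kron(np.kron(A, B), C), np.kron(np.kron(X, Y), Z))
rhs = frob(A, X) * frob(B, Y) * frob(C, Z)
assert np.isclose(lhs, rhs)          # (A(x)B(x)C):(X(x)Y(x)Z) = (A:X)(B:Y)(C:Z)
```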
Write the objective function using the above notation $$\eqalign{ \l &= \tfrac12\LR{\x\t\y\t\z-w} : \LR{\x\t\y\t\z-w} \\ }$$ Assume that $(\y,\z)$ are fixed and try to find the optimal $\x$.
Calculate the gradient wrt the unconstrained $x$ vector $$\eqalign{ d\l &= \LR{\x\t\y\t\z-w} : \LR{d\x\t\y\t\z} \\ &= \LR{\y:\y}\LR{\z:\z}\,\LR{\x:d\x} \;-\; \Sijk\w\:\LR{e_j:\y}\LR{e_k:\z}\,\LR{e_i:d\x} \\ &= \LR{\x\;-\;\Sijk\w\:\y_j\z_k\,e_i}:d\x \\ &= \fracLR{I-\x\x^T}{\|x\|}\LR{\x\;-\;\Sijk\w\:\y_j\z_k\,e_i}:dx \\ &= \fracLR{\x\x^T-I}{\|x\|}\LR{\Sijk\w\:\y_j\z_k\,e_i}:dx \\ &= {\Sijk\w\:\y_j\z_k}\fracLR{\x_i\x-e_i}{\|x\|} : dx \\ \grad{\l}{x} &= {\Sijk\w\:\y_j\z_k}\fracLR{\x_i\x-e_i}{\|x\|} \\ }$$ Set the gradient to zero and calculate the least square solution for $\x$ $$\eqalign{ &\LR{{\Sijk\w\:\x_i\y_j\z_k}}\x = {\Sijk\w\:\y_j\z_k}\:e_i \\ &\x = \frac{{\Sijk\w\:\y_j\z_k}\:e_i}{{\Sijk\w\:\y_j\z_k\:\x_i}} \\ }$$ However, since $\x$ is a unit vector, it's easier to calculate the numerator and normalize it $$\eqalign{ x = {\Sijk\w\:\y_j\z_k}\:e_i \;\;\qiq \x = \frac{x}{\|x\|} \\ }$$ The least squares solutions for $(\y,\z)$ are completely analogous $$\eqalign{ &y = {\Sijk\w\:\x_i\z_k}\:e_j \qiq \y = \frac{y}{\|y\|} \\ &z = {\Sijk\w\:\x_i\y_j}\:e_k \qiq \z = \frac{z}{\|z\|} \\ }$$ Alternating Least Squares (ALS) iterations quickly converge to the optimal triplet $\LR{\x,\y,\z}$
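The three normalized least-squares updates above can be sketched as an ALS loop in NumPy; the function name and iteration count are mine, and note that ALS is only guaranteed to reach a stationary point, though for the rank-1 problem it typically converges to the optimum from a generic start:

```python
import numpy as np

def als_rank1(w, I, J, K, iters=100, seed=0):
    """Best unit-norm rank-1 approximation w ~ x (x) y (x) z via ALS.

    w has length I*J*K, with w[m] = W[i,j,k] for m = k + j*K + i*J*K,
    matching the basis-index mapping used in the derivation above.
    """
    W = w.reshape(I, J, K)
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(J); y /= np.linalg.norm(y)
    z = rng.standard_normal(K); z /= np.linalg.norm(z)
    for _ in range(iters):
        # each update contracts W against the other two factors, then normalizes,
        # exactly as in the least-squares solutions derived above
        x = np.einsum('ijk,j,k->i', W, y, z); x /= np.linalg.norm(x)
        y = np.einsum('ijk,i,k->j', W, x, z); y /= np.linalg.norm(y)
        z = np.einsum('ijk,i,j->k', W, x, y); z /= np.linalg.norm(z)
    return x, y, z
```

On an exactly rank-1 $w$ the loop converges in a single sweep; each factor is recovered up to sign, since $\LR{\x,\y,\z}$ and, e.g., $\LR{-\x,-\y,\z}$ give the same product.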