In a problem I need to approximate a vector $W \in \Bbb{R}^n$ by a binary vector $B \in \{-1,1\}^n$ as $W \approx \alpha B + \gamma 1$, with $\alpha\ge0$ and $\gamma\in\Bbb{R}$. To find the optimal approximation, the following optimization problem is solved:
$\min_{\alpha,B,\gamma} \| W - (\alpha B + \gamma 1)\|_2^2 \quad$ s.t. $\quad B \in \{-1,1\}^n,\ \alpha\ge0,\ \gamma\in\Bbb{R}$
By expanding the objective we have:
$J(\alpha,B,\gamma) = \| W - (\alpha B + \gamma 1)\|_2^2 = W^TW - 2\alpha W^TB - 2\gamma W^T1 + \alpha^2B^TB + 2\alpha\gamma B^T1 + \gamma^21^T1$
The first term does not depend on the optimization variables, and since $B \in \{-1,1\}^n$ we have $B^TB = n$; also $1^T1 = n$. Thus the objective reduces to:
$F(\alpha,B,\gamma) = -2\alpha W^TB - 2\gamma W^T1 + \alpha^2 n + 2\alpha\gamma B^T1 + \gamma^2 n$
To minimize $F$, we first hold $\gamma$ fixed and minimize $K(\alpha,B) = \alpha^2 n - 2\alpha(W^T-\gamma1^T)B$. Since $\alpha \ge 0$, it suffices to maximize $(W^T-\gamma 1^T)B$, so the optimal $B$ is:
$B^* = \operatorname{sign}(W-\gamma 1)$.
Taking the derivative of $K$ with respect to $\alpha$, setting it to zero, and substituting $B^*$, we obtain the solution for $\alpha$:
$\alpha^* = \frac{1}{n}\|W-\gamma 1\|_1$
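As a sanity check, the closed forms $B^* = \operatorname{sign}(W-\gamma 1)$ and $\alpha^* = \frac{1}{n}\|W-\gamma 1\|_1$ for fixed $\gamma$ can be compared against brute-force enumeration over all $B\in\{-1,1\}^n$ for small $n$. A minimal Python sketch (function names are illustrative, not from the original):

```python
import itertools

def cost(W, alpha, B, gamma):
    """Objective ||W - (alpha*B + gamma*1)||_2^2."""
    return sum((w - (alpha * b + gamma)) ** 2 for w, b in zip(W, B))

def closed_form(W, gamma):
    """B* = sign(W - gamma*1), alpha* = ||W - gamma*1||_1 / n."""
    n = len(W)
    B = [1 if w >= gamma else -1 for w in W]
    alpha = sum(abs(w - gamma) for w in W) / n
    return alpha, B

def brute_force(W, gamma):
    """For each B, dK/dalpha = 0 gives alpha = (W - gamma*1)^T B / n,
    clamped at 0 to respect alpha >= 0; keep the cheapest B."""
    n = len(W)
    best = None
    for B in itertools.product([-1, 1], repeat=n):
        alpha = max(0.0, sum((w - gamma) * b for w, b in zip(W, B)) / n)
        c = cost(W, alpha, B, gamma)
        if best is None or c < best[0]:
            best = (c, alpha, list(B))
    return best

W = [0.3, -1.2, 2.5, 0.9]
gamma = 0.4
alpha_cf, B_cf = closed_form(W, gamma)
c_bf, alpha_bf, B_bf = brute_force(W, gamma)
# the closed form should match the exhaustive optimum exactly
assert abs(cost(W, alpha_cf, B_cf, gamma) - c_bf) < 1e-9
```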
Given $\alpha^*$ and $B^*$, we can then update $\gamma$: setting $\partial F/\partial\gamma = 0$ for fixed $\alpha$ and $B$ gives $\gamma^* = \frac{1}{n}(W-\alpha B)^T 1$. We iterate these updates until convergence. However, this approach is only guaranteed to converge to a local minimum of $F(\alpha,B,\gamma)$.
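The alternating scheme can be sketched as follows (an illustrative Python sketch; `gamma0`, the iteration cap, and the tolerance are arbitrary choices, not from the original):

```python
def alternating_min(W, gamma0=0.0, iters=100, tol=1e-10):
    """Alternate: (B, alpha) optimal for fixed gamma, then gamma
    optimal for fixed (alpha, B). Each step cannot increase the
    objective, so this converges, but only to a local minimum."""
    n = len(W)
    gamma, prev = gamma0, float("inf")
    alpha, B = 0.0, [1] * n
    for _ in range(iters):
        B = [1 if w >= gamma else -1 for w in W]       # B* = sign(W - gamma*1)
        alpha = sum(abs(w - gamma) for w in W) / n     # alpha* = ||W - gamma*1||_1 / n
        gamma = sum(w - alpha * b for w, b in zip(W, B)) / n  # from dF/dgamma = 0
        cur = sum((w - (alpha * b + gamma)) ** 2 for w, b in zip(W, B))
        if prev - cur < tol:
            break
        prev = cur
    return alpha, B, gamma
```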
Is there a way to calculate globally optimal solution for this optimization problem?
Any advice or guidance would be greatly appreciated.
The extra treatment of $\gamma{\bf 1}$ unduly complicates the process without providing more degrees of freedom. Note that for any $\lambda$, $\mu\in{\mathbb R}$, where $\lambda\leq \mu$, one has $$\lambda(1,0)+\mu(0,1)={\mu-\lambda\over2}(-1,1)+{\lambda+\mu\over2}(1,1)\ ,$$ hence ${\mu-\lambda\over2}\geq0$ can stand in for $\alpha$ and ${\lambda+\mu\over2}$ for $\gamma$. The same is true if $(1,0)$ and $(0,1)$ are replaced by $(1^r,0^s)$ and $(0^r,1^s)$, where $1^r=\underbrace{1,1,\ldots,1}_r$.
In order to find the optimal approximation to the given $W$ we therefore assume that $W_1\leq W_2\leq\ldots\leq W_n$ and look for an approximant vector of the form $(\lambda,\ldots,\lambda,\mu,\ldots,\mu)$ with a suitable number of $\lambda$s and $\mu$s. To this end choose an $r\in[n-1]$ and minimize $$Q_r:=\sum_{i=1}^r(\lambda-W_i)^2+\sum_{i=r+1}^n(\mu-W_i)^2\ .$$ This leads to $$\lambda={1\over r}\sum_{i=1}^r W_i,\qquad \mu={1\over n-r}\sum_{i=r+1}^nW_i\ ,$$ and then to a simple formula for ${\rm min}_{\lambda,\>\mu}\> Q_r$. Note that since $W$ is sorted we automatically get $\lambda\leq\mu$, so $\alpha={\mu-\lambda\over2}\geq0$ is satisfied. It remains to determine the optimal $r$. Since ${\rm min}_{\lambda,\>\mu}\> Q_r$ need not be unimodal in $r$ (e.g. $W=(0,1,1,2)$ gives costs ${2\over3},1,{2\over3}$ for $r=1,2,3$), it is safest to check all $n-1$ splits; with prefix sums this takes $O(n)$ time after the $O(n\log n)$ sort.
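The reduction above can be sketched in Python: sort $W$, evaluate $\min_{\lambda,\mu}Q_r$ for every split $r$ via prefix sums, and convert the best $(\lambda,\mu)$ back to $(\alpha,\gamma,B)$. Every split is scanned, so no assumption about the shape of $Q_r$ as a function of $r$ is needed (an illustrative sketch, assuming $n\geq2$):

```python
def global_min(W):
    """Globally optimal (alpha, B, gamma): sort W, and for every split r
    take lambda = mean of the lower r entries, mu = mean of the rest."""
    n = len(W)
    order = sorted(range(n), key=lambda i: W[i])
    S = [W[i] for i in order]
    prefix = [0.0]
    for w in S:
        prefix.append(prefix[-1] + w)
    total_sq = sum(w * w for w in S)
    best = None
    for r in range(1, n):
        lam = prefix[r] / r
        mu = (prefix[n] - prefix[r]) / (n - r)
        # min over lambda, mu of Q_r = sum W_i^2 - r*lam^2 - (n-r)*mu^2
        q = total_sq - r * lam * lam - (n - r) * mu * mu
        if best is None or q < best[0]:
            best = (q, lam, mu, r)
    q, lam, mu, r = best
    # alpha = (mu - lam)/2 >= 0 since the lower-block mean <= upper-block mean
    alpha, gamma = (mu - lam) / 2, (lam + mu) / 2
    B = [0] * n
    for k, i in enumerate(order):
        B[i] = -1 if k < r else 1   # lower block gets -1, upper gets +1
    return alpha, B, gamma, q
```

For example, $W=(1,1,-1,-1)$ recovers $\alpha=1$, $\gamma=0$, $B=(1,1,-1,-1)$ with zero error.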