Is the Lagrange multiplier useful in this constrained optimization problem?


I'm trying to numerically solve the following optimization problem. Let $u \in R^{n \times n_c}$ be the variable we wish to find. Let $\phi$ be the objective:

$\phi = \mathrm{Tr}(u^\top L u) + \frac{1}{2} \|u - u^{obs}\|_2^2 \quad \text{s.t.} \quad u e_{n_c} = 0_n$

where $u^{obs}$ is given, $L \in R^{n \times n}$ is a known graph Laplacian, and $e_{n_c}$ is the vector of ones, with length $n_c$.

We can rewrite this using Lagrange multipliers as:

$\phi = Tr(u^\top L u) + 1/2 ||u - u^{obs} ||_2^2 + \lambda_n^\top u e_{n_c}$

where $\lambda_n$ is the Lagrange multiplier, a vector of length $n$. The theory now says to take the derivative of the objective with respect to $u$ and $\lambda$, which gives me:

$\frac{ \partial \phi }{\partial u} = 2 L u + (u - u^{obs}) + \lambda_n e_{n_c}^\top = 0_{n \times n_c}$

$\frac{\partial \phi} {\partial \lambda} = u e_{n_c} = 0_n$

Now the idea is to combine these two equations and eliminate $\lambda$ if possible, but I don't see any way I can combine this and get something I could solve algebraically or numerically.

Does anyone have any suggestions as to how I could proceed?
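As a numerical sanity check on the two stationarity conditions above, here is a minimal finite-difference test in numpy (the sizes, the random symmetric stand-in for $L$, and all data are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n, nc = 4, 3
L = rng.standard_normal((n, n))
L = L + L.T  # symmetric, standing in for a graph Laplacian
u_obs = rng.standard_normal((n, nc))
lam = rng.standard_normal(n)

def lagrangian(u):
    # phi = Tr(u^T L u) + 1/2 ||u - u_obs||^2 + lambda^T u e
    return (np.trace(u.T @ L @ u)
            + 0.5 * np.sum((u - u_obs) ** 2)
            + lam @ (u @ np.ones(nc)))

u = rng.standard_normal((n, nc))
# Claimed gradient: 2 L u + (u - u_obs) + lambda e^T  (L symmetric)
grad = 2 * L @ u + (u - u_obs) + np.outer(lam, np.ones(nc))

# Central finite differences, entry by entry
eps = 1e-6
fd = np.zeros_like(u)
for i in range(n):
    for j in range(nc):
        up = u.copy(); up[i, j] += eps
        um = u.copy(); um[i, j] -= eps
        fd[i, j] = (lagrangian(up) - lagrangian(um)) / (2 * eps)

print(np.abs(grad - fd).max())  # small: the gradient formula checks out
```

This only verifies the gradient formula; it does not by itself solve the constrained problem.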

There are 2 answers below.

Best answer:

The centering matrix is constructed from the identity matrix and the all-ones vector
$$C = I - \frac{ee^T}{e^Te} \quad\implies\quad C^2=C=C^T,\quad Ce = 0$$

Construct the matrix $U$ from $C$ and an unconstrained matrix variable $X$, so the constraint holds automatically:
$$U = XC \quad\implies\quad Ue = 0,\quad UC = U$$

Write the objective function and calculate its gradient wrt $X$ (using $dU = dX\,C$):
$$\eqalign{ \phi &= L:UU^T + \tfrac{1}{2}(U-U_{obs}):(U-U_{obs}) \\ d\phi &= L:(U\,dU^T+dU\,U^T) + (U-U_{obs}):dU \\ &= \Big((L+L^T)U + U-U_{obs}\Big):dU \\ &= \big(LU+L^TU + U-U_{obs}\big):dX\,C \\ &= (LU + L^TU + U - U_{obs}C):dX \\ &= \big((L + L^T + I)U - U_{obs}C\big):dX \\ \frac{\partial \phi}{\partial X} &= (L + L^T + I)U - U_{obs}C }$$

Set the gradient to zero and solve for the optimal matrix:
$$\eqalign{ 0 &= (L + L^T + I)U - U_{obs}C \\ U &= (L + L^T + I)^{-1}U_{obs}C }$$

NB: A colon is being used as a convenient product notation for the trace, i.e. $$A:B = {\rm Tr}(A^TB)$$

Terms in such products can be rearranged via the cyclic property of the trace, e.g. $$\eqalign{A:BC = B^TA:C = AC^T:B}$$
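A quick numerical check of the closed-form solution in numpy (a sketch with made-up data: a small random symmetric graph Laplacian, so $L + L^T = 2L$, and a random $U_{obs}$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, nc = 6, 3

# Made-up symmetric graph Laplacian from a random adjacency matrix
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T
L = np.diag(A.sum(axis=1)) - A

U_obs = rng.standard_normal((n, nc))

# Centering matrix C = I - e e^T / (e^T e)
e = np.ones((nc, 1))
C = np.eye(nc) - (e @ e.T) / (e.T @ e)

# Closed-form minimizer: U = (L + L^T + I)^{-1} U_obs C = (2L + I)^{-1} U_obs C
U = np.linalg.solve(2 * L + np.eye(n), U_obs @ C)

print(np.abs(U @ e).max())  # constraint U e = 0 holds to machine precision
```

Since $Ce = 0$, the constraint $Ue = 0$ is satisfied automatically, whatever $L$ and $U_{obs}$ are.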

Update

There was a question about where $C$ came from.

Consider the general solution of a linear equation $$AX=B \quad\implies\quad A = BX^+ + Y(I-XX^+)$$ where $X^+$ denotes the pseudoinverse of $X$, the matrix $Y$ is arbitrary, and $(I-XX^+)$ is a projector into the nullspace, i.e. $\;(I-XX^+)X=0$.

For real vectors, the pseudoinverse can be written in terms of the transpose, i.e. $\;e^+=\frac{e^T}{e^Te}$.

Now consider the constraint as a linear equation to be solved for $U$, and note that $C$ is the nullspace projector for $e$.
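To make this concrete, one can verify numerically that $I - ee^+$ reproduces the centering matrix $C$ (a small numpy sketch; the size $n_c = 4$ is arbitrary):

```python
import numpy as np

nc = 4
e = np.ones((nc, 1))

# Pseudoinverse of the all-ones vector: e^+ = e^T / (e^T e)
e_pinv = e.T / (e.T @ e)

# Nullspace projector I - e e^+ coincides with the centering matrix C
P = np.eye(nc) - e @ e_pinv
C = np.eye(nc) - (e @ e.T) / nc

print(np.allclose(P, C), np.allclose(P @ e, 0))  # True True
```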

Second answer:

I think I found an answer to my question: keeping the Lagrange multiplier, I can solve the combined system in vectorized form, which for $n_c=2$ looks like:

$$ \begin{pmatrix} \alpha L + I & 0 & I \\ 0 & \alpha L + I & I \\ I & I & 0 \\ \end{pmatrix} \begin{pmatrix} \bar{u}_1 \\ \bar{u}_2 \\ \lambda \\ \end{pmatrix} = \begin{pmatrix} \bar{u}^{obs}_1 \\ \bar{u}^{obs}_2 \\ 0 \\ \end{pmatrix} $$

where $\alpha = 2$ (the coefficient of $Lu$ in $\partial \phi / \partial u$ above) and $ u = \begin{pmatrix} \vdots & \vdots \\ \bar{u}_1 & \bar{u}_2 \\ \vdots & \vdots \\ \end{pmatrix} $
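This block system is straightforward to assemble and solve in numpy. The sketch below uses a made-up symmetric Laplacian and takes $\alpha = 2$ (the factor from $\partial\phi/\partial u$); its solution also agrees with the closed form $(2L+I)^{-1}u^{obs}C$ from the accepted answer:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Made-up symmetric graph Laplacian and observations
A = (rng.random((n, n)) < 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T
L = np.diag(A.sum(axis=1)) - A
u_obs = rng.standard_normal((n, 2))

I, Z = np.eye(n), np.zeros((n, n))
alpha = 2.0  # from d/du Tr(u^T L u) = 2 L u for symmetric L

# Block KKT system for n_c = 2: unknowns are u_1, u_2, lambda
K = np.block([[alpha * L + I, Z, I],
              [Z, alpha * L + I, I],
              [I, I, Z]])
rhs = np.concatenate([u_obs[:, 0], u_obs[:, 1], np.zeros(n)])

sol = np.linalg.solve(K, rhs)
u = sol[:2 * n].reshape(2, n).T  # columns u_1, u_2

print(np.abs(u.sum(axis=1)).max())  # rows sum to ~0: constraint satisfied
```

The KKT matrix is nonsingular here because $\alpha L + I$ is positive definite and the constraint blocks $[I \; I]$ have full rank, so a direct solve suffices; for large sparse $L$, a sparse solver would be the natural替 choice.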