Can a (bounded) linear least-squares problem include a scale factor in its solution?


I have a system of equations.

$$ \small \begin{aligned} (1 - p_0)u_0 + (q_0 - 1)v_0 + u_1 - v_1 = r_0 - c t_0 \\\\ (1 - p_1)u_1 + (q_1 - 1)v_1 + u_0 - v_0 = r_1 - c t_1 \\\\ (1 - p_2)u_2 + (q_2 - 1)v_2 + u_3 - v_3 + u_4 - v_4 = r_2 - c t_2 \\\\ (1 - p_3)u_3 + (q_3 - 1)v_3 + u_2 - v_2 + u_4 - v_4 = r_3 - c t_3 \\\\ (1 - p_4)u_4 + (q_4 - 1)v_4 + u_2 - v_2 + u_3 - v_3 = r_4 - c t_4 \end{aligned} $$

Expressed in matrix form, the coefficient matrix always consists of two block-diagonal matrices placed side by side (one acting on the $u_i$, one on the $v_i$).

$$ \scriptsize \begin{bmatrix} 1 - p_0 & 1 & 0 & 0 & 0 & q_0 - 1 & -1 & 0 & 0 & 0 \\\\ 1 & 1 - p_1 & 0 & 0 & 0 & -1 & q_1 - 1 & 0 & 0 & 0 \\\\ 0 & 0 & 1 - p_2 & 1 & 1 & 0 & 0 & q_2 - 1 & -1 & -1 \\\\ 0 & 0 & 1 & 1 - p_3 & 1 & 0 & 0 & -1 & q_3 - 1 & -1 \\\\ 0 & 0 & 1 & 1 & 1 - p_4 & 0 & 0 & -1 & -1 & q_4 - 1 \end{bmatrix} \begin{bmatrix} u_0 \\\\ u_1 \\\\ u_2 \\\\ u_3 \\\\ u_4 \\\\ v_0 \\\\ v_1 \\\\ v_2 \\\\ v_3 \\\\ v_4 \end{bmatrix} =\begin{bmatrix} r_0 - c t_0 \\\\ r_1 - c t_1 \\\\ r_2 - c t_2 \\\\ r_3 - c t_3 \\\\ r_4 - c t_4 \end{bmatrix} $$

This can, of course, be solved to get all $u_i$ and $v_i$.

My question is about the math itself but, for completeness: I am using the Python function `scipy.optimize.lsq_linear`, which supports bounds, because each unknown has a lower bound of $0$ and a positive upper bound. I am also aware that the known block-diagonal structure allows the problem to be trivially split into smaller independent sub-problems, which would be particularly desirable since my real problem has around 80 equations rather than the 5 shown. However, I keep it in this form because I also want to calculate the scale factor $c$ as part of one big solution.
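For concreteness, here is one way the $5 \times 10$ system above might be assembled and solved with `scipy.optimize.lsq_linear`. The numeric values for $p_i$, $q_i$, $r_i$, $t_i$, $c$ and the upper bound are made up for illustration; only the coupling pattern (equations $\{0,1\}$ and $\{2,3,4\}$ form independent blocks) comes from the system above.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
n = 5
p = rng.uniform(0.1, 0.9, n)   # illustrative values for p_i
q = rng.uniform(0.1, 0.9, n)   # illustrative values for q_i
r = rng.uniform(0.0, 1.0, n)
t = rng.uniform(0.0, 1.0, n)
c = 0.5                        # known scale factor in this version

# Coupling pattern: equations {0, 1} and {2, 3, 4} are independent blocks.
blocks = [[0, 1], [2, 3, 4]]
A = np.zeros((n, 2 * n))
for block in blocks:
    for i in block:
        A[i, i] = 1.0 - p[i]          # diagonal of the u-block
        A[i, n + i] = q[i] - 1.0      # diagonal of the v-block
        for j in block:
            if j != i:
                A[i, j] = 1.0         # off-diagonal u couplings
                A[i, n + j] = -1.0    # off-diagonal v couplings

b = r - c * t
upper = 10.0                          # illustrative positive upper bound
res = lsq_linear(A, b, bounds=(0.0, upper))
print(res.x)                          # u_0..u_4, v_0..v_4
```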

To this end, the problem can be slightly rearranged.

$$ \scriptsize \begin{bmatrix} 1 - p_0 & 1 & 0 & 0 & 0 & q_0 - 1 & -1 & 0 & 0 & 0 & t_0 \\\\ 1 & 1 - p_1 & 0 & 0 & 0 & -1 & q_1 - 1 & 0 & 0 & 0 & t_1 \\\\ 0 & 0 & 1 - p_2 & 1 & 1 & 0 & 0 & q_2 - 1 & -1 & -1 & t_2 \\\\ 0 & 0 & 1 & 1 - p_3 & 1 & 0 & 0 & -1 & q_3 - 1 & -1 & t_3 \\\\ 0 & 0 & 1 & 1 & 1 - p_4 & 0 & 0 & -1 & -1 & q_4 - 1 & t_4 \end{bmatrix} \begin{bmatrix} u_0 \\\\ u_1 \\\\ u_2 \\\\ u_3 \\\\ u_4 \\\\ v_0 \\\\ v_1 \\\\ v_2 \\\\ v_3 \\\\ v_4 \\\\ c \end{bmatrix} =\begin{bmatrix} r_0 \\\\ r_1 \\\\ r_2 \\\\ r_3 \\\\ r_4 \end{bmatrix} $$

In my real problem, all $r_i$ start at $0$ (on the first iteration of my program), which means the trivial solution $c = 0$ is always returned as the least-squares solution: all $t_i$ have no effect, and all $r_i$ stay at $0$. What I really want is the largest $c$, up to its own positive upper bound $k$, while still finding a least-squares solution for everything else. I could add another row to the matrix so that $c$ simply targets its largest possible value $k$, but I am not sure this is a good idea. I suspect it would dominate the least-squares calculation, because $k$ can be quite large compared to everything else, and I am not sure how to choose a sensible scaling for that row to achieve the right balance.

$$ \scriptsize \begin{bmatrix} 1 - p_0 & 1 & 0 & 0 & 0 & q_0 - 1 & -1 & 0 & 0 & 0 & t_0 \\\\ 1 & 1 - p_1 & 0 & 0 & 0 & -1 & q_1 - 1 & 0 & 0 & 0 & t_1 \\\\ 0 & 0 & 1 - p_2 & 1 & 1 & 0 & 0 & q_2 - 1 & -1 & -1 & t_2 \\\\ 0 & 0 & 1 & 1 - p_3 & 1 & 0 & 0 & -1 & q_3 - 1 & -1 & t_3 \\\\ 0 & 0 & 1 & 1 & 1 - p_4 & 0 & 0 & -1 & -1 & q_4 - 1 & t_4 \\\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u_0 \\\\ u_1 \\\\ u_2 \\\\ u_3 \\\\ u_4 \\\\ v_0 \\\\ v_1 \\\\ v_2 \\\\ v_3 \\\\ v_4 \\\\ c \end{bmatrix} =\begin{bmatrix} r_0 \\\\ r_1 \\\\ r_2 \\\\ r_3 \\\\ r_4 \\\\ k \end{bmatrix} $$
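To make the balancing issue concrete, the extra row can be given an adjustable weight $w$, so the solver trades the target $c \approx k$ against the other residuals. A sketch, using a random stand-in for the coefficient matrix and $w = 1/k$ as one plausible (but not authoritative) normalisation, so the target term has magnitude $O(1)$; the values of $k$ and the bounds are invented:

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, 2 * n))  # stand-in for the real 5x10 coefficient matrix
t = rng.uniform(0.5, 1.5, n)
r = np.zeros(n)                  # first iteration: all r_i are 0
k = 100.0                        # upper bound on c (illustrative)

# Append the t-column so c becomes an unknown, then add a weighted row
# w * c = w * k that pulls c toward k.  The weight w controls how strongly
# this target competes with the other residuals.
A_aug = np.hstack([A, t[:, None]])
w = 1.0 / k                      # heuristic: make the target residual O(1)
extra_row = np.zeros(2 * n + 1)
extra_row[-1] = w
A_big = np.vstack([A_aug, extra_row])
b_big = np.concatenate([r, [w * k]])

lb = np.zeros(2 * n + 1)
ub = np.concatenate([np.full(2 * n, 10.0), [k]])
res = lsq_linear(A_big, b_big, bounds=(lb, ub))
print(res.x[-1])                 # recovered c, no longer stuck at 0
```

With $w = 0$ the trivial $c = 0$ solution returns; larger $w$ pushes $c$ closer to $k$ at the expense of the equation residuals.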

Is there a better way to formulate this type of situation, or is this a known type of problem with a standard solution?

Best answer:

A possible arrangement to determine the optimal $c$. Define the following (indices here run from $1$ to $5$ rather than $0$ to $4$; $\unicode{x1D7D9}_n$ denotes the $n\times n$ all-ones matrix, $\ulcorner x\lrcorner$ the diagonal matrix with the entries of $x$ on its diagonal, and $'$ the transpose):

$ \cases{ p_{12} = (p_1, \ p_2)'\\ p_{35} = (p_3, \ p_4, \ p_5)'\\ q_{12} = (q_1, \ q_2)'\\ q_{35} = (q_3, \ q_4, \ q_5)'\\ P_{12}=\unicode{x1D7D9}_2-\ulcorner p_{12}\lrcorner\\ P_{35}=\unicode{x1D7D9}_3-\ulcorner p_{35}\lrcorner\\ Q_{12}=\unicode{x1D7D9}_2-\ulcorner q_{12}\lrcorner\\ Q_{35}=\unicode{x1D7D9}_3-\ulcorner q_{35}\lrcorner\\ u_{12} = (u_1, \ u_2)'\\ u_{35} = (u_3, \ u_4, \ u_5)'\\ v_{12} = (v_1, \ v_2)'\\ v_{35} = (v_3, \ v_4, \ v_5)'\\ r_{12} = (r_1, \ r_2)'\\ r_{35} = (r_3, \ r_4, \ r_5)'\\ t_{12} = (t_1, \ t_2)'\\ t_{35} = (t_3, \ t_4, \ t_5)'\\ } $

Now minimize

$$ L(u,v,c)=\|P_{12}u_{12}-Q_{12}v_{12}+c t_{12}-r_{12}\|^2+\|P_{35}u_{35}-Q_{35}v_{35}+c t_{35}-r_{35}\|^2 $$

whose residuals match the original equations $Pu - Qv = r - ct$. The stationary conditions, dropping the common factor of $2$, are

$$ \cases{ \frac{\partial L}{\partial u_{12}}=P'_{12}(P_{12}u_{12}-Q_{12}v_{12}+c t_{12}-r_{12})=0\\ \frac{\partial L}{\partial v_{12}}=-Q'_{12}(P_{12}u_{12}-Q_{12}v_{12}+c t_{12}-r_{12})=0 } $$

$$ \cases{ \frac{\partial L}{\partial u_{35}}=P'_{35}(P_{35}u_{35}-Q_{35}v_{35}+c t_{35}-r_{35})=0\\ \frac{\partial L}{\partial v_{35}}=-Q'_{35}(P_{35}u_{35}-Q_{35}v_{35}+c t_{35}-r_{35})=0\\ \frac{\partial L}{\partial c}=t'_{12}(P_{12}u_{12}-Q_{12}v_{12}+c t_{12}-r_{12})+t'_{35}(P_{35}u_{35}-Q_{35}v_{35}+c t_{35}-r_{35})=0\\ } $$

or

$$ \left[\matrix{P'_{12}P_{12}&-P'_{12}Q_{12}&0&0& P'_{12}t_{12}\\ -Q'_{12}P_{12}&Q'_{12}Q_{12}&0&0& -Q'_{12}t_{12}\\ 0 & 0 & P'_{35}P_{35}&-P'_{35}Q_{35}&P'_{35}t_{35}\\ 0 & 0 & -Q'_{35}P_{35}&Q'_{35}Q_{35}&-Q'_{35}t_{35}\\ t'_{12}P_{12}&-t'_{12}Q_{12}& t'_{35}P_{35}&-t'_{35}Q_{35}&t'_{12}t_{12}+t'_{35}t_{35} }\right]\left[\matrix{u_{12}\\ v_{12}\\ u_{35}\\ v_{35}\\ c}\right]=\left[\matrix{P'_{12}r_{12}\\ -Q'_{12}r_{12}\\ P'_{35}r_{35}\\ -Q'_{35}r_{35}\\ t'_{12}r_{12}+t'_{35}r_{35}}\right] $$

Solving this linear system for $u$, $v$ and $c$ furnishes the optimal $c$.
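As an illustration, the blockwise normal equations can be assembled with NumPy, following the question's residual convention $Pu - Qv + ct - r$ and ignoring the bound constraints; the values of $p$, $q$, $r$, $t$ are made up. One caveat worth noting: with $5$ equations and $11$ unknowns the normal matrix is singular, so a plain solve fails and the minimum-norm solution via `lstsq` is used instead.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_block(nb):
    # Illustrative random data for one block of nb coupled equations.
    p, q = rng.uniform(0.1, 0.9, (2, nb))
    r, t = rng.uniform(0.5, 1.5, (2, nb))
    ones = np.ones((nb, nb))
    return ones - np.diag(p), ones - np.diag(q), r, t  # P, Q, r, t

P1, Q1, r1, t1 = make_block(2)   # equations 1-2
P2, Q2, r2, t2 = make_block(3)   # equations 3-5

# Unknown vector x = [u_12, v_12, u_35, v_35, c].
u1, v1, u2, v2, ci = slice(0, 2), slice(2, 4), slice(4, 7), slice(7, 10), 10
M, g = np.zeros((11, 11)), np.zeros(11)

# Blockwise normal-equations matrix, mirroring the displayed system.
M[u1, u1], M[u1, v1], M[u1, ci] = P1.T @ P1, -P1.T @ Q1, P1.T @ t1
M[v1, u1], M[v1, v1], M[v1, ci] = -Q1.T @ P1, Q1.T @ Q1, -Q1.T @ t1
M[u2, u2], M[u2, v2], M[u2, ci] = P2.T @ P2, -P2.T @ Q2, P2.T @ t2
M[v2, u2], M[v2, v2], M[v2, ci] = -Q2.T @ P2, Q2.T @ Q2, -Q2.T @ t2
M[ci, u1], M[ci, v1] = t1 @ P1, -t1 @ Q1
M[ci, u2], M[ci, v2] = t2 @ P2, -t2 @ Q2
M[ci, ci] = t1 @ t1 + t2 @ t2
g[u1], g[v1] = P1.T @ r1, -Q1.T @ r1
g[u2], g[v2] = P2.T @ r2, -Q2.T @ r2
g[ci] = t1 @ r1 + t2 @ r2

# 5 equations, 11 unknowns: M = A'A is singular, so take the
# minimum-norm least-squares solution of the normal equations.
x = np.linalg.lstsq(M, g, rcond=None)[0]
print("c =", x[ci])
```

In practice this solves the same unconstrained problem as stacking the design matrix and calling `lstsq` directly; the bounds from the original question would still require a constrained solver such as `lsq_linear`.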