How to use a projection to solve this linear system of equations?


I am watching a lecture from my professor at MIT, and he says that we can solve this system of linear equations using a projection. I'm not able to see how it's done.

We are solving for $dx$ and $ds$ which are vectors:

$\mu \, dx + ds = \mu \mathbf{1}$, where $\mu$ is a real scalar and $\mathbf{1}$ is the all-ones vector.

$ A dx = 0$

$(dy) A + ds = 0$

He says that since we know:

$dx \in A$

$ds \in A^\perp$

…then we can use a projection to solve for $dx$ and $ds$. (We will get $dy$ too, I think, but we don't care about its value.) He says:

“Find a vector, that’s the all ones vector, and break it up into a part that is inside of A and a part that is perpendicular to A. This is a projection”.
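In other words (my paraphrase of the quote, with $P$ the orthogonal projection onto a subspace $S$ spanned by the columns of some basis matrix $B$):

$$\mu \mathbf{1} = \underbrace{P \, \mu \mathbf{1}}_{\in \, S} + \underbrace{(I - P) \, \mu \mathbf{1}}_{\in \, S^\perp}, \qquad P = B \, (B^T B)^{-1} B^T$$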

I have not had any luck solving this, and I have a few confusions about it:

  • 1) The equation $ A dx = 0$ does not mean that $dx$ is inside of A. It means it is in the nullspace of A, so I think that the professor is wrong here in saying that.

  • 2) The professor is also wrong in saying that it is the "all ones" vector we are projecting: it is a scaled version of the all-ones vector, which admittedly doesn't change the direction, but does change the solution to the linear system.

  • 3) No matter what I use for the matrix A, when I solve this using a projection, I always get the $ds$ vector as the zero vector, and the $dx$ vector as having all non-zero values. This causes most of the equations to be violated. E.g., I get values for $dx$ s.t. $A dx \ne 0$
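To make point 1 concrete, here is a tiny check (my own toy example, not from the lecture):

```python
import numpy as np

# A @ dx = 0 means dx is in the nullspace of A, i.e. dx is orthogonal
# to every ROW of A; it need not lie in the columnspace of A.
A = np.array([[1.0, 1.0]])
dx = np.array([[1.0], [-1.0]])

print(A @ dx)           # [[0.]] -- dx is in null(A)
print(A[0] @ dx[:, 0])  # 0.0   -- dx is perpendicular to the row of A
```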

How do I solve this using a projection of the all ones vector?

P.S. I am watching https://www.youtube.com/watch?v=78sNnf3pOYs at approximately minute 53.

Here is all of my code to accomplish this:

import numpy as np

def project(A, mu):
    ''' 
    Args:
        A (numpy array): rescaled matrix of values for constraints Ax=b
           (remember that we are assuming that the rows of A are linearly independent, 
            which in linear programming means that the equality constraints are not redundant)
        mu (float): the scalar mu that scales the all-ones vector
    Output:
        dx (numpy array): projection of mu*1 onto col(A)
        ds (numpy array): component of mu*1 orthogonal to col(A)
    Raises:
        ValueError if the dimensions of A and mu*1 do not match up.
    '''
    e = np.ones((A.shape[1], 1))

    # get basis vectors of the columnspace of A:
    # the rank tells us how many columns of Q to keep
    n = np.linalg.matrix_rank(A)
    # get QR decomposition
    Q, R = np.linalg.qr(A)
    # the first n columns of Q form an orthonormal basis for the columnspace of A
    A_basis = Q[:, :n]
    # remember that a projection is just a linear transformation:
    proj_A = A_basis @ np.linalg.inv(A_basis.T @ A_basis) @ A_basis.T
    # sanity check: a projection matrix has eigenvalues that are all 0 or 1
    proj_A_eigs = np.linalg.eigvals(proj_A)
    assert np.all(np.isclose(proj_A_eigs, 1.0) | np.isclose(proj_A_eigs, 0.0)), \
        'proj_A does not look like a projection matrix; its eigenvalues are ' + str(proj_A_eigs)

    # proj_A is a projection matrix, so we can just hit mu*1 with it
    # to get the projection of the (scaled) ones vector
    mu_vector = mu * e
    dx = proj_A @ mu_vector

    # but we also need the *orthogonal* component to get ds:
    # the original vector minus its projection
    ds = mu_vector - dx

    return dx, ds

You can see that the steps I take are fairly simple:

  • take as input $A$ and $\mu$
  • get projection matrix of $A$ by first getting basis vectors of $A$ using QR decomposition
  • use the projection matrix to project the $\mu$ vector onto $A$ (where what I'm calling "mu vector" is just $\mu1$)
  • get orthogonal projection too, by just subtracting $dx$ from the $\mu$ vector

As a sanity check, I made sure that the variable proj_A in the code has eigenvalues of either 0 or 1, as the theory says.
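In fact, the symptom in point 3 seems forced: for any square, full-rank $A$, the columnspace is all of $\mathbb{R}^n$, so the projection matrix onto col(A) is the identity and $ds$ is necessarily zero. Here is a stripped-down version of my code with an arbitrary invertible $A$:

```python
import numpy as np

# For a square, invertible A, col(A) is all of R^n, so the projector
# onto col(A) is the identity: dx absorbs all of mu*1 and ds = 0,
# which is exactly the symptom described in point 3.
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
mu = 0.5

Q, _ = np.linalg.qr(A)
basis = Q[:, :np.linalg.matrix_rank(A)]  # orthonormal basis for col(A)
proj = basis @ basis.T                   # projector onto col(A); here the identity
mu_vec = mu * np.ones((A.shape[1], 1))

dx = proj @ mu_vec
ds = mu_vec - dx
print(dx.ravel())  # [0.5 0.5] -- all of mu*1
print(ds.ravel())  # [0. 0.]   -- ds is the zero vector
```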

How do I solve this using a projection of the all ones vector?


Starting at 53:26, the speaker has swapped the symbols $A$ and $A^\perp$. If $A \cdot \overline{dx} = \overline{0}$ (where I use overlines to indicate vectors because the MathJax "$\vec{dx}$" is awful), then $\overline{dx}$ is perpendicular to each row vector of $A$, so $\overline{dx} \in A^\perp$.

Note that $\mu \, \overline{dx} + \overline{ds} = \overline{1} \, \mu$ can be rescaled to $\overline{dx} + \frac{1}{\mu} \overline{ds} = \overline{1}$, so we can just as easily think of this as splitting $\overline{1}$ into a component perpendicular to the $Ax = b$ constraint (which gives $\overline{dx}$) and a scaled copy of the component parallel to it (which gives $\overline{ds}$). Some prefer the equation with only one mention of $\mu$, and some prefer not having the division; both represent equivalent computations.

When I try the easiest possible version of solving this system, with $A = 1$ (the $1 \times 1$ matrix containing $1$), I get $\overline{dx} = 0$, $\overline{dy} = -\mu$, and $\overline{ds} = \mu$. Are you sure you're solving the system correctly?

Edit:

The next simplest computation is $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Projecting along the rows of $A$, we get $$ A \cdot \overline{dx} + \frac{1}{\mu} A \cdot \overline{ds} = A \cdot \overline{1} \text{.} $$ (In a little detail: the first row of $A \cdot \overline{dx}$ is the dot product of the first row of $A$ and $\overline{dx}$, and the second row of $A \cdot \overline{dx}$ is the dot product of the second row of $A$ and $\overline{dx}$. So $A \cdot \overline{dx}$ gives the components of $\overline{dx}$ along the rows of $A$. Similarly for $A \cdot \overline{ds}$.) The first term is zero by the $A^\perp$ condition. Since $A$ is the identity, we have $$\frac{1}{\mu} \overline{ds} = \overline{1} $$ so $\overline{ds} = \mu \overline{1}$. Then we compute $\overline{dx} = \overline{1} - \frac{1}{\mu} \overline{ds} = \overline{1} - \overline{1} = \overline{0}$, matching the $1 \times 1$ case above.

From your code

remember that we are assuming that the rows of A are linearly independent

this says the rows of $A$ form a basis, and that we can project onto the rowspace they span. Your code seems to project onto the columnspace; that would be a projection onto the rowspace of $A^T$...
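To make the fix concrete, here is a sketch of the corrected computation (with my own arbitrary $A$ that has independent rows): QR-decompose $A^T$, not $A$, so that we project onto the rowspace.

```python
import numpy as np

# ds lives in row(A) and dx in null(A) = row(A)^perp, so project
# mu*1 onto col(A^T), i.e. the rowspace of A. (Arbitrary example A.)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])   # two independent rows, three variables
mu = 0.5
ones = np.ones((A.shape[1], 1))

Q, _ = np.linalg.qr(A.T)          # orthonormal basis for row(A)
P_row = Q @ Q.T                   # projector onto row(A)

ds = mu * (P_row @ ones)          # mu times the component of 1 in row(A)
dx = ones - P_row @ ones          # component of 1 in null(A)

print(np.allclose(A @ dx, 0))               # True: A dx = 0 holds
print(np.allclose(mu * dx + ds, mu * ones)) # True: mu dx + ds = mu 1
```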