I am watching a lecture from my professor at MIT, and he says that we can solve this system of linear equations using a projection. I'm not able to see how it is done.
We are solving for $dx$ and $ds$, which are vectors:
$\mu \, dx + ds = 1 \mu$, where $\mu$ is a scalar (a real number) and $1$ is the all-ones vector.
$ A dx = 0$
$(dy) A + ds = 0$
He says that since we know:
$dx \in A$
$ds \in A^\perp$
…then we can use a projection to solve for $dx$ and $ds$ (we will get $dy$ too, I think, but we don't care about its value). He says:
“Find a vector, that’s the all ones vector, and break it up into a part that is inside of A and a part that is perpendicular to A. This is a projection”.
I have not had any luck solving this, and I have a few points of confusion about it.
1) The equation $A \, dx = 0$ does not mean that $dx$ is inside of $A$. It means that $dx$ is in the nullspace of $A$, so I think the professor is wrong in saying that.
2) The professor is also wrong in saying that it is the "all ones" vector we are projecting: it is a scaled version of the all-ones vector, which admittedly doesn't change the direction, but it does change the solution to the linear system.
3) No matter what matrix $A$ I use, when I solve this using a projection I always get $ds$ equal to the zero vector and $dx$ with all non-zero entries. This causes most of the equations to be violated; e.g., I get values for $dx$ such that $A \, dx \ne 0$.
How do I solve this using a projection of the all ones vector?
P.S. I am watching https://www.youtube.com/watch?v=78sNnf3pOYs at approximately minute 53.
Here is all of my code to accomplish this:
import numpy as np

def project(A, mu):
    '''
    Args:
        A (numpy matrix): rescaled matrix of values for constraints Ax=b
            (remember that we are assuming that the rows of A are linearly independent,
            which in linear programming means that the equality constraints are not redundant)
        mu (numpy array): numpy array of mu times the ones vector
    Output:
        dx (numpy array): projection of mu onto col(A)
        ds (numpy array): projection of mu onto null(A)
    Raises:
        ValueError if the dimensions for A and mu do not match up.
    '''
    e = np.asmatrix(np.ones(A.shape[1])).T
    # get basis vectors of the columnspace of A:
    # get matrix rank
    n = np.linalg.matrix_rank(A)
    # get QR decomposition
    Q, R = np.linalg.qr(A)
    # get first "n" columns of Q because they form a basis for the columnspace of A
    A_basis = np.asmatrix(Q)[:, 0:n]
    A = A_basis
    # remember that a projection is just a linear transformation:
    proj_A = A * np.linalg.inv(A.T * A) * A.T
    check_eigs = np.vectorize(lambda x: np.isclose(float(x), 1.0) or np.isclose(float(x), 0.0))
    proj_A_eigs = np.linalg.eig(proj_A)[0]
    assert np.alltrue(check_eigs(proj_A_eigs)), 'projection matrix proj_A does not look like a projection matrix. the eigenvalues are ' + str(proj_A_eigs)
    # proj_A is a projection matrix, so we can just hit mu with it to get the projection of the ones vector
    mu_vector = mu * e
    dx = proj_A * mu_vector
    # but we also need the *orthogonal* projection to get ds
    # orthogonal projection is the original vector minus the projection
    ds = mu_vector - dx
    return dx, ds
You can see that the steps I take are fairly simple:
- take as input $A$ and $\mu$
- get projection matrix of $A$ by first getting basis vectors of $A$ using QR decomposition
- use the projection matrix to project the $\mu$ vector onto $A$ (where what I'm calling "mu vector" is just $\mu1$)
- get orthogonal projection too, by just subtracting $dx$ from the $\mu$ vector
As a sanity check, I made sure that the variable in the code proj_A has eigenvalues of either 0 or 1, as the theory says.
How do I solve this using a projection of the all ones vector?
Starting at 53:26, the speaker has swapped the symbols $A$ and $A^\perp$. If $A \cdot \overline{dx} = \overline{0}$ (where I use overlines to indicate vectors because the MathJax "$\vec{dx}$" is awful), then $dx$ is perpendicular to each row vector in $A$, so $dx \in A^\perp$.
Note that $\mu \, \overline{dx} + \overline{ds} = \overline{1} \, \mu$ can be rescaled to $\overline{dx} + \frac{1}{\mu} \overline{ds} = \overline{1}$, so we can just as easily think of this as projecting $\overline{1}$ into a "perpendicular to the $Ax = b$ constraint ($\overline{dx}$) component" and a "scaled copy of the parallel component". Some prefer the equation with only one mention of $\mu$ and some prefer not having the division. Both represent equivalent computations.
When I try the easiest possible version of solving this system, with $A = 1$ (the $1 \times 1$ matrix containing $1$), I get $\overline{dx} = 0$, $\overline{dy} = -\mu$, and $\overline{ds} = \mu$. Are you sure you're solving the system correctly?
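For what it's worth, this tiny case is easy to sanity-check numerically. A minimal sketch (assuming numpy; the matrix `M` and vector `b` below simply stack the three equations in the unknowns $(dx, dy, ds)$, with $A$ the $1 \times 1$ matrix $1$ and an arbitrarily chosen $\mu$):

```python
import numpy as np

# Sanity check of the 1x1 case, with A = [1] and an arbitrary mu.
# Unknowns ordered (dx, dy, ds); the three equations are:
#   mu*dx      + ds = mu     (mu dx + ds = mu 1)
#   dx              = 0      (A dx = 0)
#        dy    + ds = 0      (dy A + ds = 0)
mu = 0.7
M = np.array([[mu, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([mu, 0.0, 0.0])
dx, dy, ds = np.linalg.solve(M, b)
# dx == 0, dy == -mu, ds == mu
```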
Edit:
The next simplest computation is $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Projecting along the rows of $A$, we get $$ A \cdot \overline{dx} + \frac{1}{\mu} A \cdot \overline{ds} = A \cdot \overline{1} \text{.} $$ (In a little detail: the first row of $A \cdot \overline{dx}$ is the dot product of the first row of $A$ and $\overline{dx}$, and the second row of $A \cdot \overline{dx}$ is the dot product of the second row of $A$ and $\overline{dx}$; these dot products vanish exactly when $\overline{dx}$ is perpendicular to the rowspace of $A$. Similarly for $A \cdot \overline{ds}$.) The first term is zero by the $A^\perp$ condition. Since $A$ is the identity, we have $$\frac{1}{\mu} \overline{ds} = \overline{1} $$ and $\overline{ds} = \mu \overline{1}$. Then we compute $\overline{dx} = \overline{1} - \frac{1}{\mu} \overline{ds} = \overline{1} - \overline{1} = \overline{0}$, consistent with $A \cdot \overline{dx} = \overline{0}$ and with the $1 \times 1$ case above.
From your code: the docstring notes that the rows of $A$ are linearly independent, so they are a basis and we can project onto the rowspace they span. Your code, however, seems to project onto the columnspace. That would be projection onto $A^T$...
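Concretely, a minimal sketch of the fix, keeping your function's signature but assuming a scalar `mu` and a full-row-rank `A`: take the QR decomposition of $A^T$ instead of $A$, so that the columns of $Q$ span the rowspace of $A$.

```python
import numpy as np

def project(A, mu):
    '''Split mu*1 into ds in row(A) plus mu*dx in null(A).
    Sketch only: assumes A has full row rank and mu is a nonzero scalar.'''
    e = np.ones(A.shape[1])
    Q, _ = np.linalg.qr(A.T)      # columns of Q: orthonormal basis of row(A)
    ds = Q @ (Q.T @ (mu * e))     # project mu*1 onto the ROW space of A
    dx = e - ds / mu              # so that mu*dx + ds = mu*1 and A dx = 0
    return dx, ds
```

With this version `A @ dx` comes out (numerically) zero, and `mu * dx + ds` reproduces `mu * 1`, for any full-row-rank `A`.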