Build matrix from existing vectors such that condition number of matrix is minimal


I have a set of $K\gg1$ vectors, each belonging to $\mathbb{R}^N$, where $N\in\mathbb{N}, N> 2$.

My goal is to select a subset of $M < K$ vectors from this set of $K$ vectors in order to form a matrix $A\in\mathbb{R}^{N \times M}$, where the columns are the $M$ chosen vectors. The objective is to choose these $M$ vectors in a way that minimizes the condition number $\kappa(A)$ of the resulting matrix.

Are there any common numerical algorithms for this problem? An approximate solution is sufficient.

What I have implemented so far is a naive greedy solver that works as follows:

Initialization:
    Choose a predetermined starting vector from the given set.

Iteration Loop (continue until $M$ vectors have been selected):

For each vector not yet among the chosen vectors:
a. Tentatively append it as an additional column to the current matrix.
b. Compute the condition number of the resulting augmented matrix.

Selection:
Permanently add the vector that minimizes the condition number of the augmented matrix.

Repeat these steps until the desired number of vectors, $M$, has been selected.

This works for small problems, but as $K$ grows I run into computational problems, so I am still looking for a more efficient approach.
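For reference, the greedy procedure described above can be sketched in Python/NumPy (a sketch of the idea, not the questioner's actual code; the given vectors are assumed to be the columns of an array X):

```python
import numpy as np

def greedy_min_cond(X, M, start=0):
    """Greedily pick M columns of X, at each step adding the column
    that minimizes the condition number of the selected submatrix."""
    K = X.shape[1]
    chosen = [start]
    remaining = set(range(K)) - {start}
    for _ in range(M - 1):
        best, best_kappa = None, np.inf
        for j in remaining:
            kappa = np.linalg.cond(X[:, chosen + [j]])
            if kappa < best_kappa:
                best, best_kappa = j, kappa
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Example usage with random data
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
cols = greedy_min_cond(X, 3)
```

Each outer step recomputes the condition number of up to $K$ candidate matrices, which is exactly the cost that becomes prohibitive for large $K$.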

Accepted answer:

The 2-norm condition number of a general matrix $A \in \mathbb{R}^{m \times n}$ is given by the ratio of the extreme singular values, specifically $$\kappa(A) = \frac{\sigma_{1}}{\sigma_{k}},$$ where $k = \min\{m,n\}$, $\sigma_1$ is the largest singular value, and $\sigma_k$ is the smallest. We shall consider the case of $m \ge n$, where the matrix $A$ is tall. If $m < n$, then the matrix is wide, its columns are necessarily linearly dependent, and the smallest singular value relevant to the columns is automatically $0$, so the condition number is either undefined ($0/0$) or $\infty$.
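In code, this is just the ratio of the largest and smallest singular values; a quick NumPy check (my own illustration, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((7, 4))             # a tall matrix, m >= n

sigma = np.linalg.svd(A, compute_uv=False)  # singular values, in descending order
kappa = sigma[0] / sigma[-1]                # kappa(A) = sigma_1 / sigma_k

# Agrees with NumPy's built-in 2-norm condition number
assert np.isclose(kappa, np.linalg.cond(A, 2))
```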

The suggestion by @whpowell96 is closely related to computing a QR factorization of the matrix $X$ given by $$ X = \begin{bmatrix} x_1 & x_2 & \dotsc & x_K \end{bmatrix} \in \mathbb{R}^{N \times K}. $$ Here the columns of $X$ are simply the vectors that are given in advance. Specifically, we compute a QR factorization with column pivoting, $$X \Pi = QR, $$ where $\Pi \in \mathbb{R}^{K \times K}$ is a permutation matrix, $Q \in \mathbb{R}^{N \times N}$ is an orthogonal matrix, and $R \in \mathbb{R}^{N \times K}$ is an upper triangular matrix. Moreover, the permutation matrix is constructed to ensure that the diagonal entries of $R$ form a nonincreasing sequence. This precisely ensures the greedy property that @whpowell96 is pursuing: the next vector is automatically chosen so that the norm of its projection onto the orthogonal complement of the subspace spanned by the previously chosen vectors is maximal. This particular variant of the QR algorithm is discussed in detail in Golub and Van Loan's textbook "Matrix Computations". It is primarily used as a rank-revealing algorithm, which is why it is also known as the rank-revealing QR algorithm. When the algorithm completes, you choose the first $M$ columns of $X\Pi$. They get selected because they were as far as possible from being linearly dependent on one another.
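If you do not need your own implementation, the same column selection is available off the shelf; a sketch using SciPy's `scipy.linalg.qr` with `pivoting=True` (my own illustration, assuming the given vectors are the columns of X):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(2)
N, K, M = 6, 50, 4
X = rng.standard_normal((N, K))

# QR with column pivoting: X[:, piv] = Q @ R, with |diag(R)| nonincreasing
Q, R, piv = qr(X, pivoting=True)

# The first M pivot indices identify the greedily selected vectors
A = X[:, piv[:M]]
kappa = np.linalg.cond(A)
```

MATLAB's built-in `[Q,R,P] = qr(X)` with three outputs performs the same pivoted factorization.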

A MATLAB implementation of the QR algorithm with column pivoting follows below, along with its dependencies:

function [Q, R, pi]=ccGivensQRCP(A)

% ccGivensQRCP  Computes QR factorization with column pivoting
% 
% CALL SEQUENCE: [Q, R, pi]=ccGivensQRCP(A)
%
% INPUT:
%    A      a real m by n matrix
%
% OUTPUT:
%    Q, R   real matrices such that A(:,pi) = QR where
%             Q  is an m by m orthogonal matrix,
%             R  is an m by n upper triangular matrix
%             pi is a permutation of length n
%
% Moreover, the diagonal elements of R are nonincreasing
% 
% MINIMAL WORKING EXAMPLE: ccGivensQRCP_mwe1

% PROGRAMMING by Carl Christian Kjelgaard Mikkelsen ([email protected])
%   2022-10-04  Code adapted from ccGivensQR

% Extract the dimension of the matrix
[m, n]=size(A);

% Initialize the output
Q=eye(m,m); R=A; pi=1:n;

% Allocate space for the column norms of A
colnorm=zeros(1,n);
% Compute the norm of the n columns of A
for j=1:n
    colnorm(j)=norm(R(:,j),2);
end

% Loop over the columns of R
for j=1:n
    % Find the column that has the largest norm
    idx=j;
    for i=j+1:n
        if colnorm(i)>colnorm(idx)
            idx=i;
        end
    end
    % At this point we know that colnorm(idx)=max(colnorm(j:n))
    
    if idx>j 
        % Swap columns j and idx     
        R(:,[j idx])=R(:,[idx j]);
        % Swap the column norms
        colnorm([j idx])=colnorm([idx j]); 
        % Register the swap as well
        pi([j idx])=pi([idx j]);
    end
    % At this point, we know that the dominant column is leading    
        
    % Process the subdiagonal entries of R(:,j) from the bottom up
    for i=m-1:-1:j
        % Givens rotation needed to nullify R(i+1,j) using R(i,j)
        [c, s, r]=ccGivens(R(i,j),R(i+1,j));
        % Define the matrix that represents Givens rotation 
        G=[c -s; s c]; 
        % Apply the rotation to rows i:i+1 ignoring leading zeros
        R(i:i+1,j+1:end)=G'*R(i:i+1,j+1:end);
        % Explicitly set the entries of R(i:i+1,j)
        R(i:i+1,j)=[r; 0];
        % Accumulate the transformation into Q
        Q(:,i:i+1)=Q(:,i:i+1)*G;
    end
    
    % Compute the norms of the columns of the matrix R(j+1:end,j+1:end)
    % Certainly, this can be done faster, but the slow procedure does 
    % not invite subtractive cancellation.
    for k=j+1:n
        colnorm(k)=norm(R(j+1:end,k),2);
    end
end

The Givens rotation is computed as follows:

function [c, s, r]=ccGivens(x,y)

% ccGivens  Computes a Givens rotation
%
% CALL SEQUENCE: [c, s, r]=ccGivens(x,y)
%
% INPUT:
%   x    real number
%   y    real number
%
% OUTPUT:
%   c,s  the cosine and sine such that
%          [c -s; s c]'*[x; y] = [r; 0]
%   r    the norm of the vector [x; y]
%
% MINIMAL WORKING EXAMPLE: ccGivens_mwe1

% PROGRAMMING by Carl Christian Kjelgaard Mikkelsen ([email protected])
%   2020-08-20  Initial programming and testing

% Compute the norm of the vector [x,y]
r=norm([x y]);

% Is this a special case?
if abs(x)~=0
    % No, apply the standard formula
    c=x/r; s=y/r;
else
    % Yes, rotate 90 degrees
    c=0; s=1;
end

A minimal working example that highlights some features of the algorithm is given here:

% Compute a QR factorization with column pivoting.

% Set the dimension of the tall matrix
m=7; n=6; 

% Select the rank r<=n and the number of dependent columns k
r=3; k=n-r;

% Set the random seed
rng(2022);

% //////////////////////////////////////////////////////////
% Generate a random matrix of rank r
% //////////////////////////////////////////////////////////

% Almost certainly A will have r linearly independent columns 
A=rand(m,r); 
% Generate a random coefficient matrix and extend A to n columns
B=rand(r,k); A=[A A*B]; 
% The extra columns are linear combinations of the previous columns
% so we are not increasing the rank

% Shuffle the columns of A randomly
p=randperm(n); A=A(:,p);

% At this point A is almost certainly an m by n matrix of rank r.

% Compute the QR factorization
[Q, R, pi]=ccGivensQRCP(A);

% /////////////////////////////////////
% Verify the factorization
% /////////////////////////////////////

% Compute the residual S
S=A(:,pi)-Q*R; 

% Compute the relative residual
rres=norm(S)/norm(A);

% Check orthogonality
orth=norm(eye(m,m)-Q'*Q,2);

% Display results
fprintf('Relative residual = %8.4e\n',rres);
fprintf('Orthogonality     = %8.4e\n',orth);

% Change the format so that tiny nonzeros are emphasized
format short e

% Display the structure of R
display(R);
% Notice that the diagonal entries decay
% Notice that R has a bunch of tiny entries
% Notice that it is clear that R is close to a matrix of rank r.
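As a sanity check on the selection strategy itself, one can compare the condition number of the first $M$ pivoted columns against a few random choices of $M$ columns (a Python/NumPy sketch of my own, not part of the answer above; pivoted QR is a greedy heuristic, so it is not guaranteed to beat every random draw):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(3)
N, K, M = 8, 100, 5
X = rng.standard_normal((N, K))

# Condition number of the first M pivoted columns
_, _, piv = qr(X, pivoting=True)
kappa_piv = np.linalg.cond(X[:, piv[:M]])

# Condition numbers of random M-column selections, for comparison
kappa_rand = [np.linalg.cond(X[:, rng.choice(K, M, replace=False)])
              for _ in range(200)]

print(f"pivoted: {kappa_piv:.3f}, random median: {np.median(kappa_rand):.3f}")
```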