Procrustes algorithm with non-square datasets causing issues with power


Context

I'm trying to use the Procrustes algorithm as defined in this answer. When I run it I get an error saying the matrix isn't square, so it can't compute the matrix power:

ssX = (X0**2.).sum()
numpy.linalg.LinAlgError: Last 2 dimensions of the array must be square

I am trying to run this on two sets of 3D data. Each data set has more than 3 samples (4 points, 3 coordinates each), so the matrices are not square.

m1 = np.matrix([
    [-0.00094267033484768, 3.789799175254147, 2.4715257885677078],
    [0.000008778767356273685, 6.2150805690386655, 2.4982455326331405],
    [-2.79731065127375, 6.2740562225362595, 2.4225278487013413],
    [-3.458455665603633, 4.114474872668245, 2.3984035708594433]
])

m2 = np.matrix([
    [5.899574117219649e-7, 3.791199954596434, 2.4355532529225092],
    [0.000001126895912213531, 6.212222782120383, 2.4355532529225092],
    [-2.795034044707283, 6.270332385163654, 2.419455889353544],
    [-3.4603112770442217, 4.113220009268678, 2.422246989768669]
])
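For reference, the error reproduces directly on this data (my own check, not from the original post), since ** on an np.matrix is a matrix power, which requires a square matrix:

import numpy as np

# m1 is 4x3, so the matrix power is undefined and numpy raises LinAlgError
try:
    m1**2
except np.linalg.LinAlgError as err:
    print(err)  # Last 2 dimensions of the array must be square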

Question

I read that a way around this is to convert the matrix to an array. I'm curious whether that is mathematically equivalent; I assume not. I don't have the original code this was ported from, so I'm not sure, but since you can't raise a non-square matrix to a power, I'm guessing the matrix-power version is wrong and an element-wise power is the actual intention: instead of computing A x A, just square every entry. My math experience is limited, so I'm hoping someone with more experience can tell me whether this is correct. My understanding is that what's being calculated here is the (squared) Frobenius norm.

(np.array(X0)**2).sum()
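As a sanity check (this snippet is mine, using np.linalg.norm, which is not part of the ported code), the element-wise version does match the squared Frobenius norm on a small example:

import numpy as np

X0 = np.array([[1., 2., 3.],
               [4., 5., 6.]])

elementwise = (X0**2).sum()                   # sum of squared entries
frobenius_sq = np.linalg.norm(X0, 'fro')**2   # squared Frobenius norm

print(elementwise, frobenius_sq)              # both print 91.0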

Full Algorithm

Full code example, ported from MATLAB to Python by ali_m:

def procrustes(X, Y, scaling=True, reflection='best'):
    """
    A port of MATLAB's `procrustes` function to Numpy.

    Procrustes analysis determines a linear transformation (translation,
    reflection, orthogonal rotation and scaling) of the points in Y to best
    conform them to the points in matrix X, using the sum of squared errors
    as the goodness of fit criterion.

        d, Z, [tform] = procrustes(X, Y)

    Inputs:
    ------------
    X, Y    
        matrices of target and input coordinates. they must have equal
        numbers of points (rows), but Y may have fewer dimensions
        (columns) than X.

    scaling 
        if False, the scaling component of the transformation is forced
        to 1

    reflection
        if 'best' (default), the transformation solution may or may not
        include a reflection component, depending on which fits the data
        best. setting reflection to True or False forces a solution with
        reflection or no reflection respectively.

    Outputs
    ------------
    d       
        the residual sum of squared errors, normalized according to a
        measure of the scale of X, ((X - X.mean(0))**2).sum()

    Z
        the matrix of transformed Y-values

    tform   
        a dict specifying the rotation, translation and scaling that
        maps Y --> X

    """

    n,m = X.shape
    ny,my = Y.shape

    muX = X.mean(0)
    muY = Y.mean(0)

    X0 = X - muX
    Y0 = Y - muY

    ssX = (X0**2.).sum()
    ssY = (Y0**2.).sum()

    # centred Frobenius norm
    normX = np.sqrt(ssX)
    normY = np.sqrt(ssY)

    # scale to equal (unit) norm
    X0 /= normX
    Y0 /= normY

    if my < m:
        # pad Y0 with zero columns so it has the same number of columns as X0
        Y0 = np.concatenate((Y0, np.zeros((n, m-my))), axis=1)

    # optimum rotation matrix of Y
    A = np.dot(X0.T, Y0)
    U,s,Vt = np.linalg.svd(A,full_matrices=False)
    V = Vt.T
    T = np.dot(V, U.T)

    if reflection != 'best':

        # does the current solution use a reflection?
        have_reflection = np.linalg.det(T) < 0

        # if that's not what was specified, force another reflection
        if reflection != have_reflection:
            V[:,-1] *= -1
            s[-1] *= -1
            T = np.dot(V, U.T)

    traceTA = s.sum()

    if scaling:

        # optimum scaling of Y
        b = traceTA * normX / normY

        # standardised distance between X and b*Y*T + c
        d = 1 - traceTA**2

        # transformed coords
        Z = normX*traceTA*np.dot(Y0, T) + muX

    else:
        b = 1
        d = 1 + ssY/ssX - 2 * traceTA * normY / normX
        Z = normY*np.dot(Y0, T) + muX

    # transformation matrix
    if my < m:
        T = T[:my,:]
    c = muX - b*np.dot(muY, T)
    
    # transformation values
    tform = {'rotation':T, 'scale':b, 'translation':c}
   
    return d, Z, tform
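For reference, building on the array suggestion above (my own experiment, not part of the original port): if I convert both inputs to plain arrays first, the function runs end to end on my example data, since ** is then element-wise.

X = np.asarray(m1)   # view the np.matrix data as plain ndarrays
Y = np.asarray(m2)

d, Z, tform = procrustes(X, Y)
print(d)                   # normalized residual sum of squared errors
print(tform['rotation'])   # 3x3 rotation matrix mapping Y onto X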

2 Answers

BEST ANSWER

Yes, I agree: the intention is that ssX is simply the sum of the squares of every entry of $X_0$, so the **2. looks like it is meant to be an element-wise power.

(In particular, it's a very believable translation error considering the source is MATLAB. I dimly recall that MATLAB distinguishes * from .*: the former is matrix multiplication and the latter is element-wise multiplication, and likewise ^ from .^ for powers. But it's been quite a few years since I've used that software.)
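One way to sidestep the ambiguity entirely (a sketch of mine, not from the original port): NumPy ufuncs such as np.square always operate element-wise, even on an np.matrix:

import numpy as np

M = np.matrix([[1., 2.], [3., 4.]])

print((M**2).sum())        # 54.0: matrix power (M times M), then summed
print(np.square(M).sum())  # 30.0: element-wise squares, 1 + 4 + 9 + 16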

SECOND ANSWER

The problem is that NumPy treats a numpy array differently from a numpy matrix.

I am pretty sure that the code is meant to take its input in the form of a numpy array and not a numpy matrix, since the offending line is (I think) supposed to compute the sum of the squared entries, which makes no sense if the operation were matrix multiplication.

For a numpy matrix the "*" operation will be matrix multiplication:

import numpy as np
matrix1 = np.matrix([[1,2],[3,4]])
print(matrix1*matrix1)

outputs:

[[ 7 10]
 [15 22]]

and "**" will be the matrix power:

print(matrix1**2)

outputs:

[[ 7 10]
 [15 22]]

But for a numpy array "*" is component-wise multiplication and "**" is the component-wise power:

array1 = np.array([[1,2],[3,4]])
print(array1*array1)
print(array1**2)

outputs the same result twice:

[[ 1  4]
 [ 9 16]]

For a numpy array matrix multiplication is instead denoted by "@":

print(array1@array1)

outputs:

[[ 7 10]
 [15 22]]
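So, for the data in the question, a cheap fix is to reinterpret the np.matrix inputs as plain arrays with np.asarray (which returns a view, not a copy) before calling the function. A minimal sketch:

m = np.matrix([[1., 2.], [3., 4.]])
a = np.asarray(m)    # the same data, viewed as a plain ndarray
print((a**2).sum())  # 30.0: element-wise square, as the algorithm intends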