Dimensionality reduction with B-splines - does it have to have an inverse operation?

Part A

From my linear algebra course, my understanding is that:

  1. a change of basis (i.e. when we stay in the same dimension) amounts to a change of coordinates, and is accomplished by simply multiplying the change-of-basis matrix by the original coordinate vector;

  2. a projection, on the other hand, by definition entails a decrease in dimension and [almost?] inevitably a loss of data, and is accomplished via a projection matrix that always involves an inverse (or, more likely, a pseudo-inverse).

Is the above fully accurate? What have I missed?
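
To pin down the notation, here is how I would write the two operations. This is only a sketch of my understanding: the symbols M, B, P, c and the basis names are my own choices, and I am assuming that B has full column rank.

```latex
% Change of basis in R^n: M is an invertible n x n matrix, the dimension stays n
[v]_{\mathcal{F}} = M\,[v]_{\mathcal{E}}, \qquad M \in \mathbb{R}^{n \times n},\ \det M \neq 0

% Orthogonal projection onto the column space of B (n x k, k < n, full column rank)
c = (B^{\top} B)^{-1} B^{\top} y \in \mathbb{R}^{k}, \qquad
\hat{y} = B c = P y, \qquad
P = B (B^{\top} B)^{-1} B^{\top}, \qquad P^{2} = P
```

Here c holds the k coordinates of the projected vector in the basis given by the columns of B, and \hat{y} is the same projected vector expressed back in the original n-dimensional space.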

Part B

(If this is more appropriate for Cross Validated, I'll show myself out.)

In this R code:

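# `data` is, say, a 1000 x 50 data matrix
require(fda)
n = 1000  # number of observations
k = 5     # reduced dimensionality
x = seq(0, 1, length.out = n)
# k B-spline basis functions on [0, 1] (order 4, i.e. cubic, by default)
splinebasis_B = create.bspline.basis(c(0, 1), k)
# n x k matrix: each basis function evaluated at every point of x
base_B = eval.basis(x, splinebasis_B)
# note: `%*%` requires ncol(data) == nrow(base_B)
data_reduced = data %*% base_B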

As I understand it, the last operation does not project (or otherwise reduce) data just yet; rather, it gives us data_reduced, on which we can run some fitting algorithms (regressions, trees, etc.).

For this reason, we cannot claim that data_reduced is a compressed version of data.

Is this correct? Am I missing anything? What's the right way to think about data_reduced? It's clearly mapped to a lower-dimensional space, and there is loss, since some of data's variance is collapsed when multiplying by a matrix with fewer columns... but I feel that data_reduced is somehow not "optimal". It's not the best representation of data in k-dimensional space... What is it then?
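
To make the question concrete, here is a minimal sketch comparing the raw product with a least-squares projection onto the same basis. It is not my actual pipeline, and several things are assumptions made purely for illustration: the curves are simulated and stored as rows (so that the product conforms), `splines::bs` from base R stands in for the fda basis, and the names `Y`, `B`, `inner`, `coefs`, `fit_ls` and `fit_naive` are hypothetical.

```r
library(splines)  # ships with R; used here as a stand-in for the fda basis

set.seed(1)
n <- 1000  # number of sample points per curve
m <- 50    # number of curves (assumed)
k <- 5     # reduced dimensionality
x <- seq(0, 1, length.out = n)

# n x k cubic B-spline basis matrix (plays the role of base_B)
B <- unclass(bs(x, df = k, intercept = TRUE))

# Simulated data: m smooth curves plus noise, stored as an m x n matrix
Y <- t(sapply(seq_len(m), function(i) sin(2 * pi * i / m + 3 * x))) +
  matrix(rnorm(m * n, sd = 0.1), m, n)

# Raw inner products of each curve with the basis functions
# (this is what a product of the form data %*% base_B computes)
inner <- Y %*% B                            # m x k

# Least-squares coefficients: solve (B'B) c = B'y for every curve
coefs <- t(solve(crossprod(B), t(inner)))   # m x k

# Reconstructions back in the original n-dimensional space
fit_ls    <- coefs %*% t(B)   # orthogonal projection onto span(B)
fit_naive <- inner %*% t(B)   # uses the inner products as if they were coefficients

err_ls    <- sum((Y - fit_ls)^2)
err_naive <- sum((Y - fit_naive)^2)
cat("LS error:", err_ls, " naive error:", err_naive, "\n")
```

In this sketch `inner` and `coefs` differ only by the k x k factor `solve(crossprod(B))`, so they carry the same information, but only `coefs` are the coordinates of the best least-squares approximation in the span of the basis.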

Appreciate your help.