I want to initialize a non-square matrix whose rows are random basis vectors that are as different from one another as possible in the input space, giving a random over-complete basis set.
If the matrix were square, I could generate a random orthonormal basis set. However, I have more rows than columns, so I would like to initialize the matrix randomly with a well-spread set of basis vectors that cannot all be orthogonal but that generally maximize the angles between them.
For example, with a 2D input space and a 2D output space (a $2\times 2$ matrix), one might have one basis vector along the $x$ axis and one along the $y$ axis. If three basis vectors were needed for a $3\times 2$ matrix, they could be arranged at 120 degrees from each other, so that the output values were as uncorrelated as possible given the over-complete representation.
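To make the 120-degree example concrete, here is a small sketch (plain Python, with names chosen only for illustration) that builds the three unit vectors and checks their pairwise dot products. Each pair comes out at about $-1/2$, which is the lowest value the largest pairwise dot product can take for three unit vectors in the plane.

```python
import math

# Three unit vectors in 2D, spaced 120 degrees apart (rows of a 3x2 matrix).
angles = [2 * math.pi * k / 3 for k in range(3)]
basis = [(math.cos(a), math.sin(a)) for a in angles]

def dot(u, v):
    """Dot product of two 2D vectors."""
    return u[0] * v[0] + u[1] * v[1]

# Dot products for every distinct pair of basis vectors.
pairwise = [dot(basis[i], basis[j]) for i in range(3) for j in range(i + 1, 3)]
print(pairwise)  # each entry is approximately -0.5
```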
I guess one could add basis vectors one by one and repeatedly adjust the set so they spread as far apart as possible, but I'm not sure how to do this. It seems like a tricky iterative optimization problem with some kind of energy function that measures how close the basis vectors are to each other. Ideas?
What's interesting is that the optimal solution is probably a fixed geometric structure for a given matrix size, and the random differences would just be rotations in the input space.
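One way to see why rotations would not matter: applying any orthogonal matrix to the input space leaves every pairwise dot product unchanged, so a rotated copy of a well-spread set is just as well spread. The sketch below assumes NumPy and uses the 120-degree arrangement as the fixed structure; the random orthogonal matrix comes from a QR decomposition of a Gaussian matrix, so it may be a rotation or a reflection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed structure: three unit vectors at 120 degrees (rows of a 3x2 matrix).
angles = 2 * np.pi * np.arange(3) / 3
frame = np.column_stack([np.cos(angles), np.sin(angles)])

# Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
Q, _ = np.linalg.qr(rng.standard_normal((2, 2)))

# Transform every basis vector in the input space.
rotated = frame @ Q.T

# The Gram matrix of pairwise dot products is unchanged by the transformation.
same_gram = np.allclose(rotated @ rotated.T, frame @ frame.T)
print(same_gram)
```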

I figured out how to do it. The method I use is stochastic gradient descent on a loss given by the maximum dot product over all pairs of basis vectors, thereby maximizing the angles between them.
Here's the code:
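A minimal sketch of this approach follows; it is an illustration under assumptions rather than the exact original code. It assumes NumPy, keeps every row at unit length, and takes a plain (deterministic) subgradient step on the currently closest pair with a decaying step size, so the only randomness is in the initialization. The names `spread_basis` and `max_pairwise_dot` and the defaults for `steps` and `lr` are hypothetical choices.

```python
import numpy as np

def max_pairwise_dot(W):
    """Largest dot product between two distinct rows of W (the loss)."""
    G = W @ W.T
    np.fill_diagonal(G, -np.inf)  # ignore each row's product with itself
    return G.max()

def spread_basis(n_rows, n_cols, steps=2000, lr=0.1, seed=0):
    """Randomly initialize unit-norm rows, then push the closest pair apart.

    Sketch only: projected subgradient descent on the maximum pairwise
    dot product, with rows renormalized after every update.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_rows, n_cols))
    W /= np.linalg.norm(W, axis=1, keepdims=True)

    for t in range(steps):
        G = W @ W.T
        np.fill_diagonal(G, -np.inf)
        # The pair (i, j) that attains the maximum defines the subgradient.
        i, j = np.unravel_index(np.argmax(G), G.shape)
        step = lr / np.sqrt(t + 1)  # decaying step size (assumed schedule)
        wi, wj = W[i].copy(), W[j].copy()
        # d<w_i, w_j>/dw_i = w_j and d<w_i, w_j>/dw_j = w_i
        W[i] = wi - step * wj
        W[j] = wj - step * wi
        # Project the two updated rows back onto the unit sphere.
        W[i] /= np.linalg.norm(W[i])
        W[j] /= np.linalg.norm(W[j])
    return W

W = spread_basis(3, 2)
print(W)
print(max_pairwise_dot(W))  # should end up near -0.5 for the 3x2 case
```

For the $3\times 2$ case the rows should settle close to the 120-degree arrangement, up to a rotation determined by the random start. A different `seed` gives a differently oriented set.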