I understand that L needs to be square, but all the examples I could find of decomposing non-square matrices have more columns than rows. How should L be made square when the number of rows is greater than the number of columns?
CUDA's LU decomposition function overwrites the matrix A with its factors. This causes both L and U to have the dimensions of the original matrix, and I need to figure out how to square the L factor when A is non-square.
As a guess, I am padding the L matrix returned by the CUDA library's LU decomposition function with ones on the diagonal and zeroes everywhere else, but that does not seem to be the correct choice.
Going by the online calculator, it seems that my initial guess of padding the off-diagonal elements of L with zeroes and the diagonal with ones is correct. It was not working for me because the CUDA library, rather than solving for L, was leaving the elements of the original matrix where they were. This turned out to be caused by my passing the arguments in the incorrect order.
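
For anyone who wants to check the padding by hand, here is a minimal pure-Python sketch of the idea. It is not the CUDA call itself. It assumes the usual packed layout (the multipliers of L stored below the diagonal, U on and above it, and the unit diagonal of L implied), uses row-major nested lists, and ignores pivoting, which a real library routine would normally report as a separate row permutation. The helper names `lu_packed_no_pivot`, `unpack_square_l` and `matmul` are made up for this illustration.

```python
def lu_packed_no_pivot(a):
    """Doolittle LU without pivoting; returns L and U packed into one m x n matrix."""
    m, n = len(a), len(a[0])
    lu = [[float(x) for x in row] for row in a]
    for k in range(min(m, n)):
        for i in range(k + 1, m):
            lu[i][k] /= lu[k][k]              # multiplier, stored in L's slot
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def unpack_square_l(lu):
    """Split packed m x n factors (m >= n) into a square m x m L and an m x n U."""
    m, n = len(lu), len(lu[0])
    # Start from zeroes, put ones on the whole diagonal, then copy the
    # strictly lower part of the first n columns from the packed result.
    l = [[0.0] * m for _ in range(m)]
    for i in range(m):
        l[i][i] = 1.0
        for j in range(min(i, n)):
            l[i][j] = lu[i][j]
    # U keeps the upper triangle; rows n..m-1 are entirely zero.
    u = [[lu[i][j] if i <= j else 0.0 for j in range(n)] for i in range(m)]
    return l, u


def matmul(x, y):
    """Plain triple-loop matrix product, used only to verify L * U == A."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


# A 3 x 2 example: more rows than columns.
a = [[2, 1], [4, 5], [6, 9]]
lu = lu_packed_no_pivot(a)
l, u = unpack_square_l(lu)
```

With this input, `l` is 3 x 3 and `u` is 3 x 2, and `matmul(l, u)` reproduces `a`. The extra column of `l` is just a column of the identity, which is the padding described above; it multiplies the zero row of `u`, so it does not change the product.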