I was trying to prove that the biorthogonal projector matrix $P$ given on the Wikipedia page on biorthogonal systems does in fact construct biorthogonal systems from input bases $\mathbf{u}$ and $\mathbf{v}$. However, I ran into trouble proving it, and tests in Mathematica suggest that it doesn't actually work (either that, or I'm misunderstanding the notation).
According to Wikipedia, given input bases $\mathbf{u}=\{\mathbf{u}_1,\mathbf{u}_2,...,\mathbf{u}_n\}$ and $\mathbf{v}=\{\mathbf{v}_1,\mathbf{v}_2,...,\mathbf{v}_n\}$, we can create biorthonormal bases $\tilde{\mathbf{u}}$ and $\tilde{\mathbf{v}}$ by first constructing a projector $$P=\sum_{i,j}(S^{-1})_{ji}\mathbf{u}_i\otimes\mathbf{v}_j$$ where $$S_{ij}=\left<\mathbf{v}_i,\mathbf{u}_j\right>$$ and where $\otimes$ has been defined so that $(\mathbf{a}\otimes\mathbf{b})\mathbf{c}=\mathbf{a}\left<\mathbf{b},\mathbf{c}\right>$. Then the new biorthogonal system is given by $$\tilde{\mathbf{u}}_i=(I-P)\mathbf{u}_i$$ and $$\tilde{\mathbf{v}}_i=(I-P^\dagger)\mathbf{v}_i.$$
Before posting my attempted derivation which failed and asking for people to find the mistake there, I think it'd be easier to post the following Mathematica code, which attempts to test the projection theorem on a pair of randomly-generated $3\times3$ bases:
u = RandomComplex[{-1 - I, 1 + I}, {3, 3}];
v = RandomComplex[{-1 - I, 1 + I}, {3, 3}];
Sinv = Inverse[Table[Conjugate[v[[i]]].u[[j]], {i, 3}, {j, 3}]];
P = Sum[Sinv[[j, i]] (u[[i]]\[TensorProduct]Conjugate[v[[j]]]), {i,
3}, {j, 3}];
i3 = IdentityMatrix[3];
uP[i_] := (i3 - P).u[[i]];
vP[i_] := (i3 - P\[ConjugateTranspose]).v[[i]];
MatrixForm[Table[Conjugate[vP[i]].uP[j], {i, 3}, {j, 3}]]
(*Out*)
{{7.59523 - 68.8161 I,
39.1464 - 95.1744 I, -11.6915 + 54.6066 I}, {-25.9465 -
50.3697 I, -16.2526 - 82.5312 I,
16.5557 + 42.4485 I}, {61.6702 - 33.3398 I,
101.63 - 22.3467 I, -51.9317 + 21.7888 I}}
As is plainly obvious, the biorthogonality condition fails utterly for the projected bases $\tilde{\mathbf{u}}_i:=\mbox{uP}[[i]]$ and $\tilde{\mathbf{v}}_i:=\mbox{vP}[[i]]$. Does anyone versed in biorthogonal or wavelet math see what I am doing wrong in the above implementation?
I posted this answer in the mathematica.se version of this question, but I think it belongs here rather than the programming side. I'm also including an addendum with some further thoughts which I think mostly resolve the issue.
If you change the order of the indices of $S$ in the definition of $P$, to obtain
your output matrix is essentially zero, and can be brought to that form using
Chop. The projector $P$ is then equal to the identity matrix. I don't know if that is the intended behaviour, but it is what I would expect from that inner product. Specifically,$$ \begin{align} \langle\tilde v_i,\tilde u_j\rangle &=v_i^\dagger(1-P)^2u_j=v_i^\dagger(1-P)u_j \quad\quad\quad\quad \text{ as }P^2=P \\&=\langle v_i,u_j\rangle-v_i\sum_{kl}(S^{-1})_{kl}u_k\otimes v_l\cdot u_j \\&=\langle v_i,u_j\rangle-\sum_{kl}(S^{-1})_{kl}\langle v_i,u_k\rangle\langle v_l,u_j\rangle \\&=S_{ij}-\sum_{kl}S_{ik}(S^{-1})_{kl}S_{lj} \\&\equiv0. \end{align} $$
I'll keep thinking about this but that seems to be what's going on. (For instance, given Wikipedia's expression for $P$, it's easy to see that $v_iPu_j=S_{ij}$ and therefore $P=1$.) I'm not quite clear what the 'point' of the construction is. Constructing a biorthogonal set given two sets of vectors is a bit of an overdetermined problem - given the $u$s you can construct, essentially uniquely, the $v$s, and vice versa.
What I imagine this construction is unsuccessfully trying to do is to find new $u$s and $v$s which are biorthogonal and as close as possible to the old ones. In that case the $S$ in $P$ would have a square root on it - or something - and the procedure would be the biorthogonality version of symmetric orthogonalization. How does that ring?
Addendum
As regards what your construction is doing, I think I have a better idea now. The projector $$P=\sum_{ij}S^{-1}_{ij}u_i\otimes v_j$$ you construct already has the properties you expect from the desired projector $P=\sum_i \tilde u_i\otimes \tilde v_i$. Specifically, it equals the identity if the $u$s and $v$s are bases of your vector space, or the rank-$|\{i\}|$ operator whose image is the span of the $u$s and whose kernel is the orthogonal complement to the $v$s.
The only problem is that it is not in the Schmidt form that's needed for the construction. I see three ways to achieve this.
This is the procedure I described above: constructing the $v$s given the $u$s or vice-versa. The third procedure is kind of in between:
This last method reduces to the symmetric orthogonalization method when the $u$s and $v$s coincide, and as such I would expect it to be the solution of some minimization problem regarding the distances between the $u$s and $\tilde u$s and the $v$s and $\tilde v$s. However, it is on shakier ground than that algorithm: in the standard version, you know that the inner-product Gram matrix $S_{ij}=\langle v_i,v_j\rangle$ is positive definite, and there's no problem talking square roots and so on. Here, you must do more work on the existence and unicity sides.
Further, the minimization problem would be intrinsically ill-defined. The reason for this is that in your problem, the inner product structure is actually a semi-superfluous addition. The $u$s naturally live in your vector space $V$ but the $v$s live naturally in its dual; the problem of biorthogonal systems is best phrased as finding the dual basis of a set of vectors. You've been given bases of subspaces of $V$ and $V^\ast$, your biorthogonalization procedure asks for a pair of dual bases for those two subspaces.
You can introduce an inner product to provide a distance in $V$ (which can also be lifted to $V^\ast$), and this will let you phrase the minimization problem for the distance between $u$s and $\tilde u$s and that between the $v$s and $\tilde v$s. However, the two distances are very different, as their endpoints live in different spaces. (If your vectors had units, for example, the two distances would be incommensurable, I think.) This means their relative weight in a minimization problem would be up for debate, and what the relative weight is will determine whether the $u$s are closer to the $\tilde u$s or the $v$s closer to the $\tilde v$s. This would correspond to the multiple ways of achieving a condition of the form $T_1ST_2=1$.
How does this ring?