I'm attempting to implement a computer algebra function using the combinatoric version of Faà di Bruno's formula presented by Michael Hardy in Combinatorics of Partial Derivatives that "collapses" partitions to account for multiple variables. The paper is mostly very well-written and intelligible (its examples are used in the Wikipedia article) but there's one thing I'm unclear about.
To give an example:
I do the following:
- Compute the integer partition of the order represented as nested sequences
- Take the unmixed partial of f at the order corresponding the number of blocks in each partition
- Compose it with g
- Multiply the composition with the mixed partials of f corresponding to the blocks in the partition
- Sum the functions corresponding to each partition
I'm currently distributing the multiplication at each order/partition rather than collapsing partitions and multiplying by a scalar, so I'm duplicating some work but am just trying to get it correct right now).
I think the problem is that I'm misunderstanding the composition at each order, i.e. that f'''(y) in Hardy's example is not, if fact, the unmixed second-order partial of f composed with g. I just can't think of anything else that could be meant by notation like f''(y)(dy/x1 * dy^2/dx2xdx3).
Any clarification would be greatly appreciated.

Perhaps if we look at the earlier terms it may help. [If it does not address your difficulty then I will happily delete it.]
We are looking at a composition $f(y(x_1,\dots,x_n))$. I think that this may be what is confusing you. I don't have any reference for a formula when $f$ is multivariate.
Then $$ \frac{\partial}{\partial x_1}f(y(x_1,\dots,x_n)) = f'(y(x_1,\dots,x_n))\cdot\frac{\partial}{\partial x_1}y(x_1,\dots,x_n) $$ by the Chain Rule.
Next $$ \frac{\partial^2}{\partial x_2\partial x_1}f(y(x_1,\dots,x_n)) = \frac{\partial}{\partial x_2}\left(f'(y(x_1,\dots,x_n))\cdot\frac{\partial}{\partial x_1}y(x_1,\dots,x_n)\right) $$ and to this we apply the Product Rule and so $$ \frac{\partial^2}{\partial x_2\partial x_1}f(y(x_1,\dots,x_n)) = \frac{\partial}{\partial x_2} \left(f'(y(x_1,\dots,x_n))\right)\cdot\frac{\partial}{\partial x_1}y(x_1,\dots,x_n) +\\ f'(y(x_1,\dots,x_n))\cdot\frac{\partial}{\partial x_2}\left(\frac{\partial}{\partial x_1}y(x_1,\dots,x_n)\right) $$ Now apply to the first term the Chain Rule argument (we have $f'$ instead of $f$ and $x_2$ instead of $x_1$): we get $$ \frac{\partial^2}{\partial x_2\partial x_1}f(y(x_1,\dots,x_n)) = f''(y(x_1,\dots,x_n))\cdot\frac{\partial}{\partial x_2}y(x_1,\dots,x_n)\frac{\partial}{\partial x_1}y(x_1,\dots,x_n) +\\ f'(y(x_1,\dots,x_n))\cdot\frac{\partial^2}{\partial x_2\partial x_1}y(x_1,\dots,x_n) $$
Now differentiate with respect to $x_3$. At each stage you need to use the Product Rule, and then on the $f^{(n)}((y(x_1,\dots,x_n))$ you must use the Chain Rule. Faà di Bruno's formula gives a combinatorial explanation of which products of partial derivatives of $y$ occur in the multiplier of $f^{(k)}(y(x_1,\dots,x_n)$.