Automatic differentiation paper: need help understanding syntax.

32 Views Asked by At

In this paper, the author says that differentiation is a "metafunction" that maps one function (e.g., $x^2$) to another ($2x$). On page two, he says

For now, give differentiation the following higher order type:

$d:: (a \rightarrow a) \rightarrow (a \rightarrow a)$

So far, it makes sense to me. Then he says the sum rule (i.e., for derivatives) is short hand for:

$d(\lambda x \rightarrow u \ x + v\ x) \equiv \lambda x \rightarrow d \ u \ x \ + \ d \ v \ x $

Where did $\lambda$ come from? What is the syntax of this thing trying to express?

The author continues by saying

These more precise formulations are tedious...Fortunately, there is an alternative...The trick is to add numeric overloadings for functions, so that numeric operations apply point-wise. For instance,

$u + v \equiv \lambda x \rightarrow u \ x + \ v \ x$

At this point, I no longer understand the expression at all. I would appreciate it if someone can explain the syntax and what the equivalence means.

2

There are 2 best solutions below

0
On

It looks to me like the author is using a rather abstract language to talk about "functions" in an abstract way rather than "values of functions".

Many algebra and Calculus text make the distinction between "f", a function, and "f(x)", the numeric value when the function is applied to "x".

When he writes $d(\lambda x\rightarrow ux+ vx)$ he is saying that the derivative is being applied to the function $\lambda$ which, itself applied to variable x, is ux+ vx. The last one, u+v≡λx→ ux+ vx, is again saying that "u+ v" is shorthand for the function, called $\lambda$, that takes the variable x to ux+ vx, in other words "u+v" is the function f(x)= (u+v)x.

0
On

"Where did $\lambda$ come from ?" From the so-called $\lambda$-calculus which is a whole part of logic elaborated in the 1940s by Alonso Church, subsequently implemented as a (famous) programming language, LISP by Mc Carthy at MIT 20 years or so later. A gentle introduction can be found here.

I am very surprised that the author of the paper mentions "manifold-calculus" (refering to the great name of Spivak) without mentionning anywhere that in fact he uses $\lambda$-calculus. This author has in fact rediscovered things that exist for a long time. Not a good point. See the Wikipedia article Automatic Differentiation about this whole and well developed subject.