A while ago I read the machine learning paper Deep Equilibrium Models, which proposes the following kind of model: during the forward pass, run a fixed-point iteration $z_{n+1} = f(z_n, x)$ until convergence; to update the model via gradient descent, one differentiates through the equilibrium using the implicit function theorem.
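To make the forward pass concrete, here is a minimal numerical sketch. The layer $f(z, x) = \tanh(Wz + Ux)$, the weights, and the tolerance are my own illustrative choices (with $W$ scaled small so the map is a contraction), not details from the paper:

```python
import numpy as np

def forward_deq(f, x, z0, tol=1e-8, max_iter=10_000):
    """Run the fixed-point iteration z_{n+1} = f(z_n, x) until convergence."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical layer: f(z, x) = tanh(W z + U x); W is scaled down so that
# f is a contraction in z and the iteration provably converges.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))
U = rng.standard_normal((4, 3))
f = lambda z, x: np.tanh(W @ z + U @ x)

x = rng.standard_normal(3)
z_star = forward_deq(f, x, np.zeros(4))
assert np.allclose(z_star, f(z_star, x))  # z* is (numerically) a fixed point
```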
This got me thinking: what kind of functions can be computed in this way? I suspect that, similarly to RNNs, one could perhaps show Turing completeness (Siegelmann & Sontag 1992). First we need a formalization of what we mean by "fixed-point computability". Towards this end, I came up with the following definition:
We say a computable function $f\colon X\to Y$ is fixed-point computable if and only if there exist a finite-dimensional space $Z$, an elementary computable function $g$, and constants $c_1$, $c_2$ such that for all $x ∈ X$
$$ f(x) = π_Y\big(g^∞(x, c_1, c_2)\big), $$
where $π_Y$ denotes the projection onto the output coordinate.
- $X, Y, Z$ are assumed to be finite-dimensional real vector spaces (or appropriate subsets thereof).
- Regarding elementary computability, a couple of different definitions seem possible:
- Polynomial functions, i.e. everything that can be done with a finite number of additions and multiplications
- Rational functions, i.e. everything that can be done with a finite number of additions, multiplications, or divisions.
- Piece-wise Polynomial/Rational Functions, i.e. everything that can be done with a finite number of additions, multiplications/divisions or Iverson Brackets / if-else clauses based on the elementary field relations $(<, ≤, =, ≥, >)$, so in particular including functions like $\operatorname{sign}(x)=[x≥ 0]-[x≤ 0]$ or $\max(x, y)= x⋅[x ≥ y] + y⋅[y> x]$
- Note that all of the above function classes have the property that they map rational numbers to rational numbers (given that the coefficients are rational as well)
- Assuming convergence, $g^\infty$ denotes the function $g^∞(ξ)=\lim\limits_{n→∞} \underbrace{(g∘g∘…∘ g)}_{n\;\text{times}}(ξ)$
- Note that $g^∞$ does not necessarily map rationals to rationals, even if $g$ does.
- The dimensionality of $Z$ can be interpreted as the "buffer memory" required to carry out the computation
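The remark about rationals can be illustrated concretely. The Heron/Newton step $g(z) = \frac{1}{2}(z + \frac{2}{z})$ is a rational function, so every iterate stays rational, yet $g^∞(1) = \sqrt{2}$ is irrational. This is my own illustration, not part of the definition above (using Python's exact `Fraction` arithmetic):

```python
from fractions import Fraction

def g(z):
    # Rational function: one addition, one division, one halving -> maps Q to Q.
    return (z + Fraction(2) / z) / 2

z = Fraction(1)
for _ in range(6):
    z = g(z)
    assert isinstance(z, Fraction)  # every iterate is exactly rational

# The limit g^∞(1) = sqrt(2) is irrational, even though each iterate is rational.
print(z, "≈", float(z))
```

Newton's method converges quadratically here, so six exact rational steps already agree with $\sqrt{2}$ to double precision.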
It seems that even with this first draft of a definition, we can show that many computable functions are in fact fixed-point computable, by bringing them into the required form.
Example
If a function can be represented by a Taylor series whose partial sums can be updated iteratively by elementary functions, we can bring it into "fixed-point form". Consider for instance
$$ g(x, y, t, k) = \begin{pmatrix}x, & y+t, & t⋅x⋅k, & \frac{k}{1+k}\end{pmatrix}$$
where $y$ tracks the partial sum, $t$ the current term $\frac{x^n}{n!}$, and $k$ the value $\frac{1}{n+1}$ (carrying the term as its own coordinate avoids any factorial, so each update is rational). When initialized with $y=0$, $t=1$, $k=1$, this reproduces the exponential series:
$$ g^{(n)}(x, 0, 1, 1) = \Big(x,\; ∑_{m=0}^{n-1} \frac{x^m}{m!},\; \frac{x^n}{n!},\; \frac{1}{n+1}\Big) ⟶ (x, e^x, 0, 0) $$
Thus, we have proven that the exponential function is (rational) fixed-point computable.
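As a sanity check, this scheme can be run with exact rational arithmetic. The sketch below uses a rational update that carries the current Taylor term in an extra coordinate, so every step consists only of additions, multiplications, and divisions (the stopping tolerance and iteration cap are arbitrary choices of mine):

```python
from fractions import Fraction

def g(state):
    # One step of the series: state = (x, y, t, k) with y the partial sum,
    # t = x^n/n! the current term, and k = 1/(n+1). All operations are rational.
    x, y, t, k = state
    return (x, y + t, t * x * k, k / (1 + k))

def g_inf(state, tol=Fraction(1, 10**15), max_iter=1000):
    """Iterate g until the partial sum stops changing by more than tol."""
    for _ in range(max_iter):
        new = g(state)
        if abs(new[1] - state[1]) < tol:
            return new
        state = new
    return state

x = Fraction(1)
_, y, _, _ = g_inf((x, Fraction(0), Fraction(1), Fraction(1)))
print(float(y))  # ≈ 2.718281828..., i.e. e^x for x = 1
```

Every iterate is an exact rational number; only the limit $e$ is irrational, matching the earlier remark about $g^∞$.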
Questions
- Is this a useful definition? How could it be improved? Does the concept of fixed-point computability already exist somewhere in the literature?
- Are all computable functions fixed-point computable?
