Can MLPs represent functions exactly for finite inputs?


The universal approximation theorem states that, given appropriate depth/width, an MLP can approximate any continuous function on a compact set to arbitrary precision $ \epsilon > 0 $. For a discrete function $f: \mathcal{X} \rightarrow \mathcal{Y}$ between finite sets (i.e., there exists $n \in \mathbb{N}$ with $n > |\mathcal{X}|$ and $n > |\mathcal{Y}|$), is there a parameterization of an MLP such that it computes $f$ exactly?

I don't have much experience with the universal approximation theorem, but I remember that its proof involves constructing indicator functions over finite regions, so my intuition says that this is indeed possible, though I don't yet have the requisite background to attempt such a proof.
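The indicator-style construction alluded to above can be sketched in one dimension: two ReLU units make a ramp that rises from 0 to 1, and subtracting a shifted copy of that ramp yields an approximate indicator of an interval. This is an illustrative sketch (the function name `soft_indicator` and the steepness parameter `k` are my own, not from any standard proof):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def soft_indicator(x, a, b, k=1000.0):
    """Approximate 1_{[a, b]}(x) with four ReLU units.

    relu(k*(x - a)) - relu(k*(x - a) - 1) ramps from 0 to 1 over a
    window of width 1/k near x = a; subtracting the same ramp shifted
    to b cuts the output back to 0 near x = b. Larger k means a
    sharper (more accurate) indicator.
    """
    rise = relu(k * (x - a)) - relu(k * (x - a) - 1.0)
    fall = relu(k * (x - b)) - relu(k * (x - b) - 1.0)
    return rise - fall

x = np.linspace(-1.0, 2.0, 7)
print(soft_indicator(x, 0.0, 1.0))
```

Summing such bumps, one per input region, is the intuition behind approximating a function on a finite or discretized domain.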


BEST ANSWER

Indeed, for a finite set of inputs $x_1, \dots, x_n \in \mathbb{R}^d$, we can represent a desired function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ exactly on those inputs. To be precise, for any targets $y_1, \dots, y_n$, there exists a neural network $F$ with one hidden ReLU layer and $2n+d$ parameters that satisfies $F(x_i) = y_i$ for all $i$.

To see this, write $F(x) = \sum_{j=1}^n a_j \max(0, w^T x + b_j)$ for the neural network. As long as all $n$ values $w^T x_i$ are distinct (which is the case for almost any $w$), we can always find suitable $b_j$ such that the matrix $$\Phi = \left[ \max(0, w^T x_i + b_j) \right]_{i,j=1}^n$$ is triangular with non-vanishing diagonal: first reorder the indices $i$ so that the sequence $w^T x_i$ is increasing, then choose each $b_j$ slightly larger than $-w^T x_j$, so that $\max(0, w^T x_i + b_j) > 0$ exactly when $i \ge j$.

Now, just let $a = \Phi^{-1} y$, where $y = (y_1, \dots, y_n)$ is reordered in the same way as above, and observe that $F(x_i) = y_i = f(x_i)$ for all $i$.
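As a concrete check, the construction can be coded directly in NumPy. This is a sketch on randomly generated data; the variable names and the choice of the bias offset `eps` (half the smallest gap between consecutive projections, which guarantees the triangular pattern) are my own assumptions, not part of the answer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data (not from the answer): n points in R^d with
# arbitrary real targets y_1, ..., y_n.
n, d = 5, 3
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# A random direction w gives distinct projections w^T x_i almost surely.
w = rng.normal(size=d)
p = X @ w

# Reorder the points so the projections are increasing.
order = np.argsort(p)
p, y, X = p[order], y[order], X[order]

# Choose b_j slightly above -p_j, so max(0, p_i + b_j) > 0 iff i >= j.
# eps must be smaller than the smallest gap between consecutive p_i.
eps = 0.5 * np.min(np.diff(p))
b = -p + eps

# Phi is lower triangular with diagonal eps > 0, hence invertible.
Phi = np.maximum(0.0, p[:, None] + b[None, :])
a = np.linalg.solve(Phi, y)

def F(x):
    """One-hidden-layer ReLU net F(x) = sum_j a_j * max(0, w^T x + b_j)."""
    pre = np.atleast_1d(x @ w)
    return np.maximum(0.0, pre[:, None] + b[None, :]) @ a

# F interpolates the targets exactly (up to floating-point round-off).
assert np.allclose(F(X), y)
```

The parameter count matches the claim: $n$ output weights $a_j$, $n$ biases $b_j$, and $d$ entries of the shared direction $w$, for $2n + d$ in total.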

ANOTHER ANSWER

It depends on $f$ and on the depth/width of the MLP. It should be possible for an MLP with width $\Omega(n)$ and depth at least 2. I don't think there is much more of interest you can say.