Calculating and estimating the size of a minimal normal form


Each Boolean function $f:\{0,1\}^n \rightarrow \{0,1\}$ of $n$ variables has a minimal normal form. The number of terms of the minimal normal form is the minimal number of hidden neurons in a feed-forward neural network that computes the function. So one has to apply something like the Karnaugh map to find the minimal size of the hidden layer for a Boolean function, which may be given by its truth table. The truth table has $2^n$ entries, which is also the size of the hidden layer of a canonical neural network computing the function.
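That canonical construction can be sketched concretely (a toy illustration with hypothetical names, not part of the question): one threshold neuron per truth-table row, with the output neuron ORing the hidden neurons of the rows mapped to 1.

```python
from itertools import product

def canonical_network(truth_table):
    """One hidden threshold neuron per truth-table row; the output
    neuron ORs the hidden neurons of the rows mapped to 1."""
    n = len(next(iter(truth_table)))
    patterns = list(product((0, 1), repeat=n))

    def forward(x):
        hidden = []
        for p in patterns:
            # Weights: +1 where p_i = 1, -1 where p_i = 0; threshold = popcount(p).
            # The weighted sum reaches the threshold exactly when x == p.
            s = sum((1 if pi else -1) * xi for pi, xi in zip(p, x))
            hidden.append(1 if s >= sum(p) else 0)
        # Output neuron: OR over the hidden neurons of the "true" rows.
        return 1 if sum(h for p, h in zip(patterns, hidden)
                        if truth_table[p]) >= 1 else 0

    return forward

# XOR on two variables, given by its truth table:
xor = canonical_network({(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0})
```

This always uses $2^n$ hidden neurons, regardless of how few terms the minimal normal form of the function actually has.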

The Karnaugh map immediately gives you not only the size of the minimal normal form but the minimal normal form itself. But what if you are only interested in the size (because you want to train the network)? How can one calculate the size of the minimal normal form, possibly more quickly than by applying the Karnaugh map?
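For small $n$ the exact size can be computed by brute force in the Quine-McCluskey style: enumerate the prime implicants, then search for a smallest subset covering all true rows. A minimal sketch (hypothetical names; it illustrates the quantity being asked about, not a fast method):

```python
from itertools import product, combinations

def minimal_dnf_size(truth_table):
    """Exact number of terms in a minimal DNF, by brute force:
    enumerate prime implicants, then find a smallest cover.
    Only feasible for small n."""
    n = len(next(iter(truth_table)))
    onset = {x for x, v in truth_table.items() if v}
    if not onset:
        return 0

    def covers(cube):
        # Inputs matched by a cube over {0, 1, None ('don't care')}.
        return {x for x in product((0, 1), repeat=n)
                if all(c is None or c == xi for c, xi in zip(cube, x))}

    # Implicants: cubes whose covered inputs all lie in the onset.
    implicants = [c for c in product((0, 1, None), repeat=n)
                  if covers(c) <= onset]
    # Prime implicants: implicants not strictly contained in another implicant.
    primes = [c for c in implicants
              if not any(c != d and covers(c) < covers(d) for d in implicants)]
    # Smallest cover of the onset by prime implicants (brute force).
    for k in range(1, len(primes) + 1):
        for subset in combinations(primes, k):
            if set().union(*(covers(c) for c in subset)) >= onset:
                return k

xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

For XOR this returns 2, matching the two terms of $\bar{x}_1 x_2 \vee x_1 \bar{x}_2$; the set-cover step is exactly where the hardness discussed below comes in.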

Furthermore: Are there any heuristics for estimating the size of the minimal normal form of a truth table even more quickly (with a given error distribution)?


I can offer some results / sources:

  1. The problem you pose is hard. In fact, it was shown in 2008 (see here) that the unrestricted minimization problem is $\Sigma_{2}^{P}$-complete.

  2. Karnaugh maps and the equivalent Quine-McCluskey algorithm need the prime implicants, and their number has been shown (see here) to be bounded by $3^n/n$, which is even higher than the number of minterms ($2^n$).

  3. Because of (2.), the standard heuristic nowadays is the Espresso algorithm and its variants, which for larger $n$ (as you will most likely be interested in for neural networks) have completely replaced Karnaugh maps and the Quine-McCluskey algorithm.

  4. A prominent article on the design question is "How many hidden layers and nodes?" by D. Stathakis, International Journal of Remote Sensing, 2009, [pdf]. It contains a discussion based on Kolmogorov complexity, resulting in a typical hidden-layer size of $2n+1$ neurons, which is fine if the correct activation function is chosen.

  5. An extremely well-cited, though older, article is "What Size Net Gives Valid Generalization?" by Eric B. Baum and David Haussler, Neural Computation, 1989, [pdf]. The authors argue in a probabilistic fashion, relying on the Vapnik-Chervonenkis dimension. The main finding is that in a network with $N$ nodes and $W$ weights, and with some error $e$, you can expect the full Boolean mapping (all conjunctive normal forms, CNFs) to be correctly assigned if $m = {\cal O}\left(\frac{W}{e} \log \frac{N}{e}\right)$ randomly picked samples (CNFs) are correctly mapped. This holds for any distribution of the samples.

  6. A very recent overview is given in "Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity" by A. Daniely, R. Frostig, and Y. Singer, 2016 [pdf]. It draws heavily on kernel techniques.
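Point (2.) can be made concrete for small $n$: a brute-force pass over all $3^n$ cubes (each coordinate is 0, 1, or "don't care") yields the prime implicants, whose count can then be compared with the $3^n/n$ bound. A toy sketch with hypothetical names, feasible only for tiny $n$:

```python
from itertools import product

def prime_implicants(truth_table):
    """All prime implicants of a Boolean function, by brute force
    over the 3^n cubes; each cube coordinate is 0, 1, or None."""
    n = len(next(iter(truth_table)))
    onset = {x for x, v in truth_table.items() if v}

    def covers(cube):
        return {x for x in product((0, 1), repeat=n)
                if all(c is None or c == xi for c, xi in zip(cube, x))}

    implicants = [c for c in product((0, 1, None), repeat=n)
                  if covers(c) <= onset]
    # Prime: not strictly contained in any other implicant.
    return [c for c in implicants
            if not any(c != d and covers(c) < covers(d) for d in implicants)]

# 3-variable majority function: true iff at least two inputs are 1.
maj3 = {x: int(sum(x) >= 2) for x in product((0, 1), repeat=3)}
```

The majority function on 3 variables has exactly 3 prime implicants ($x_1 x_2$, $x_1 x_3$, $x_2 x_3$), comfortably below the bound $3^3/3 = 9$; the point of the bound is that functions exist whose prime implicant count grows like $3^n/n$.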
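The idea behind Espresso in (3.) can be caricatured in a few lines: rather than enumerating all prime implicants first, grow each uncovered minterm into a large implicant by greedily dropping literals, and collect the resulting cubes into a cover. This is a toy "expand and cover" sketch in that spirit (hypothetical names, not the real algorithm, and its result need not be minimal):

```python
from itertools import product

def expand_cover(truth_table):
    """Toy expand-and-cover heuristic: greedily widen each uncovered
    minterm into an implicant, then add it to the cover. Avoids
    enumerating all prime implicants up front."""
    n = len(next(iter(truth_table)))
    onset = {x for x, v in truth_table.items() if v}

    def covers(cube):
        return {x for x in product((0, 1), repeat=n)
                if all(c is None or c == xi for c, xi in zip(cube, x))}

    cover, covered = [], set()
    for m in sorted(onset):
        if m in covered:
            continue
        cube = list(m)
        for i in range(n):  # greedily drop literals, left to right
            trial = cube[:i] + [None] + cube[i + 1:]
            if covers(tuple(trial)) <= onset:
                cube = trial
        cover.append(tuple(cube))
        covered |= covers(tuple(cube))
    return cover

xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

The greedy literal-dropping order can matter, which is why the real Espresso iterates expand, reduce, and irredundant-cover steps; for serious use, an existing implementation such as the one in the PyEDA package is the practical choice.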
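The Baum-Haussler bound in (5.) is easy to evaluate for a given architecture; the sketch below uses a placeholder constant and base-2 logarithm for the ${\cal O}(\cdot)$, which the paper leaves unspecified, so the absolute numbers are illustrative only.

```python
import math

def baum_haussler_samples(num_weights, num_nodes, error, constant=1.0):
    """Illustrative evaluation of m = O((W/e) * log(N/e)).
    `constant` and the log base are placeholder assumptions,
    not values from the paper."""
    return constant * (num_weights / error) * math.log2(num_nodes / error)

# More weights, or a smaller tolerated error, both increase the
# required number of training samples:
m1 = baum_haussler_samples(num_weights=100, num_nodes=20, error=0.1)
m2 = baum_haussler_samples(num_weights=100, num_nodes=20, error=0.05)
```

The qualitative behavior is what matters: the sample count scales roughly linearly in the number of weights and worse than linearly in $1/e$.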