AND logic gate in a neural network


I'm creating a neural network in Excel, like the one in this video: https://www.youtube.com/watch?v=3993kRqejHc

$$ N = X_1 \cdot W_1 + X_2 \cdot W_2 + X_3 \cdot W_3 $$

Is it compulsory to add the logic function that appears in column I? What is that function properly called in AI? Wouldn't the neural network work without it?

There are 2 answers below.

BEST ANSWER

That function is called an Activation Function, and it is actually what makes neural networks interesting.

To give you an example, imagine you have a network with one input $x$ and one output $y$. If you ignore this function, the result is

$$ y = b + w x \tag{1} $$

where the parameters $b$ and $w$ are the numbers (weights) you need to find. The idea of the training phase is to find the values of $w$ and $b$ that fit a bunch of training examples of the form $\{x_i, y_i\}$: you know the output $y_1$ when you feed the network the input $x_1$, you know the output $y_2$ when you feed it $x_2$, and so on.

The point here is that Eq. (1) is just a straight line, so no matter how many training examples you use, the best the network can do is fit a line. You may then ask yourself: isn't the problem of fitting a line already solved by ordinary least squares? And the answer is yes, it is, no need for neural networks at all!
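To make that concrete, here is a minimal sketch of fitting $y = b + w x$ with ordinary least squares (the training pairs are made up for illustration and lie exactly on $y = 1 + 2x$):

```python
import jax.numpy as jnp

# hypothetical training pairs lying exactly on y = 1 + 2x
x = jnp.array([0.0, 1.0, 2.0, 3.0])
y = jnp.array([1.0, 3.0, 5.0, 7.0])

# ordinary least squares: solve for [b, w] in y = b + w*x
A = jnp.stack([jnp.ones_like(x), x], axis=1)
coeffs, *_ = jnp.linalg.lstsq(A, y)
print(coeffs)  # ≈ [1.0, 2.0]
```

No network needed: a closed-form solve recovers the same line the activation-free network would have to learn.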

The obvious follow-up question is then: how do we spice things up? And the answer is, by introducing non-linearities

$$ y = f(b + w x) \tag{2} $$

There are several choices for $f$. A very simple one for binary classification is the binary step (Heaviside) function

$$ f(x) = \begin{cases} 0 & x < 0 \\ 1 & {\rm otherwise}\end{cases} $$

here is another one

$$ f_{\rm sigmoid}(x) = \frac{1}{1 + e^{-x}} $$

and yet another one

$$ f_{\rm ReLU}(x) = \begin{cases} 0 & x < 0 \\ x & {\rm otherwise}\end{cases} $$

Each one has its merits; the last one (ReLU) has recently been used a lot in classifiers, but the best choice depends on the problem.
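To connect this back to the AND gate in the question: with the step activation above and hand-picked (untrained) weights $W_1 = W_2 = 1$ and bias $-1.5$, the weighted sum crosses zero only when both inputs are 1. A minimal sketch:

```python
import jax.numpy as jnp

def step(x):
    # binary step activation: 0 if x < 0, 1 otherwise
    return jnp.where(x < 0, 0.0, 1.0)

def and_gate(x1, x2):
    # hand-picked weights: x1 + x2 - 1.5 >= 0 only when x1 = x2 = 1
    return step(1.0 * x1 + 1.0 * x2 - 1.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, int(and_gate(x1, x2)))  # prints the AND truth table
```

Without the `step` call the output would be the raw sum (e.g. 0.5 for inputs 1, 0), which is not a valid gate output; the activation is what turns the linear sum into a decision.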

ANOTHER ANSWER

An "AND" gate outputs True only when both inputs are True. The neural correlate of a boolean true/false is the sigmoid function, so we can multiply the sigmoids of two values to make a continuous correlate of "and":

from jax.nn import sigmoid  # or any other sigmoid implementation

def neural_and(a, b):
    return sigmoid(a) * sigmoid(b)

If both a and b are large and positive, both sigmoids are near 1 and the product is near 1. If either one is large and negative, its sigmoid is near 0 and cancels the other, so the output is near 1 only when both a and b are "true".
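A quick numeric check of that behaviour, treating large positive inputs as "true" and large negative inputs as "false" (the ±5.0 logits are arbitrary magnitudes picked for illustration):

```python
from jax.nn import sigmoid

def neural_and(a, b):
    return sigmoid(a) * sigmoid(b)

print(float(neural_and(5.0, 5.0)))   # both "true": close to 1
print(float(neural_and(5.0, -5.0)))  # one "false": close to 0
```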

If your inputs are vector-valued, you can use a Dense / Linear layer to map each input down to a scalar, then apply the same function:

def neural_and(A, B):
    a = Dense(1)(A)
    b = Dense(1)(B)
    return sigmoid(a) * sigmoid(b)

Note: You can find Dense and sigmoid layers/functions within most neural net libraries

Finally, it's possible to build elementwise AND gates, either by making the Dense layers in the function above higher-dimensional or by dropping the Dense layers entirely:

def elementwise_and(A, B):
    return sigmoid(A) * sigmoid(B)

# assuming channels-last
def elementwise_neural_and(A, B):
    n_units = A.shape[-1]
    a = Dense(n_units)(A)
    b = Dense(n_units)(B)
    return sigmoid(a) * sigmoid(b)

For NAND, just subtract the output of the AND function above from 1. The result is near 1 unless both a and b are large and positive:

def neural_nand(a, b): 
    return 1 - sigmoid(a) * sigmoid(b)

For an OR gate, take a maximum: if either sigmoid(a) or sigmoid(b) is near 1, the output will be near 1.

import jax.numpy as jnp

def neural_or(a, b):
    return jnp.maximum(sigmoid(a), sigmoid(b))

Then the NOR(a, b) gate is 1 - OR(a, b):

def neural_nor(a, b):
    return 1.0 - jnp.maximum(sigmoid(a), sigmoid(b))

Since NAND and NOR are each functionally complete, we could expect either one to be pretty powerful in the right architecture. I'm not sure which is more efficient, but taking a maximum seems cheaper than a multiplication.
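As a sketch of that functional completeness, here is XOR built from four of the NAND gates above. A hand-rolled logit (the inverse of sigmoid) maps each gate's [0, 1] output back to the pre-activation scale the next gate expects; the ±8.0 test logits are arbitrary:

```python
import jax.numpy as jnp
from jax.nn import sigmoid

def neural_nand(a, b):
    return 1.0 - sigmoid(a) * sigmoid(b)

def logit(p):
    # inverse of sigmoid: maps a probability back to a logit
    return jnp.log(p) - jnp.log1p(-p)

def neural_xor(a, b):
    # classic four-NAND construction of XOR
    c = logit(neural_nand(a, b))
    return neural_nand(logit(neural_nand(a, c)),
                       logit(neural_nand(b, c)))

print(float(neural_xor(8.0, -8.0)))  # inputs differ: close to 1
print(float(neural_xor(8.0, 8.0)))   # inputs agree: close to 0
```

The outputs saturate toward 0/1 only when the input logits are well away from zero, which is exactly the regime a trained network would push them into.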

Hope that helps