Introduction to neural networks and how they can be used to model XOR gates
February 13, 2017
We have previously discussed OR logic gates and the importance of bias units in AND gates. Here, we will introduce the XOR gate and show why logistic regression can’t model the non-linearity required for this particular problem.
As always, the full code for these examples can be found in my GitHub repository here.
An XOR gate outputs True if exactly one of its inputs is True, but not if both are. It acts like a stricter version of the OR gate:
Input 1   Input 2   Output
0         0         0
0         1         1
1         0         1
1         1         0
If we visualize the data space we’ll have a clearer sense of what causes the issue. As you can see, there is no linear separator that can effectively split the categories:
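To make this concrete, here is a small sketch in plain Python (not the Theano code used in this series) that searches a coarse grid of weights and biases for a line w1*x1 + w2*x2 + b = 0 that puts the two classes on opposite sides. The name find_linear_separator and the grid of candidate values are illustrative choices of mine, and a grid search is only a demonstration, not a proof.

```python
from itertools import product

# Truth tables: inputs (x1, x2) mapped to the gate's output.
OR_GATE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR_GATE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def find_linear_separator(gate, grid):
    """Return (w1, w2, b) such that (w1*x1 + w2*x2 + b > 0) matches the gate
    on all four inputs, or None if no candidate on the grid works."""
    for w1, w2, b in product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(out)
               for (x1, x2), out in gate.items()):
            return (w1, w2, b)
    return None


# Candidate values from -2.0 to 2.0 in steps of 0.5 (an arbitrary choice).
grid = [i / 2 for i in range(-4, 5)]

or_separator = find_linear_separator(OR_GATE, grid)
xor_separator = find_linear_separator(XOR_GATE, grid)
print("OR :", or_separator)   # some (w1, w2, b) triple
print("XOR:", xor_separator)  # None
```

The XOR result is not an artifact of the grid. A separator would need b <= 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b <= 0 all at once. Adding the middle two gives w1 + w2 + 2b > 0, so w1 + w2 + b > -b >= 0, which contradicts the last condition.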
Logistic Regression
When we try to model this using logistic regression, as we did for the previous gates, we run into a problem:
The model cannot learn weights that classify all four inputs correctly, because the two classes are not linearly separable. We can see this by looking at the training curve:
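The failure can be reproduced with a minimal sketch in plain Python, standing in for the Theano version. It assumes a mean cross-entropy cost and plain gradient descent; the function train_logistic and its default settings are hypothetical and chosen only for illustration.

```python
import math
import random

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 0]  # XOR labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(learning_rate=0.5, iterations=5000, seed=0):
    """Fit a single sigmoid unit to XOR with full-batch gradient descent."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = rng.uniform(-1, 1)
    costs = []
    for _ in range(iterations):
        preds = [sigmoid(w[0] * x1 + w[1] * x2 + b) for x1, x2 in X]
        # Mean cross-entropy between predictions and labels.
        costs.append(-sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                          for p, y in zip(preds, Y)) / len(X))
        # Gradient of the cost w.r.t. each pre-sigmoid sum is (prediction - label).
        errors = [p - y for p, y in zip(preds, Y)]
        w[0] -= learning_rate * sum(e * x[0] for e, x in zip(errors, X)) / len(X)
        w[1] -= learning_rate * sum(e * x[1] for e, x in zip(errors, X)) / len(X)
        b -= learning_rate * sum(errors) / len(X)
    predictions = [sigmoid(w[0] * x1 + w[1] * x2 + b) for x1, x2 in X]
    return predictions, costs


predictions, costs = train_logistic()
print([round(p, 3) for p in predictions])  # every value ends up close to 0.5
print(costs[-1], math.log(2))              # the cost settles near ln(2)
```

Under these assumptions the best the model can do is predict roughly 0.5 for every input, which corresponds to a cost of ln(2), about 0.693. The curve flattens there instead of approaching zero.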
Introducing neural networks
One way to solve this problem is by adding non-linearity to the model with a hidden layer, thus turning this into a neural network model.
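To see why a hidden layer is enough, consider this sketch of a network with two hidden units whose weights are picked by hand rather than learned. One hidden unit behaves like the OR gate and the other like the AND gate from the earlier posts, and the output unit fires when OR is on and AND is off. The specific values (20, -10, -30) are my own illustrative choices, not weights from the trained model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def xor_network(x1, x2):
    """Forward pass through a 2-2-1 network with hand-picked weights."""
    h_or = sigmoid(20 * x1 + 20 * x2 - 10)   # close to 1 if at least one input is 1
    h_and = sigmoid(20 * x1 + 20 * x2 - 30)  # close to 1 only if both inputs are 1
    # The output unit is close to 1 when OR is on and AND is off.
    return sigmoid(20 * h_or - 20 * h_and - 10)


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xor_network(x1, x2)))
# prints:
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 0
```

Each hidden unit draws its own line through the input space, and the output unit combines the two, which is something a single sigmoid unit cannot do.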
As always, we begin with imports and defining the data and corresponding labels:
Next, we define some training parameters. hidden_layer_nodes is used to control the number of units/neurons in the hidden layer. This will be useful when we set up the network architecture later:
Then, we define our tensors and weight arrays. Note the inclusion of b2, which is the bias for our hidden layer neurons, and of a second array of weights (w2_array) for the connections from the hidden layer to the output layer.
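As a rough picture of the shapes involved, here is a sketch using plain Python lists instead of Theano shared variables. The parameter values, the uniform initialisation range, the per-unit layout of b1 and the helper init_parameters are all assumptions made for illustration. In this sketch b2 is treated as the bias that accompanies w2 on the way to the output unit.

```python
import random

# Hypothetical training parameters; the values are illustrative only.
learning_rate = 0.1
training_iterations = 10000
hidden_layer_nodes = 3


def init_parameters(n_inputs, n_hidden, seed=0):
    """Create weights and biases for one hidden layer and a single output unit."""
    rng = random.Random(seed)
    return {
        # w1[i][j] is the weight from input i to hidden unit j.
        "w1": [[rng.uniform(-1, 1) for _ in range(n_hidden)]
               for _ in range(n_inputs)],
        # One bias per hidden unit (assumed layout).
        "b1": [0.0] * n_hidden,
        # w2[j] is the weight from hidden unit j to the output unit.
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        # Bias added together with w2 when computing the output.
        "b2": 0.0,
    }


params = init_parameters(n_inputs=2, n_hidden=hidden_layer_nodes)
print(len(params["w1"]), "x", len(params["w1"][0]))  # 2 x 3
print(len(params["w2"]))                              # 3
```

Changing hidden_layer_nodes changes the width of w1 and the length of w2 together, which is why it is convenient to keep it as a single parameter.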
We need some additional expressions to evaluate the output values of both the input (a1) and hidden (a2) layers:
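In plain Python, these two expressions could look like the sketch below: the first sigmoid is applied to the weighted inputs, and the second to the weighted result of the first. The function forward, its return names and the example parameter values are illustrative assumptions, not the post's Theano expressions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Return (hidden_activations, output_activation) for one input row x."""
    hidden = [
        sigmoid(sum(x[i] * w1[i][j] for i in range(len(x))) + b1[j])
        for j in range(len(b1))
    ]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w2)) + b2)
    return hidden, output


# Illustrative parameters for 2 inputs and 3 hidden units (not learned values).
w1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [1.0, 1.0, 1.0]
b2 = -1.5

hidden, output = forward((0, 1), w1, b1, w2, b2)
print(hidden)  # each hidden unit sees a weighted sum of 0, so each is 0.5
print(output)  # 3 * 0.5 - 1.5 = 0, so the output is also 0.5
```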
Remember to add update rules for the new weights (w2) and biases (b2) as well:
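For intuition, here is what one such update amounts to for w2 and b2, written out by hand in plain Python. This sketch assumes a mean cross-entropy cost, holds the hidden activations fixed, and uses made-up hidden values. The helper names are hypothetical, and b2 is again taken to be the bias added alongside w2.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(hidden_rows, labels, w2, b2):
    """Mean cross-entropy of the output unit over all rows."""
    total = 0.0
    for hidden, y in zip(hidden_rows, labels):
        p = sigmoid(sum(h * w for h, w in zip(hidden, w2)) + b2)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def update_output_layer(hidden_rows, labels, w2, b2, learning_rate):
    """One gradient-descent step on w2 and b2 with hidden activations fixed."""
    n = len(labels)
    grad_w2 = [0.0] * len(w2)
    grad_b2 = 0.0
    for hidden, y in zip(hidden_rows, labels):
        p = sigmoid(sum(h * w for h, w in zip(hidden, w2)) + b2)
        error = p - y  # derivative of the cost w.r.t. the pre-sigmoid sum
        for j, h in enumerate(hidden):
            grad_w2[j] += error * h / n
        grad_b2 += error / n
    new_w2 = [w - learning_rate * g for w, g in zip(w2, grad_w2)]
    new_b2 = b2 - learning_rate * grad_b2
    return new_w2, new_b2


# Made-up hidden activations for the four XOR inputs (OR-like and AND-like units).
hidden_rows = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
labels = [0, 1, 1, 0]
w2, b2 = [0.0, 0.0], 0.0

cost_before = cross_entropy(hidden_rows, labels, w2, b2)
w2, b2 = update_output_layer(hidden_rows, labels, w2, b2, learning_rate=0.5)
cost_after = cross_entropy(hidden_rows, labels, w2, b2)
print(w2, b2)                   # [0.0625, -0.0625] 0.0
print(cost_before, cost_after)  # the cost drops after the step
```

The updates for w1 and b1 follow the same pattern, with the error passed back through w2 and the hidden sigmoid by the chain rule.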
We finish off with some Theano functions and the actual training process:
After training this neural network, we can see that the cost decreases over the training iterations and that the network outputs the correct predictions for the XOR gate:
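For readers without Theano at hand, the whole procedure can be approximated with the plain Python sketch below. It assumes a mean cross-entropy cost, full-batch gradient descent and hand-written backpropagation. The function train and its default settings (four hidden units, learning rate 0.5, 10000 iterations) are arbitrary illustrative choices and not the values used for the results above.

```python
import math
import random

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 0]  # XOR labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    hidden = [sigmoid(x[0] * w1[0][j] + x[1] * w1[1][j] + b1[j])
              for j in range(len(b1))]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w2)) + b2)
    return hidden, output


def mean_cost(w1, b1, w2, b2):
    total = 0.0
    for x, y in zip(X, Y):
        _, p = forward(x, w1, b1, w2, b2)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(X)


def train(hidden_layer_nodes=4, learning_rate=0.5, iterations=10000, seed=0):
    rng = random.Random(seed)
    n = hidden_layer_nodes
    w1 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(2)]
    b1 = [0.0] * n
    w2 = [rng.uniform(-1, 1) for _ in range(n)]
    b2 = 0.0
    costs = []
    for _ in range(iterations):
        costs.append(mean_cost(w1, b1, w2, b2))
        # Accumulate gradients over the four examples before updating anything.
        g_w1 = [[0.0] * n for _ in range(2)]
        g_b1, g_w2, g_b2 = [0.0] * n, [0.0] * n, 0.0
        for x, y in zip(X, Y):
            hidden, p = forward(x, w1, b1, w2, b2)
            d_out = (p - y) / len(X)  # cost gradient at the output's pre-sigmoid sum
            g_b2 += d_out
            for j, h in enumerate(hidden):
                g_w2[j] += d_out * h
                d_hidden = d_out * w2[j] * h * (1 - h)  # chain rule through hidden unit j
                g_b1[j] += d_hidden
                g_w1[0][j] += d_hidden * x[0]
                g_w1[1][j] += d_hidden * x[1]
        b2 -= learning_rate * g_b2
        for j in range(n):
            w2[j] -= learning_rate * g_w2[j]
            b1[j] -= learning_rate * g_b1[j]
            w1[0][j] -= learning_rate * g_w1[0][j]
            w1[1][j] -= learning_rate * g_w1[1][j]
    costs.append(mean_cost(w1, b1, w2, b2))
    predictions = [forward(x, w1, b1, w2, b2)[1] for x in X]
    return costs, predictions


costs, predictions = train()
print(costs[0], costs[-1])
print([round(p, 3) for p in predictions])
```

Whether the rounded predictions match the XOR table depends on the random initialisation and the settings chosen, so it is worth trying a few seeds and hidden layer sizes if the cost stalls.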