Previously, we talked about simple OR gates; now we'll continue that discussion with AND gates and, in particular, the role of bias units. We often neglect to consider the role bias plays in our models: we know we should include bias units, but why? Here, I'll walk through a short example using an AND gate to highlight why the bias unit matters.
Bias units allow us to offset the model in the same way that an intercept allows us to offset a regression line.
Imagine a simple AND gate. It will only fire if both inputs are true:
| Input 1 | Input 2 | Output |
|---------|---------|--------|
| 0       | 0       | 0      |
| 0       | 1       | 0      |
| 1       | 0       | 0      |
| 1       | 1       | 1      |
This relationship can be visualized like this:
No bias units
As you can see from the plot, the linear separator cannot both cross the origin (0,0) and correctly split the categories, so we need to add a bias unit to offset the model.
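To make this concrete, a logistic unit without a bias computes $\sigma(w_1 x_1 + w_2 x_2)$, so its decision boundary is

$$w_1 x_1 + w_2 x_2 = 0,$$

a line that must pass through the origin. Adding a bias $b$ shifts the boundary to $w_1 x_1 + w_2 x_2 + b = 0$, which can be placed between $(1, 1)$ and the other three points.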
What would happen if we omitted the bias unit? The following code creates a logistic regression model without bias (2 inputs, 1 output, sigmoid activation):
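As a minimal sketch of such a model in PyTorch (the framework, learning rate, and epoch count below are assumptions, not necessarily what the original code used):

```python
import torch
import torch.nn as nn

# Truth table for the AND gate: four inputs and their labels
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [0.], [0.], [1.]])

# Logistic regression: one linear unit followed by a sigmoid,
# deliberately built WITHOUT a bias term
model = nn.Sequential(
    nn.Linear(2, 1, bias=False),
    nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)

# Full-batch gradient descent on the four examples
for epoch in range(5000):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```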
The training curve below shows that gradient descent cannot settle on appropriate parameter values, because without a bias unit no such values exist:
As a result, the model makes incorrect predictions of 0.5 for the test data:
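Using the sketch above, this can be checked by running the trained model over the four inputs:

```python
with torch.no_grad():
    preds = model(X).squeeze()
print(preds)
# Without a bias, the input (0, 0) always maps to sigmoid(0) = 0.5,
# and the best the remaining weights can do is also hover around 0.5
```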
Adding a bias
The correct model for an AND gate must include a bias unit to offset the separator, and can be implemented as follows:
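Continuing the PyTorch sketch (again with assumed hyperparameters), the only substantive change is enabling the bias term of the linear layer:

```python
import torch
import torch.nn as nn

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [0.], [0.], [1.]])

# Same logistic regression unit, but with the bias term enabled
# (bias=True is the default for nn.Linear)
model = nn.Sequential(
    nn.Linear(2, 1, bias=True),
    nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)

for epoch in range(5000):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```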
The training curve shows that gradient descent now converges correctly:
And we get correct predicted values:
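Checking the sketch model over the four inputs again:

```python
with torch.no_grad():
    preds = model(X).squeeze()
print(preds)
# The outputs now approach [0, 0, 0, 1], matching the AND truth table
```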
Bias units are a standard component of machine learning models, and the AND gate is a simple way of highlighting why they matter.
The full code can be found in my GitHub repo here (no bias) and here (with bias).