Visualizing Convolutional Neural Networks using nolearn
Creating kernel and occlusion plots for a Convolutional Neural Network
March 18, 2018
We previously talked about Convolutional Neural Networks (CNNs) and how to use them to recognize handwritten digits using Lasagne.
While we can manually extract kernel parameters to visualize weights and activation maps (as discussed in the previous post), the nolearn package offers an easy way to visualize different elements of CNNs.
nolearn is a wrapper around Lasagne (which is itself a wrapper around Theano) and offers some nice visualization options, such as plotting occlusion maps to help diagnose the performance of a CNN model. Additionally, nolearn offers a high-level API that makes model training even simpler than with Lasagne.
In this short post I’ll go through how to train a simple CNN model using nolearn for the task of recognizing handwritten digits (MNIST database). You will need nolearn, Lasagne, and Theano.
To start, we need to import a number of layer modules from Lasagne, as well as visualization modules from nolearn:
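The original import listing is not reproduced here, so the following is a minimal sketch rather than the exact code. It assumes the module layout of nolearn 0.6 (`nolearn.lasagne` and `nolearn.lasagne.visualize`). The `missing_packages` helper is a hypothetical convenience added for this example; it only checks that the three packages can be found before importing them.

```python
import importlib.util

REQUIRED = ("theano", "lasagne", "nolearn")


def missing_packages(names=REQUIRED):
    """Return the packages from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Install these first:", ", ".join(missing))
else:
    # Layer building blocks, output nonlinearity and optimizer from Lasagne
    from lasagne.layers import (InputLayer, Conv2DLayer, MaxPool2DLayer,
                                DenseLayer, DropoutLayer)
    from lasagne.nonlinearities import softmax
    from lasagne.updates import adam
    # High-level model class and plotting helpers from nolearn
    from nolearn.lasagne import NeuralNet
    from nolearn.lasagne.visualize import (plot_loss, plot_conv_weights,
                                           plot_conv_activity, plot_occlusion)
```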
As before, we need to load the MNIST data using the function supplied by the Lasagne tutorial:
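The Lasagne tutorial's loader also downloads the files; the sketch below covers only the parsing step and assumes the gzipped IDX files (for example `train-images-idx3-ubyte.gz` and `train-labels-idx1-ubyte.gz`) are already on disk. The function names are chosen for this example. Labels are cast to `int32` here, which is an assumption about what the classifier expects.

```python
import gzip

import numpy as np


def load_mnist_images(path):
    """Read a gzipped IDX image file into a float32 array of shape (N, 1, 28, 28)."""
    with gzip.open(path, "rb") as f:
        # The first 16 bytes are the header (magic number, count, rows, cols).
        data = np.frombuffer(f.read(), np.uint8, offset=16)
    # One channel per image; scale the bytes into [0, 1).
    return data.reshape(-1, 1, 28, 28).astype(np.float32) / 256


def load_mnist_labels(path):
    """Read a gzipped IDX label file into an int32 vector."""
    with gzip.open(path, "rb") as f:
        # The first 8 bytes are the header (magic number, count).
        data = np.frombuffer(f.read(), np.uint8, offset=8)
    return data.astype(np.int32)


# Usage, assuming the files sit in the working directory:
# X_train = load_mnist_images("train-images-idx3-ubyte.gz")
# y_train = load_mnist_labels("train-labels-idx1-ubyte.gz")
```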
Network architecture
Setting up the network architecture using nolearn is straightforward: we simply define the layers (and their associated parameters) using a list:
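As a sketch of what such a list can look like: each entry is a `(LayerClass, keyword arguments)` tuple. The filter counts, unit counts, and dropout rate below are illustrative assumptions, not necessarily the values used for the results in this post. The small `spatial_sizes` helper is also an addition for this example; it traces the feature-map side length through unpadded convolutions and non-overlapping pooling so we can sanity-check the architecture before building it.

```python
def build_layers(input_shape=(None, 1, 28, 28), num_classes=10):
    """Return a nolearn-style layer list of (LayerClass, kwargs) tuples."""
    # Imported inside the function so the helper can be defined
    # even on a machine where Theano/Lasagne are not installed yet.
    from lasagne.layers import (InputLayer, Conv2DLayer, MaxPool2DLayer,
                                DenseLayer, DropoutLayer)
    from lasagne.nonlinearities import softmax

    return [
        (InputLayer, {"shape": input_shape}),
        (Conv2DLayer, {"num_filters": 32, "filter_size": 5}),
        (MaxPool2DLayer, {"pool_size": 2}),
        (Conv2DLayer, {"num_filters": 32, "filter_size": 5}),
        (MaxPool2DLayer, {"pool_size": 2}),
        (DenseLayer, {"num_units": 256}),
        (DropoutLayer, {"p": 0.5}),
        (DenseLayer, {"num_units": num_classes, "nonlinearity": softmax}),
    ]


def spatial_sizes(size=28, steps=(("conv", 5), ("pool", 2),
                                  ("conv", 5), ("pool", 2))):
    """Trace the feature-map side length through the conv and pool steps."""
    sizes = [size]
    for kind, k in steps:
        # Unpadded convolution shrinks by k - 1; pooling divides by k.
        size = size - k + 1 if kind == "conv" else size // k
        sizes.append(size)
    return sizes


print(spatial_sizes())  # side lengths: 28, 24, 12, 8, 4
```

With Lasagne installed, `layers0 = build_layers()` gives the list that is passed to the model in the next step.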
Then we specify the learning-related parameters for our model, such as the number of epochs, learning rate, and regularization. We also have to pass in the layers that we previously defined:
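A minimal sketch of this step, assuming the `NeuralNet` keyword names used in nolearn 0.6 (`max_epochs`, `update`, `update_learning_rate`, `objective_l2`, `verbose`). The numeric values are illustrative placeholders, and `HYPERPARAMS`, `merge_hyperparams`, and `build_net` are names made up for this example.

```python
# Learning-related settings; the values are illustrative, not tuned.
HYPERPARAMS = {
    "max_epochs": 10,                # number of passes over the training data
    "update_learning_rate": 0.0002,  # forwarded to the update function
    "objective_l2": 0.0025,          # L2 regularization strength
    "verbose": 1,                    # print a progress table while training
}


def merge_hyperparams(**overrides):
    """Return a copy of HYPERPARAMS with any overrides applied."""
    params = dict(HYPERPARAMS)
    params.update(overrides)
    return params


def build_net(layers, **overrides):
    """Create a nolearn NeuralNet from a layer list and the settings above."""
    # Imported lazily so this module loads without Theano installed.
    from lasagne.updates import adam
    from nolearn.lasagne import NeuralNet

    params = merge_hyperparams(**overrides)
    params.setdefault("update", adam)  # Adam unless the caller picks another
    return NeuralNet(layers=layers, **params)


# Usage, given a layer list `layers0`:
# net0 = build_net(layers0)
# net0 = build_net(layers0, max_epochs=3)  # shorter run for a quick check
```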
With that, our model is ready to go! To train it we simply call the fit method with our training data:
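A sketch of the training call, assuming a model object `net0` and arrays `X_train` and `y_train` as in the rest of the post. The `prepare_inputs` helper is an addition for this example: it casts the data to `float32` images and `int32` labels, which is what we assume the network expects for classification.

```python
import numpy as np


def prepare_inputs(X, y):
    """Cast images to float32 and labels to int32, and check their shapes."""
    X = np.asarray(X, dtype=np.float32)
    y = np.asarray(y, dtype=np.int32)
    if X.ndim != 4:
        raise ValueError("expected images of shape (N, channels, height, width)")
    if len(X) != len(y):
        raise ValueError("X and y must contain the same number of samples")
    return X, y


def train(net, X_train, y_train):
    """Fit the model and return whatever fit() returns."""
    X, y = prepare_inputs(X_train, y_train)
    # nolearn follows the scikit-learn convention: fit(X, y)
    return net.fit(X, y)


# Usage:
# net0 = train(net0, X_train, y_train)
```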
Visualization options
Whereas in previous posts we had to manually save and plot our training curves, nolearn has a function to directly plot the training and validation loss values over epochs:
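A sketch of the call, assuming `plot_loss` from `nolearn.lasagne.visualize` and a trained model `net0`. The two small helpers are additions for this example. They assume the per-epoch history (`net0.train_history_`) is a list of dictionaries with `train_loss` and `valid_loss` keys, and they pull out the same curves as plain lists in case we want the numbers rather than the figure.

```python
def loss_curves(train_history):
    """Return (epochs, train_loss, valid_loss) lists from a per-epoch history."""
    epochs = list(range(1, len(train_history) + 1))
    train = [row["train_loss"] for row in train_history]
    valid = [row["valid_loss"] for row in train_history]
    return epochs, train, valid


def best_epoch(train_history):
    """Return the 1-based epoch with the lowest validation loss."""
    _, _, valid = loss_curves(train_history)
    return valid.index(min(valid)) + 1


def show_loss(net):
    """Plot training and validation loss with nolearn's built-in helper."""
    # Imported lazily so the helpers above work without nolearn installed.
    from nolearn.lasagne.visualize import plot_loss
    return plot_loss(net)


# Usage with a trained model:
# show_loss(net0)
# print(best_epoch(net0.train_history_))
```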
We can plot the learned kernels for any layer in the network:
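A sketch of the call, assuming `plot_conv_weights` from `nolearn.lasagne.visualize` and that index 1 of `net0.layers_` is the first convolutional layer (true only if the input layer comes first, as in a typical layer list). For comparison with the manual approach from the previous post, `tile_kernels` is a hypothetical helper that arranges a weight tensor of shape `(num_filters, channels, height, width)` into one 2-D grid image.

```python
import numpy as np


def tile_kernels(W, channel=0, pad=1):
    """Arrange the kernels W[:, channel] into a single square-ish grid image."""
    n, _, h, w = W.shape
    side = int(np.ceil(np.sqrt(n)))  # kernels per row and per column
    # NaN marks the padding between kernels and any unused grid cells.
    grid = np.full((side * (h + pad) - pad, side * (w + pad) - pad),
                   np.nan, dtype=np.float32)
    for i in range(n):
        row, col = divmod(i, side)
        top, left = row * (h + pad), col * (w + pad)
        grid[top:top + h, left:left + w] = W[i, channel]
    return grid


def show_kernels(net, layer_index=1, figsize=(4, 4)):
    """Plot the learned kernels of one conv layer with nolearn's helper."""
    # Imported lazily so tile_kernels works without nolearn installed.
    from nolearn.lasagne.visualize import plot_conv_weights
    return plot_conv_weights(net.layers_[layer_index], figsize=figsize)


# Usage with a trained model:
# show_kernels(net0)
# grid = tile_kernels(net0.layers_[1].W.get_value())
```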
We can also plot the activation/feature maps for a single image:
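A sketch of the call, assuming `plot_conv_activity` from `nolearn.lasagne.visualize`. That function expects a batch containing exactly one image, so the image has to keep its leading batch dimension. The `single_image_batch` helper is an addition for this example that makes the slicing explicit.

```python
import numpy as np


def single_image_batch(X, index=0):
    """Return image `index` as a batch of one, shape (1, channels, height, width)."""
    # Slicing (rather than X[index]) keeps the leading batch dimension.
    return X[index:index + 1]


def show_activity(net, X, index=0, layer_index=1):
    """Plot the feature maps one conv layer produces for a single image."""
    # Imported lazily so single_image_batch works without nolearn installed.
    from nolearn.lasagne.visualize import plot_conv_activity
    x = single_image_batch(X, index)
    return plot_conv_activity(net.layers_[layer_index], x)


# Usage with a trained model:
# show_activity(net0, X_train, index=0)
```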
One of the nicest features offered by nolearn is plotting occlusion maps. Occlusion maps are created by occluding parts of the image and seeing how that affects the predictive power of the model. Regions where occlusion sharply lowers the probability assigned to the correct class are the parts of the image the model relies on for its prediction:
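To make the idea concrete, here is a simplified NumPy sketch of an occlusion map. It is not nolearn's implementation; `occlusion_map` and its parameters are names chosen for this example. It blanks a square patch centred on each pixel and records the probability the model still assigns to the true class, so low values mark critical regions. With nolearn itself, the corresponding call is `plot_occlusion`, wrapped below in `show_occlusion`.

```python
import numpy as np


def occlusion_map(predict_proba, image, target, square=7, fill=0.0):
    """Probability of `target` when a patch centred on each pixel is blanked.

    predict_proba: callable mapping an (N, C, H, W) batch to
                   (N, num_classes) probabilities.
    image:         array of shape (C, H, W).
    """
    _, height, width = image.shape
    half = square // 2
    heat = np.zeros((height, width), dtype=np.float32)
    for i in range(height):
        # One copy of the image per column position in this row.
        batch = np.repeat(image[np.newaxis], width, axis=0)
        for j in range(width):
            batch[j, :, max(0, i - half):i + half + 1,
                  max(0, j - half):j + half + 1] = fill
        heat[i] = predict_proba(batch)[:, target]
    return heat


def show_occlusion(net, X, y, n=5):
    """Plot occlusion maps for the first n images with nolearn's helper."""
    # Imported lazily so occlusion_map works without nolearn installed.
    from nolearn.lasagne.visualize import plot_occlusion
    return plot_occlusion(net, X[:n], y[:n])


# Usage with a trained model:
# heat = occlusion_map(net0.predict_proba, X_train[0], y_train[0])
# show_occlusion(net0, X_train, y_train)
```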
There are other visualization options available, such as drawing the network architecture (draw_to_notebook(net0)) and plotting saliency maps (plot_saliency(net0, X_train[:5])). You can explore some of these other options in the nolearn tutorial. Overall, nolearn is great because it lets us visualize different aspects of the learned model without much manual work.