Interpretability: Find the Essence of Filters

%pylab inline

import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)

from keras import backend as K
Populating the interactive namespace from numpy and matplotlib


Using TensorFlow backend.

Recall the general architecture of the VGG model (Notes here).

from IPython.display import Image

Image('images/vgg_long.png')

png

If we load up the Keras implementation of the weights (ignoring the top layers, as we’re not actually going to be predicting anything)

from keras.applications import VGG16

model = VGG16(weights='imagenet',
              include_top=False)

… we can see that it’s very large.

model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

This tutorial is meant to shed some light on the types of intermediate features that these hidden layers are looking at.

Methodology Overview

Broadly, we know that training a Neural Network involves:

  1. Setting the model structure
  2. Defining the loss
  3. Minimizing that loss

Which means that, crucially, the connections from layer to layer are differentiable and feed into one comprehensive loss function.

A good intermediate filter can:

  1. “Spot a particular pattern in our image”
  2. Pass some information to the next layer that confirms “yeah, I saw this pattern”

Over multiple training steps, gradient descent adjusts the filter’s weights, and with them the information passed to the next layer, to arrive at values that minimize this filter’s contribution to the loss of the whole model.

The key mechanic we’re going to use here basically up-ends this whole idea. Instead of following the gradient downhill to adjust the weights and minimize loss, we’re going to hold the weights fixed and add the gradient to the image itself, nudging pixels to maximize our “loss”, here redefined as the filter’s activation.

With enough repetition, this gives us an image that drives this filter’s activation higher and higher, which translates in English to “an image tailor-made to activate this particular filter.”
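To make the contrast concrete, here’s a toy sketch of gradient ascent on a stand-in “activation” function. The real procedure below swaps f for a filter’s mean activation and x for an entire image, but the update rule is identical:

def f(x):
    # A stand-in "activation" we want to maximize: peaks at x = 3
    return -(x - 3.0) ** 2

def grad_f(x):
    # Analytic gradient of f
    return -2.0 * (x - 3.0)

x = 0.0      # start anywhere, like the noise image we'll use later
step = 0.1   # the "learning rate" of the ascent
for _ in range(100):
    x += step * grad_f(x)  # ascend: *add* the gradient rather than subtracting it

print(round(x, 3))  # ~3.0, where the "activation" is maximized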

Visualizing One Filter of One Layer

So to get a ground-level intuition for how this all works, we’re going to zero in on one filter of one layer.

Arbitrarily chosen, we’ve got:

layer_name = 'block3_conv1'
filter_index = 27

And so we grab the layer that we’re interested in examining

layer_output = model.get_layer(layer_name).output

It’s got 256 different filters

layer_output.shape
TensorShape([Dimension(None), Dimension(None), Dimension(None), Dimension(256)])

So we’ll just look at the filter we’re interested in. We create an object called loss (keeping the training terminology, even though here it’s the objective we’ll maximize): the mean activation of this one filter.

loss = K.mean(layer_output[:, :, :, filter_index])

Note: This isn’t the value of the loss (a cut-and-dry number) but the tensor that will hold that value as the network evaluates it. We take the mean across the filter’s whole feature map so the objective is a single scalar, which makes the optimization straightforward.

type(loss)
tensorflow.python.framework.ops.Tensor
loss.shape
TensorShape([])

Similarly, we define an object, grads, that will hold the value of the loss gradient, used in our repeated ascent step.

Looking closer, K.gradients differentiates our loss (defined above) with respect to model.input, in this case an image of a consistent size. Per the usual trick, we normalize the gradient by its root-mean-square magnitude (plus a small epsilon to avoid dividing by zero), and the resulting grads object provides a per-pixel gradient value across (R, G, B).

grads = K.gradients(loss=loss, variables=model.input)[0]
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)

grads.shape
TensorShape([Dimension(None), Dimension(None), Dimension(None), Dimension(3)])

Next, in the spirit of “define the steps/placeholders”, we’ll use K.function to define our iteration step.

Essentially, this compiled function takes an image of our consistent size and returns the loss and grads values used to update our image at each step.

We’ll run it once over an all-black image to initialize loss_value and grads_value.

iterate = K.function(inputs=[model.input], outputs=[loss, grads])

loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
type(grads)
tensorflow.python.framework.ops.Tensor

Then we create an image of just noise, jittered around mid-gray.

input_img_data = np.random.random((1, 150, 150, 3)) * 20 + 128.

plt.imshow(input_img_data[0] / 255.);

png

And loop over it a thousand times, adding the gradients (scaled by step, our “learning rate”) as we go along

step = 0.1
losses = []
for i in range(1000):
    loss_value, grads_value = iterate([input_img_data])
    losses.append(loss_value)
    
    input_img_data += grads_value * step
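
Since we’ve been collecting losses along the way, a quick sanity check (not part of the original recipe) is to plot the curve and confirm the ascent is actually climbing:

plt.plot(losses)
plt.title('Mean filter activation by ascent step')
plt.xlabel('iteration');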

After many iterations, we can see the general pattern that this filter is looking for (clipping values outside of [0.0, 1.0])

plt.imshow(np.clip(input_img_data[0] / 255., 0., 1.));

png
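
As an aside, naive clipping can wash out contrast. Chollet’s book post-processes with a deprocess_image utility along roughly these lines (a sketch of the general recipe, not necessarily what’s in helpers):

def deprocess_image(x):
    # Center on 0 with a small std so most values land near mid-gray after shifting
    x = x - x.mean()
    x = x / (x.std() + 1e-5)
    x = x * 0.1
    # Shift into [0, 1] and clip the stragglers
    x = x + 0.5
    x = np.clip(x, 0, 1)
    # Scale up to displayable 8-bit RGB
    return (x * 255).astype('uint8')

plt.imshow(deprocess_image(input_img_data[0]));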

Visualizing Many Filters of Many Layers

Packaging the above code into a helper function called generate_pattern, we can easily plot out a filter in a specific layer

import helpers
plt.imshow(helpers.generate_pattern(model, 'block3_conv1', filter_index=0));

png
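
For reference, generate_pattern presumably just packages the loss/gradient/ascent steps from the previous section, something like this minimal sketch (the actual helpers module may differ in its defaults):

def generate_pattern(model, layer_name, filter_index, size=150, steps=1000, step=0.1):
    # The scalar objective: mean activation of one filter in the chosen layer
    layer_output = model.get_layer(layer_name).output
    loss = K.mean(layer_output[:, :, :, filter_index])

    # Normalized gradient of that objective with respect to the input image
    grads = K.gradients(loss, model.input)[0]
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
    iterate = K.function([model.input], [loss, grads])

    # Gradient-ascend from gray noise, exactly as before
    img = np.random.random((1, size, size, 3)) * 20 + 128.
    for _ in range(steps):
        _, grads_value = iterate([img])
        img += grads_value * step
    return np.clip(img[0] / 255., 0., 1.)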

Then borrowing once more from Chollet, we can do some clever loop magic™ and peek at several filters in each layer

helpers.generate_layer(model, 'block1_conv1')

png
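
A generate_layer helper likely just tiles generate_pattern over a layer’s first several filters. A minimal sketch, reusing the hypothetical generate_pattern above:

def generate_layer(model, layer_name, n_filters=16, cols=4):
    # One subplot per filter, arranged in a grid
    rows = int(np.ceil(n_filters / cols))
    fig, axes = plt.subplots(rows, cols, figsize=(cols * 3, rows * 3))
    for i, ax in enumerate(axes.flat):
        if i < n_filters:
            ax.imshow(generate_pattern(model, layer_name, filter_index=i))
        ax.axis('off')
    fig.suptitle(layer_name)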

Look at how much more intricate the activations are in the later blocks of the Network.

helpers.generate_layer(model, 'block4_conv3')

png

This is immediately consistent with our recurring notion of “increasing feature complexity, the deeper in the Network we go.”