CNN Organization

In the Deep Learning notes space, we’ve got a notebook outlining a potential architecture for a network that classifies digits from the MNIST dataset. Here, we’ll walk through how to match that implementation in TensorFlow using Keras.

Overview

Our workflow will basically look like the following:

  • Generate our data
  • Build the model layer by layer
  • Compile, fit, and evaluate

Data

This dataset has a much smaller resolution (8x8 images instead of MNIST’s 28x28), but the exercise is the same.

from sklearn.datasets import load_digits

data = load_digits()

X = data['images']
y = data['target']

print(X.shape, y.shape)
(1797, 8, 8) (1797,)

Because we’re going to end in a softmax layer, we want y as one-hot vectors over the 10 distinct classes, not just the digit values themselves.

from sklearn.preprocessing import OneHotEncoder

enc = OneHotEncoder()
sparse = enc.fit_transform(y.reshape(-1, 1))

y = sparse.toarray()  # plain ndarray; todense() would return an np.matrix
print(y.shape)
(1797, 10)
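
For intuition, the same encoding can be sketched by hand with NumPy. This is only an illustration, not what `OneHotEncoder` does internally; the `labels` array and `n_classes` are made-up example values rather than anything taken from the dataset.

```python
import numpy as np

# Hypothetical example labels (not taken from the dataset)
labels = np.array([3, 0, 9])
n_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes)[labels]

print(one_hot.shape)           # (3, 10)
print(one_hot.argmax(axis=1))  # [3 0 9]
```

Taking the argmax of each row recovers the original labels, which is the trick used later when scoring the model.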

Next, split the data into training, dev, and test sets.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=.7)

# Second split to get dev/test set
X_dev, X_test, y_dev, y_test = train_test_split(X_test, y_test, train_size=.66)
print(X_train.shape, X_dev.shape, X_test.shape)
print(y_train.shape, y_dev.shape, y_test.shape)
(1257, 8, 8) (356, 8, 8) (184, 8, 8)
(1257, 10) (356, 10) (184, 10)
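
The split sizes follow from simple arithmetic. Here is a quick sketch, assuming the first part of each split gets the floor of `fraction * n` and the remainder goes to the other part (this reproduces the printed shapes). `split_counts` is a hypothetical helper, not a scikit-learn function.

```python
import math

def split_counts(n_samples, train_fraction):
    """Return (n_first, n_rest) for a fractional split of n_samples."""
    n_first = math.floor(train_fraction * n_samples)
    return n_first, n_samples - n_first

# First split: 70% train, the rest held out
n_train, n_holdout = split_counts(1797, 0.7)
# Second split: 66% of the held-out data for dev, the rest for test
n_dev, n_test = split_counts(n_holdout, 0.66)

print(n_train, n_dev, n_test)  # 1257 356 184
```

So we end up with roughly 70% of the images for training, 20% for dev, and 10% for test.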

Build TensorFlow Graph

Note: Because the resolution is so much smaller, we’ll comment out the pooling steps. They would just take us from some data to very little data, lol.

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv1D, MaxPooling1D

Determine the shape of our inputs

m, X_w, X_h = X.shape
n_y = y.shape[1]

Instantiate the model

model = Sequential()

Layer 1

# Conv1D reads each image as 8 steps (rows) with 8 channels (columns)
model.add(Conv1D(filters=6, kernel_size=3, activation='relu', input_shape=(X_w, X_h)))
# model.add(MaxPooling1D(pool_size=2))

Layer 2

model.add(Conv1D(filters=6, kernel_size=2, activation='relu'))
# model.add(MaxPooling1D(pool_size=2))

Graduating past Convolution

model.add(Flatten())

Fully-Connected layers

model.add(Dense(128, activation='relu'))
model.add(Dense(n_y, activation='softmax'))
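
To keep track of what these layers produce, here is a rough sketch of the shape and parameter arithmetic. It assumes the Keras defaults for `Conv1D` (stride 1, `'valid'` padding), and `conv1d_out` and `dense_params` are hypothetical helpers written for this note. Calling `model.summary()` is the authoritative check.

```python
def conv1d_out(steps, channels, filters, kernel_size):
    """Output (steps, channels) and parameter count for a stride-1, valid-padding Conv1D."""
    out_steps = steps - kernel_size + 1
    params = kernel_size * channels * filters + filters  # weights + biases
    return (out_steps, filters), params

def dense_params(n_in, n_out):
    """Parameter count (weights + biases) for a fully-connected layer."""
    return n_in * n_out + n_out

shape, p1 = conv1d_out(8, 8, filters=6, kernel_size=3)                # Layer 1
shape, p2 = conv1d_out(shape[0], shape[1], filters=6, kernel_size=2)  # Layer 2
flat = shape[0] * shape[1]                                            # Flatten
p3 = dense_params(flat, 128)                                          # Dense(128)
p4 = dense_params(128, 10)                                            # Dense(10)

print(shape, flat)        # (5, 6) 30
print(p1 + p2 + p3 + p4)  # 5486
```

With only 30 values left after flattening, there isn't much for a pooling layer to throw away.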

Compile model with optimizer and loss function

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
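
As a reminder of what this loss measures, here is a minimal NumPy sketch of softmax followed by categorical cross-entropy for a single example. The logits are made-up numbers, and the Keras implementation includes numerical safeguards (such as clipping) that are left out here.

```python
import numpy as np

def softmax(z):
    """Convert raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def categorical_crossentropy(y_true, y_prob):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -np.sum(y_true * np.log(y_prob))

logits = np.array([2.0, 1.0, 0.1])  # hypothetical scores for 3 classes
y_true = np.array([1.0, 0.0, 0.0])  # one-hot target: class 0

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.round(3), round(float(loss), 3))
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the correct class.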

Fit it

model.fit(X_train, y_train, batch_size=64, epochs=10, verbose=1,
          validation_data=(X_dev, y_dev))
Train on 1257 samples, validate on 356 samples
Epoch 1/10
1257/1257 [==============================] - 0s 319us/step - loss: 2.3054 - acc: 0.2609 - val_loss: 1.6242 - val_acc: 0.4438
Epoch 2/10
1257/1257 [==============================] - 0s 35us/step - loss: 1.3752 - acc: 0.5378 - val_loss: 1.1192 - val_acc: 0.6461
Epoch 3/10
1257/1257 [==============================] - 0s 38us/step - loss: 0.9413 - acc: 0.7049 - val_loss: 0.8343 - val_acc: 0.7331
Epoch 4/10
1257/1257 [==============================] - 0s 37us/step - loss: 0.6799 - acc: 0.7916 - val_loss: 0.6507 - val_acc: 0.8118
Epoch 5/10
1257/1257 [==============================] - 0s 41us/step - loss: 0.5051 - acc: 0.8560 - val_loss: 0.5233 - val_acc: 0.8315
Epoch 6/10
1257/1257 [==============================] - 0s 38us/step - loss: 0.4175 - acc: 0.8759 - val_loss: 0.4876 - val_acc: 0.8483
Epoch 7/10
1257/1257 [==============================] - 0s 40us/step - loss: 0.3402 - acc: 0.8926 - val_loss: 0.4500 - val_acc: 0.8680
Epoch 8/10
1257/1257 [==============================] - 0s 39us/step - loss: 0.3082 - acc: 0.9157 - val_loss: 0.3818 - val_acc: 0.8764
Epoch 9/10
1257/1257 [==============================] - 0s 35us/step - loss: 0.2591 - acc: 0.9212 - val_loss: 0.3377 - val_acc: 0.8989
Epoch 10/10
1257/1257 [==============================] - 0s 36us/step - loss: 0.2262 - acc: 0.9356 - val_loss: 0.3249 - val_acc: 0.9017

Evaluating

We score on the test set, which the model hasn’t seen during fitting.

from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_score, recall_score, accuracy_score

# Recover class labels once: argmax over the one-hot targets and the softmax outputs
y_true = y_test.argmax(axis=1)
y_pred = model.predict(X_test).argmax(axis=1)

precision_score(y_true, y_pred, average='macro')
0.8832403870639165
recall_score(y_true, y_pred, average='macro')
0.88765537856498944
accuracy_score(y_true, y_pred)
0.88043478260869568
confusion_matrix(y_true, y_pred)
array([[17,  0,  0,  0,  1,  0,  1,  0,  0,  0],
       [ 0, 13,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0, 14,  0,  0,  0,  0,  0,  1,  1],
       [ 0,  0,  0, 18,  1,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0, 12,  0,  1,  1,  0,  0],
       [ 0,  0,  0,  0,  1, 20,  0,  0,  0,  2],
       [ 0,  0,  0,  0,  0,  0, 21,  0,  2,  0],
       [ 0,  0,  0,  1,  0,  0,  0, 14,  0,  0],
       [ 0,  1,  1,  1,  0,  4,  1,  0, 14,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  1,  0, 19]], dtype=int64)
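
All three scores can be recovered from the confusion matrix itself, which makes for a handy sanity check. A sketch in NumPy, assuming scikit-learn's convention that rows are true labels and columns are predictions:

```python
import numpy as np

# Confusion matrix copied from the output above
cm = np.array([[17,  0,  0,  0,  1,  0,  1,  0,  0,  0],
               [ 0, 13,  0,  0,  0,  0,  0,  0,  0,  0],
               [ 0,  0, 14,  0,  0,  0,  0,  0,  1,  1],
               [ 0,  0,  0, 18,  1,  0,  0,  0,  0,  0],
               [ 0,  0,  0,  0, 12,  0,  1,  1,  0,  0],
               [ 0,  0,  0,  0,  1, 20,  0,  0,  0,  2],
               [ 0,  0,  0,  0,  0,  0, 21,  0,  2,  0],
               [ 0,  0,  0,  1,  0,  0,  0, 14,  0,  0],
               [ 0,  1,  1,  1,  0,  4,  1,  0, 14,  0],
               [ 0,  0,  0,  0,  0,  0,  0,  1,  0, 19]])

correct = np.diag(cm)  # correctly classified count per digit

accuracy = correct.sum() / cm.sum()
recall = (correct / cm.sum(axis=1)).mean()     # macro recall: average over rows
precision = (correct / cm.sum(axis=0)).mean()  # macro precision: average over columns

print(round(accuracy, 4), round(precision, 4), round(recall, 4))
# 0.8804 0.8832 0.8877
```

The matrix also shows where the errors concentrate: digit 8 has the lowest recall (14 of 22), with four of its misses predicted as 5.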