In this lab, we will use a Convolutional Neural Network to classify handwritten digits from the MNIST database
Author
Juma Shafara
Published
August 12, 2024
Keywords
Training Two Parameter, Mini-Batch Gradient Descent, Training Two Parameter Mini-Batch Gradient Descent
Objective for this Notebook
Learn how to use a Convolutional Neural Network to classify handwritten digits from the MNIST database
Learn how to reshape the images to make them faster to process
Table of Contents
In this lab, we will use a Convolutional Neural Network to classify handwritten digits from the MNIST database. We will reshape the images to make them faster to process.
Estimated Time Needed: 25 min (14 min to train the model)
Preparation
# Import the libraries we need to use in this lab

# Use the following lines to install torch and torchvision if they are missing.
# pip could not find the originally pinned versions (torchvision==0.9.1, torch==1.8.1)
# in this environment, so the versions are left unpinned here.
# !mamba install -y torchvision
!pip install torch torchvision

import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
import matplotlib.pylab as plt
import numpy as np
Define the function plot_channels to plot out the kernel parameters of each channel
# Define the function for plotting the channels

def plot_channels(W):
    n_out = W.shape[0]
    n_in = W.shape[1]
    w_min = W.min().item()
    w_max = W.max().item()
    fig, axes = plt.subplots(n_out, n_in)
    fig.subplots_adjust(hspace=0.1)
    out_index = 0
    in_index = 0

    # plot outputs as rows, inputs as columns
    for ax in axes.flat:
        if in_index > n_in - 1:
            out_index = out_index + 1
            in_index = 0
        ax.imshow(W[out_index, in_index, :, :], vmin=w_min, vmax=w_max, cmap='seismic')
        ax.set_yticklabels([])
        ax.set_xticklabels([])
        in_index = in_index + 1

    plt.show()
Define the function plot_parameters to plot out the kernel parameters of each channel with multiple outputs.
# Define the function for plotting the parameters

def plot_parameters(W, number_rows=1, name="", i=0):
    W = W.data[:, i, :, :]
    n_filters = W.shape[0]
    w_min = W.min().item()
    w_max = W.max().item()
    fig, axes = plt.subplots(number_rows, n_filters // number_rows)
    fig.subplots_adjust(hspace=0.4)

    for i, ax in enumerate(axes.flat):
        if i < n_filters:
            # Set the label for the sub-plot.
            ax.set_xlabel("kernel:{0}".format(i + 1))

            # Plot the image.
            ax.imshow(W[i, :], vmin=w_min, vmax=w_max, cmap='seismic')
            ax.set_xticks([])
            ax.set_yticks([])
    plt.suptitle(name, fontsize=10)
    plt.show()
Define the function plot_activations to plot out the activations of the convolutional layers.
# Define the function for plotting the activations

def plot_activations(A, number_rows=1, name="", i=0):
    A = A[0, :, :, :].detach().numpy()
    n_activations = A.shape[0]
    A_min = A.min().item()
    A_max = A.max().item()
    fig, axes = plt.subplots(number_rows, n_activations // number_rows)
    fig.subplots_adjust(hspace=0.4)

    for i, ax in enumerate(axes.flat):
        if i < n_activations:
            # Set the label for the sub-plot.
            ax.set_xlabel("activation:{0}".format(i + 1))

            # Plot the image.
            ax.imshow(A[i, :], vmin=A_min, vmax=A_max, cmap='seismic')
            ax.set_xticks([])
            ax.set_yticks([])
    plt.show()
Define the function show_data to plot out data samples as images.
Load the validation dataset by setting the parameter train to False.
# Make the validation dataset
validation_dataset = dsets.MNIST(root='./data', train=False, download=True, transform=composed)
We can see that the label of each element is a Python int.
# Show the data type of the label for each element in the dataset
type(train_dataset[0][1])
int
Each element in the rectangular tensor corresponds to a number representing a pixel intensity as demonstrated by the following image.
Print out the fourth label
# The label for the fourth data element
train_dataset[3][1]
1
Plot the fourth sample
# The image for the fourth data element
show_data(train_dataset[3])
The fourth sample is a “1”.
Build a Convolutional Neural Network Class
Build a Convolutional Network class with two Convolutional layers and one fully connected layer. Pre-determine the size of the final output matrix. The parameters in the constructor are the number of output channels for the first and second layer.
class CNN(nn.Module):

    # Constructor
    def __init__(self, out_1=16, out_2=32):
        super(CNN, self).__init__()
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=out_1, kernel_size=5, padding=2)
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)

        self.cnn2 = nn.Conv2d(in_channels=out_1, out_channels=out_2, kernel_size=5, stride=1, padding=2)
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)
        self.fc1 = nn.Linear(out_2 * 4 * 4, 10)

    # Prediction
    def forward(self, x):
        x = self.cnn1(x)
        x = torch.relu(x)
        x = self.maxpool1(x)
        x = self.cnn2(x)
        x = torch.relu(x)
        x = self.maxpool2(x)
        # flatten to (batch_size, out_2 * 4 * 4) for the fully connected layer
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        return x

    # Outputs at each step (only needed for visualization, not for training)
    def activations(self, x):
        z1 = self.cnn1(x)
        a1 = torch.relu(z1)
        out = self.maxpool1(a1)

        z2 = self.cnn2(out)
        a2 = torch.relu(z2)
        out1 = self.maxpool2(a2)
        # flatten the output of the second pooling layer
        out = out1.view(out1.size(0), -1)
        return z1, a1, z2, a2, out1, out
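The out_2 * 4 * 4 input size of the fully connected layer can be checked by hand. The sketch below applies the standard output-size formula for convolution and pooling layers, assuming 16x16 input images (an assumption consistent with the 4 * 4 factor above); conv_output_size is a helper introduced here for illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    # Standard formula for the spatial size after a convolution or pooling layer.
    return (size + 2 * padding - kernel_size) // stride + 1


size = 16  # assumed input height and width
size = conv_output_size(size, kernel_size=5, stride=1, padding=2)  # cnn1: stays 16
size = conv_output_size(size, kernel_size=2, stride=2)             # maxpool1: 8
size = conv_output_size(size, kernel_size=5, stride=1, padding=2)  # cnn2: stays 8
size = conv_output_size(size, kernel_size=2, stride=2)             # maxpool2: 4

out_2 = 32
fc_in_features = out_2 * size * size
print(size, fc_in_features)  # 4 512
```

A 5x5 kernel with padding 2 preserves the spatial size, so only the two pooling layers shrink the image, each by a factor of two.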
Define the Convolutional Neural Network Classifier, Criterion function, Optimizer and Train the Model
There are 16 output channels for the first layer, and 32 output channels for the second layer
# Create the model object using CNN class
model = CNN(out_1=16, out_2=32)
Plot the model parameters for the kernels before training the kernels. The kernels are initialized randomly.
# Plot the parameters
plot_parameters(model.state_dict()['cnn1.weight'], number_rows=4, name="1st layer kernels before training")
plot_parameters(model.state_dict()['cnn2.weight'], number_rows=4, name='2nd layer kernels before training')
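It may help to look at the shape of the weight tensor that plot_parameters receives. This small sketch builds a stand-alone layer with the same settings as the first convolutional layer and inspects its kernels; the variable names are illustrative only.

```python
import torch.nn as nn

# A stand-alone layer with the same settings as the first convolutional layer.
conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, padding=2)

# Conv2d weights are laid out as (out_channels, in_channels, height, width).
print(conv.weight.shape)  # torch.Size([16, 1, 5, 5])

# Selecting input channel 0 leaves one 5x5 kernel per output channel,
# which is the slice plot_parameters draws.
kernels = conv.weight.data[:, 0, :, :]
print(kernels.shape)  # torch.Size([16, 5, 5])
```

With number_rows=4, the 16 first-layer kernels fill a 4x4 grid, while the 32 second-layer kernels (for input channel 0) fill a grid of 4 rows and 8 columns.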
Define the loss function, the optimizer and the dataset loader
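The cell for this step is not shown in this section. A minimal sketch of one possible setup follows, assuming cross-entropy loss and plain SGD; the helper make_training_objects, the learning rate, and the batch sizes are all assumptions rather than the values used in the original lab. Stand-in data and a stand-in model are used so the sketch runs without downloading MNIST.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def make_training_objects(model, train_dataset, validation_dataset,
                          learning_rate=0.1, train_batch_size=100,
                          validation_batch_size=5000):
    # Hypothetical helper; the hyperparameter defaults are assumptions.
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    train_loader = DataLoader(dataset=train_dataset, batch_size=train_batch_size)
    validation_loader = DataLoader(dataset=validation_dataset,
                                   batch_size=validation_batch_size)
    return criterion, optimizer, train_loader, validation_loader


# Stand-in data and model (random 16x16 images, 10 classes).
toy_data = TensorDataset(torch.randn(8, 1, 16, 16), torch.randint(0, 10, (8,)))
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16, 10))
toy_criterion, toy_optimizer, toy_train_loader, toy_validation_loader = \
    make_training_objects(toy_model, toy_data, toy_data,
                          train_batch_size=4, validation_batch_size=8)
```

In the lab itself, the equivalent call would be criterion, optimizer, train_loader, validation_loader = make_training_objects(model, train_dataset, validation_dataset).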
Train the model and determine the validation accuracy (technically the test accuracy). This may take a long time.
# Train the model

n_epochs = 3
cost_list = []
accuracy_list = []
N_test = len(validation_dataset)

def train_model(n_epochs):
    for epoch in range(n_epochs):
        COST = 0
        for x, y in train_loader:
            optimizer.zero_grad()
            z = model(x)
            loss = criterion(z, y)
            loss.backward()
            optimizer.step()
            # accumulate the loss as a Python float
            COST += loss.item()

        cost_list.append(COST)
        correct = 0
        # perform a prediction on the validation data
        for x_test, y_test in validation_loader:
            z = model(x_test)
            _, yhat = torch.max(z.data, 1)
            correct += (yhat == y_test).sum().item()
        accuracy = correct / N_test
        accuracy_list.append(accuracy)

train_model(n_epochs)
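The accuracy computation in the loop above relies on torch.max returning both the largest value and its index along a dimension. The sketch below walks through that step on made-up logits for three samples and four classes.

```python
import torch

# Made-up logits for a batch of 3 samples and 4 classes.
z = torch.tensor([[0.1, 2.0, -1.0, 0.3],
                  [1.5, 0.2, 0.1, 0.0],
                  [0.0, 0.1, 0.2, 3.0]])
y = torch.tensor([1, 0, 2])  # made-up true labels

# torch.max along dim 1 returns (values, indices); the indices are the predictions.
_, yhat = torch.max(z, 1)

correct = (yhat == y).sum().item()
accuracy = correct / len(y)
print(yhat.tolist(), correct)  # [1, 0, 3] 2
```

Here two of the three predictions match their labels, so the accuracy for this toy batch is 2/3.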
Analyze Results
Plot the loss and accuracy on the validation data:
# Plot the loss and accuracy

fig, ax1 = plt.subplots()
color = 'tab:red'
ax1.plot(cost_list, color=color)
ax1.set_xlabel('epoch', color=color)
ax1.set_ylabel('Cost', color=color)
ax1.tick_params(axis='y', color=color)

ax2 = ax1.twinx()
color = 'tab:blue'
ax2.set_ylabel('accuracy', color=color)
ax2.set_xlabel('epoch', color=color)
ax2.plot(accuracy_list, color=color)
ax2.tick_params(axis='y', color=color)
fig.tight_layout()
View the results of the parameters for the Convolutional layers
# Plot the channels
plot_channels(model.state_dict()['cnn1.weight'])
plot_channels(model.state_dict()['cnn2.weight'])
Consider the following sample
# Show the second image
show_data(train_dataset[1])
Determine the activations
# Use the CNN activations method to see the steps
out = model.activations(train_dataset[1][0].view(1, 1, IMAGE_SIZE, IMAGE_SIZE))
Plot out the first set of activations
# Plot the outputs after the first CNN
plot_activations(out[0], number_rows=4, name="Output after the 1st CNN")
The image below is the result after applying the relu activation function
# Plot the outputs after the first Relu
plot_activations(out[1], number_rows=4, name="Output after the 1st Relu")
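As a reminder of what the relu step does to an activation map, the following sketch applies torch.relu to a small made-up tensor: negative entries become zero and non-negative entries pass through unchanged.

```python
import torch

# A made-up 2x2 "activation map" with both negative and positive entries.
z = torch.tensor([[-1.5, 0.0],
                  [2.0, -0.3]])

# relu replaces every negative value with zero.
a = torch.relu(z)
print(a)
```

This is why the maps after the relu show fewer distinct regions than the raw convolution outputs: everything below zero is flattened to the same value.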
The image below is the activation map after the second convolutional layer.
# Plot the outputs after the second CNN
plot_activations(out[2], number_rows=32 // 4, name="Output after the 2nd CNN")
The image below is the result of the activation map after applying the second relu
# Plot the outputs after the second Relu
plot_activations(out[3], number_rows=4, name="Output after the 2nd Relu")
We can see the result for the third sample
# Show the third image
show_data(train_dataset[2])

# Use the CNN activations method to see the steps
out = model.activations(train_dataset[2][0].view(1, 1, IMAGE_SIZE, IMAGE_SIZE))

# Plot the outputs after the first CNN
plot_activations(out[0], number_rows=4, name="Output after the 1st CNN")

# Plot the outputs after the first Relu
plot_activations(out[1], number_rows=4, name="Output after the 1st Relu")

# Plot the outputs after the second CNN
plot_activations(out[2], number_rows=32 // 4, name="Output after the 2nd CNN")

# Plot the outputs after the second Relu
plot_activations(out[3], number_rows=4, name="Output after the 2nd Relu")
Plot the first five mis-classified samples:
# Plot the mis-classified samples

count = 0
for x, y in torch.utils.data.DataLoader(dataset=validation_dataset, batch_size=1):
    z = model(x)
    _, yhat = torch.max(z, 1)
    if yhat != y:
        show_data((x, y))
        plt.show()
        print("yhat: ", yhat)
        count += 1
    if count >= 5:
        break
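If only the positions of the mis-classified samples are needed, the same search can be written without plotting. The sketch below is one possible variant, not part of the original lab: first_misclassified and AlwaysZero are hypothetical names, and a stand-in model and dataset are used so it runs without MNIST.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def first_misclassified(model, dataset, limit=5):
    # Hypothetical helper: returns (index, predicted, actual) triples.
    found = []
    model.eval()
    with torch.no_grad():  # gradients are not needed for evaluation
        for index, (x, y) in enumerate(DataLoader(dataset, batch_size=1)):
            yhat = model(x).argmax(dim=1)
            if yhat.item() != y.item():
                found.append((index, yhat.item(), y.item()))
                if len(found) >= limit:
                    break
    return found


class AlwaysZero(nn.Module):
    # Stand-in model that always predicts class 0.
    def forward(self, x):
        return torch.tensor([[1.0, 0.0, 0.0]]).repeat(x.size(0), 1)


toy_images = torch.zeros(4, 1, 2, 2)
toy_labels = torch.tensor([0, 1, 0, 2])
errors = first_misclassified(AlwaysZero(), TensorDataset(toy_images, toy_labels))
print(errors)  # [(1, 0, 1), (3, 0, 2)]
```

In the lab, first_misclassified(model, validation_dataset) would return indices that can then be passed to show_data one at a time.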