Preparation
We’ll need the following libraries:
# Uncomment the following line to install the torchvision library
# !mamba install -y torchvision

# Import the libraries we need for this lab
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
import matplotlib.pylab as plt
import numpy as np
Neural Network Module and Training Function
Define the neural network module or class using the sigmoid activation function:
# Build the model with sigmoid function
class Net(nn.Module):

    # Constructor
    def __init__(self, D_in, H, D_out):
        super(Net, self).__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)

    # Prediction
    def forward(self, x):
        x = torch.sigmoid(self.linear1(x))
        x = self.linear2(x)
        return x
Define the neural network module or class using the Tanh activation function:
# Build the model with Tanh function
class NetTanh(nn.Module):

    # Constructor
    def __init__(self, D_in, H, D_out):
        super(NetTanh, self).__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)

    # Prediction
    def forward(self, x):
        x = torch.tanh(self.linear1(x))
        x = self.linear2(x)
        return x
Define the neural network module or class using the ReLU activation function:
# Build the model with Relu function
class NetRelu(nn.Module):

    # Constructor
    def __init__(self, D_in, H, D_out):
        super(NetRelu, self).__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)

    # Prediction
    def forward(self, x):
        x = torch.relu(self.linear1(x))
        x = self.linear2(x)
        return x
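All three classes share the same two-layer architecture and differ only in the hidden activation. As a quick sanity check (optional, not part of the original lab), you can pass a random batch through each model and confirm the output shape:

# Optional sanity check: run a fake batch of 4 flattened images through each model
dummy = torch.randn(4, 28 * 28)
for M in (Net, NetTanh, NetRelu):
    out = M(28 * 28, 100, 10)(dummy)
    print(M.__name__, out.shape)   # expected: torch.Size([4, 10])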
Define a function to train the model. The function returns a Python dictionary that stores the training loss at each iteration and the accuracy on the validation data at the end of each epoch.
# Define the function for training the model
def train(model, criterion, train_loader, validation_loader, optimizer, epochs=100):
    useful_stuff = {'training_loss': [], 'validation_accuracy': []}
    for epoch in range(epochs):
        for i, (x, y) in enumerate(train_loader):
            optimizer.zero_grad()
            z = model(x.view(-1, 28 * 28))
            loss = criterion(z, y)
            loss.backward()
            optimizer.step()
            useful_stuff['training_loss'].append(loss.item())
        correct = 0
        total = 0
        for x, y in validation_loader:
            z = model(x.view(-1, 28 * 28))
            _, label = torch.max(z, 1)
            correct += (label == y).sum().item()
            total += y.size(0)   # count validation samples here instead of relying on a global dataset variable
        accuracy = 100 * (correct / total)
        useful_stuff['validation_accuracy'].append(accuracy)
    return useful_stuff
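The validation loop uses torch.max over dimension 1 to turn the ten raw output scores into a predicted digit. A minimal illustration of that step:

# Illustration: torch.max returns (values, indices); the indices are the predicted classes
z = torch.tensor([[0.1, 2.5, 0.3], [1.7, 0.2, 0.9]])
values, labels = torch.max(z, 1)
print(labels)   # tensor([1, 0]) -- index of the largest score in each row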
Make Some Data
Load the training dataset by setting the parameter train to True, and convert the images to tensors by passing a transform object to the transform argument.
# Create the training dataset
train_dataset = dsets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
Load the test split as the validation dataset by setting the parameter train to False, and convert it to tensors the same way.
# Create the validation dataset
validation_dataset = dsets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
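A quick look at what was just loaded (an optional check): MNIST has 60,000 training and 10,000 test images, each a 1x28x28 tensor with an integer label.

# Optional check: dataset sizes and the shape of one sample
print(len(train_dataset), len(validation_dataset))   # 60000 10000
image, label = train_dataset[0]
print(image.shape, label)                            # torch.Size([1, 28, 28]) 5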
Create the training-data loader and the validation-data loader objects:
# Create the training data loader and validation data loader objects
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=2000, shuffle=True)
validation_loader = torch.utils.data.DataLoader(dataset=validation_dataset, batch_size=5000, shuffle=False)
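Each batch from these loaders arrives with shape [batch_size, 1, 28, 28], which is why the training function flattens it with x.view(-1, 28*28). An optional sketch inspecting one batch:

# Optional check: shape of one training batch before and after flattening
x, y = next(iter(train_loader))
print(x.shape)                     # torch.Size([2000, 1, 28, 28])
print(x.view(-1, 28 * 28).shape)   # torch.Size([2000, 784])
print(y.shape)                     # torch.Size([2000])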
Define the Neural Network, Criterion Function, Optimizer, and Train the Model
Create the criterion function:
# Create the criterion function
criterion = nn.CrossEntropyLoss()
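Note that nn.CrossEntropyLoss applies log-softmax internally, which is why each forward method above returns the raw output of linear2 without a softmax. A small worked example:

# Illustration: CrossEntropyLoss takes raw logits and integer class labels
logits = torch.tensor([[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]])   # two samples, three classes
targets = torch.tensor([0, 2])                               # the correct class for each sample
print(criterion(logits, targets))                            # small loss, since the logits favor the targets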
Create the model with 100 hidden neurons:
# Create the model object
input_dim = 28 * 28
hidden_dim = 100
output_dim = 10

model = Net(input_dim, hidden_dim, output_dim)
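With these dimensions the model has 784*100 + 100 weights and biases in the first layer and 100*10 + 10 in the second, 79,510 parameters in total. An optional one-liner to confirm this:

# Optional check: total number of trainable parameters (784*100 + 100 + 100*10 + 10 = 79510)
print(sum(p.numel() for p in model.parameters()))   # 79510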
Test Sigmoid, Tanh, and ReLU
Train the network using the sigmoid activation function:
# Train a model with sigmoid function
learning_rate = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
training_results = train(model, criterion, train_loader, validation_loader, optimizer, epochs=30)
Train the network using the Tanh activation function:
# Train a model with Tanh function
model_Tanh = NetTanh(input_dim, hidden_dim, output_dim)
optimizer = torch.optim.SGD(model_Tanh.parameters(), lr=learning_rate)
training_results_tanh = train(model_Tanh, criterion, train_loader, validation_loader, optimizer, epochs=30)
Train the network using the ReLU activation function:
# Train a model with Relu function
modelRelu = NetRelu(input_dim, hidden_dim, output_dim)
optimizer = torch.optim.SGD(modelRelu.parameters(), lr=learning_rate)
training_results_relu = train(modelRelu, criterion, train_loader, validation_loader, optimizer, epochs=30)
Analyze Results
Compare the training loss for each activation:
# Compare the training loss
plt.plot(training_results_tanh['training_loss'], label='tanh')
plt.plot(training_results['training_loss'], label='sigmoid')
plt.plot(training_results_relu['training_loss'], label='relu')
plt.ylabel('loss')
plt.title('training loss iterations')
plt.legend()
plt.show()
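The training function also recorded the validation accuracy after each epoch, so the three activations can be compared on held-out data as well; a sketch analogous to the loss plot above:

# Compare the validation accuracy per epoch for the three activations
plt.plot(training_results_tanh['validation_accuracy'], label='tanh')
plt.plot(training_results['validation_accuracy'], label='sigmoid')
plt.plot(training_results_relu['validation_accuracy'], label='relu')
plt.ylabel('validation accuracy (%)')
plt.xlabel('epochs')
plt.legend()
plt.show()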