---
author: Juma Shafara
date: "2024-08-07"
title: "Training Two Parameter Stochastic Gradient Descent"
keywords: []
description: In this Lab, you will practice training a model by using Stochastic Gradient Descent.
---

# Linear regression 1D: Training Two Parameter Stochastic Gradient Descent (SGD)

## Objective

- How to use SGD (Stochastic Gradient Descent) to train the model.

## Table of Contents

In this Lab, you will practice training a model by using Stochastic Gradient Descent.

- Make Some Data
- Create the Model and Cost Function (Total Loss)
- Train the Model: Batch Gradient Descent
- Train the Model: Stochastic Gradient Descent
- Train the Model: Stochastic Gradient Descent with DataLoader

Estimated Time Needed: 30 min
## Preparation
We'll need the following libraries:
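A minimal sketch of the imports used below:

```python
import torch
import numpy as np
import matplotlib.pyplot as plt
```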
The class plot_error_surfaces is just to help you visualize the data space and the parameter space during training and has nothing to do with PyTorch.
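A minimal sketch of such a helper, supporting only the set_para_loss and plot_ps calls used below; the plotting details here are illustrative rather than the lab's exact figures:

```python
import numpy as np
import matplotlib.pyplot as plt

class plot_error_surfaces(object):
    def __init__(self, w_range, b_range, X, Y, n_samples=30, go=True):
        # evaluate the mean squared error on a (w, b) grid; `go` is unused in this sketch
        W, B = np.meshgrid(np.linspace(-w_range, w_range, n_samples),
                           np.linspace(-b_range, b_range, n_samples))
        Z = np.zeros(W.shape)
        for i in range(W.shape[0]):
            for j in range(W.shape[1]):
                Z[i, j] = np.mean((Y.numpy() - (W[i, j] * X.numpy() + B[i, j])) ** 2)
        self.W, self.B, self.Z = W, B, Z
        self.x, self.y = X.numpy(), Y.numpy()
        self.w_list, self.b_list, self.loss_list = [], [], []

    def set_para_loss(self, w, b, loss):
        # record the current parameter values and loss
        self.w_list.append(w)
        self.b_list.append(b)
        self.loss_list.append(loss)

    def plot_ps(self):
        # left: data space with the current fit; right: loss-surface contours with the path so far
        plt.subplot(1, 2, 1)
        plt.plot(self.x, self.y, 'ro', label='data')
        plt.plot(self.x, self.w_list[-1] * self.x + self.b_list[-1], label='current fit')
        plt.xlabel('x'); plt.ylabel('y'); plt.legend()
        plt.subplot(1, 2, 2)
        plt.contour(self.W, self.B, self.Z)
        plt.plot(self.w_list, self.b_list, 'ro-')
        plt.xlabel('w'); plt.ylabel('b')
        plt.show()
```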
## Make Some Data
Set the random seed. Generate values from -3 to 3 that create a line with a slope of 1 and a bias of -1; this is the line that you need to estimate. Add some noise to the data, and plot the results.
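A minimal sketch of these steps; the noise scale of 0.1 is an assumption:

```python
# set the random seed so the noise is reproducible
torch.manual_seed(1)

# the line to be estimated: slope 1, bias -1, with added Gaussian noise
X = torch.arange(-3, 3, 0.1).view(-1, 1)
f = 1 * X - 1
Y = f + 0.1 * torch.randn(X.size())

# plot the noisy data together with the underlying line
plt.plot(X.numpy(), Y.numpy(), 'rx', label='y')
plt.plot(X.numpy(), f.numpy(), label='f')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
```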
## Create the Model and Cost Function (Total Loss)
Define the forward function:
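A sketch of the linear prediction function:

```python
def forward(x):
    # linear prediction: yhat = w * x + b
    return w * x + b
```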
Define the cost or criterion function (MSE):
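A sketch using the mean squared error:

```python
def criterion(yhat, y):
    # mean squared error between predictions and targets
    return torch.mean((yhat - y) ** 2)
```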
Create a plot_error_surfaces object to visualize the data space and the parameter space during training:
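One way to construct it; the ranges 15 and 13 and the 30 grid samples are illustrative choices wide enough to cover the parameter path:

```python
get_surface = plot_error_surfaces(15, 13, X, Y, 30)
```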
## Train the Model: Batch Gradient Descent
Create the model parameters w and b, setting the argument requires_grad to True because the system must learn them. Set the learning rate to 0.1 and create an empty list LOSS_BGD for storing the loss at each iteration.
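A sketch of this setup; the starting values -15.0 and -10.0 mirror those used in the SGD sections below:

```python
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)
lr = 0.1
LOSS_BGD = []
```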
Define the train_model function for training the model:
```python
# The function for training the model
def train_model(iter):
    for epoch in range(iter):
        # make a prediction
        Yhat = forward(X)
        # calculate the loss
        loss = criterion(Yhat, Y)
        # section for plotting
        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())
        get_surface.plot_ps()
        # store the loss in the list LOSS_BGD
        LOSS_BGD.append(loss.tolist())
        # backward pass: compute the gradient of the loss with respect to all learnable parameters
        loss.backward()
        # update the slope and bias
        w.data = w.data - lr * w.grad.data
        b.data = b.data - lr * b.grad.data
        # zero the gradients before running the next backward pass
        w.grad.data.zero_()
        b.grad.data.zero_()
```
Run 10 epochs of batch gradient descent. (Known bug: the data-space plot is one iteration ahead of the parameter-space plot.)
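Assuming the definitions above:

```python
train_model(10)
```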
## Train the Model: Stochastic Gradient Descent
Create a plot_error_surfaces object to visualize the data space and the parameter space during training:
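As before, a fresh helper object so the plotted path starts clean (same illustrative ranges):

```python
get_surface = plot_error_surfaces(15, 13, X, Y, 30)
```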
Define the train_model_SGD function for training the model:
```python
# The function for training the model with stochastic gradient descent
LOSS_SGD = []
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)

def train_model_SGD(iter):
    for epoch in range(iter):
        # SGD is an approximation of our true total loss/cost; here we calculate the true loss and store it
        Yhat = forward(X)
        LOSS_SGD.append(criterion(Yhat, Y).tolist())
        for x, y in zip(X, Y):
            # make a prediction from a single sample
            yhat = forward(x)
            # calculate the loss
            loss = criterion(yhat, y)
            # section for plotting
            get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())
            # backward pass: compute the gradient of the loss with respect to all learnable parameters
            loss.backward()
            # update the slope and bias
            w.data = w.data - lr * w.grad.data
            b.data = b.data - lr * b.grad.data
            # zero the gradients before running the next backward pass
            w.grad.data.zero_()
            b.grad.data.zero_()
        # plot the surface and data space after each epoch
        get_surface.plot_ps()
```
Run 10 epochs of stochastic gradient descent. (Known bug: the data-space plot is one iteration ahead of the parameter-space plot.)
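Assuming the definitions above:

```python
train_model_SGD(10)
```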
Compare the loss of batch gradient descent and SGD:
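A sketch of the comparison plot, assuming matplotlib is imported as plt:

```python
plt.plot(LOSS_BGD, label="Batch Gradient Descent")
plt.plot(LOSS_SGD, label="Stochastic Gradient Descent")
plt.xlabel('epoch')
plt.ylabel('Cost / total loss')
plt.legend()
plt.show()
```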
## SGD with Dataset DataLoader
Import the module for building a dataset class:
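DataLoader is imported here as well, since it is used shortly:

```python
from torch.utils.data import Dataset, DataLoader
```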
Create a dataset class:
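A sketch of a dataset class; the name Data is illustrative, and the data mirrors the synthetic line generated earlier:

```python
class Data(Dataset):
    def __init__(self):
        # same synthetic line as before: slope 1, bias -1, plus noise
        self.x = torch.arange(-3, 3, 0.1).view(-1, 1)
        self.y = 1 * self.x - 1 + 0.1 * torch.randn(self.x.size())
        self.len = self.x.shape[0]

    def __getitem__(self, index):
        # return one (x, y) training point
        return self.x[index], self.y[index]

    def __len__(self):
        # number of samples
        return self.len
```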
Create a dataset object and check the length of the dataset:
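```python
dataset = Data()
print("The length of dataset:", len(dataset))
```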
Obtain the first training point:
Similarly, obtain the first three training points:
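Both lookups in one sketch; indexing with a slice returns the first three points because __getitem__ forwards the index to the underlying tensors:

```python
x, y = dataset[0]
print("(x, y):", (x, y))

x3, y3 = dataset[0:3]
print("first three x:", x3)
print("first three y:", y3)
```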
Create a plot_error_surfaces object to visualize the data space and the parameter space during training:
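Again, a fresh helper object for this section:

```python
get_surface = plot_error_surfaces(15, 13, X, Y, 30)
```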
Create a DataLoader object by using the constructor:
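A sketch of the constructor call; batch_size=1 mimics per-sample SGD:

```python
trainloader = DataLoader(dataset=dataset, batch_size=1)
```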
Define the train_model_DataLoader function for training the model:
```python
# The function for training the model with a DataLoader
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)
LOSS_Loader = []

def train_model_DataLoader(epochs):
    for epoch in range(epochs):
        # SGD is an approximation of our true total loss/cost; here we calculate the true loss and store it
        Yhat = forward(X)
        LOSS_Loader.append(criterion(Yhat, Y).tolist())
        for x, y in trainloader:
            # make a prediction
            yhat = forward(x)
            # calculate the loss
            loss = criterion(yhat, y)
            # section for plotting
            get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())
            # backward pass: compute the gradient of the loss with respect to all learnable parameters
            loss.backward()
            # update the slope and bias
            w.data = w.data - lr * w.grad.data
            b.data = b.data - lr * b.grad.data
            # zero the gradients before running the next backward pass
            w.grad.data.zero_()
            b.grad.data.zero_()
        # plot the surface and data space after each epoch
        get_surface.plot_ps()
```
Run 10 epochs of stochastic gradient descent. (Known bug: the data-space plot is one iteration ahead of the parameter-space plot.)
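Assuming the definitions above:

```python
train_model_DataLoader(10)
```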
Compare the loss of batch gradient descent and SGD. Note that SGD converges toward a minimum faster, that is, its loss decreases faster.
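A sketch of the comparison plot, assuming matplotlib is imported as plt:

```python
plt.plot(LOSS_BGD, label="Batch Gradient Descent")
plt.plot(LOSS_Loader, label="Stochastic Gradient Descent with DataLoader")
plt.xlabel('epoch')
plt.ylabel('Cost / total loss')
plt.legend()
plt.show()
```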
## Practice
For practice, try to use SGD with DataLoader to train the model for 10 iterations. Store the total loss in LOSS; we are going to use it in the next question.
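One possible solution sketch; the starting values and the helper name my_train_model are illustrative:

```python
LOSS = []
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)

def my_train_model(epochs):
    for epoch in range(epochs):
        # record the total loss over the whole dataset once per epoch
        Yhat = forward(X)
        LOSS.append(criterion(Yhat, Y).tolist())
        for x, y in trainloader:
            yhat = forward(x)
            loss = criterion(yhat, y)
            loss.backward()
            w.data = w.data - lr * w.grad.data
            b.data = b.data - lr * b.grad.data
            w.grad.data.zero_()
            b.grad.data.zero_()

my_train_model(10)
```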
Plot the total loss:
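A sketch, assuming matplotlib is imported as plt:

```python
plt.plot(LOSS, label="Total loss")
plt.xlabel('epoch')
plt.ylabel('Cost / total loss')
plt.legend()
plt.show()
```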
## About the Author

Hi, my name is Juma Shafara. I am a Data Scientist and Instructor at DATAIDEA. I have taught hundreds of people Programming, Data Analysis, and Machine Learning.

I also enjoy developing innovative algorithms and models that can drive insights and value.

I regularly share content that I find useful throughout my learning/teaching journey to simplify concepts in Machine Learning, Mathematics, Programming, and related topics on my website jumashafara.dataidea.org.

Besides the technical stuff, I enjoy watching soccer, movies, and reading mystery books.