    Using Dropout in Regression

    Objective for this Notebook

    1. Create the Model and Cost Function the PyTorch way.
    2. Learn Batch Gradient Descent

    In this lab, you will see how adding dropout to your model decreases overfitting.

    Table of Contents

    • Make Some Data
    • Create the Model and Cost Function the PyTorch way
    • Batch Gradient Descent

    Estimated Time Needed: 20 min


    Preparation

    We'll need the following libraries:

    In [ ]:
    # Import the libraries we need for the lab
    
    import torch
    import matplotlib.pyplot as plt
    import torch.nn as nn
    import torch.nn.functional as F
    import numpy as np
    from torch.utils.data import Dataset, DataLoader
    
    torch.manual_seed(0) 
    

    Make Some Data

    Create polynomial dataset class:

    In [ ]:
    # Create Data object
    
    class Data(Dataset):
        
        # Constructor
        def __init__(self, N_SAMPLES=40, noise_std=1, train=True):
            self.x = torch.linspace(-1, 1, N_SAMPLES).view(-1, 1)
            self.f = self.x ** 2
            self.len = N_SAMPLES
            if not train:
                # Use a different seed for the validation noise, then restore the training seed
                torch.manual_seed(1)
                self.y = self.f + noise_std * torch.randn(self.f.size())
                self.y = self.y.view(-1, 1)
                torch.manual_seed(0)
            else:
                self.y = self.f + noise_std * torch.randn(self.f.size())
                self.y = self.y.view(-1, 1)

        # Getter
        def __getitem__(self, index):
            return self.x[index], self.y[index]

        # Get Length
        def __len__(self):
            return self.len

        # Plot the data
        def plot(self):
            plt.figure(figsize=(6.1, 10))
            plt.scatter(self.x.numpy(), self.y.numpy(), label="Samples")
            plt.plot(self.x.numpy(), self.f.numpy(), label="True Function", color='orange')
            plt.xlabel("x")
            plt.ylabel("y")
            plt.xlim((-1, 1))
            plt.ylim((-2, 2.5))
            plt.legend(loc="best")
            plt.show()
    

    Create a dataset object:

    In [ ]:
    # Create the dataset object and plot the dataset
    
    data_set = Data()
    data_set.plot()
    

    Get some validation data:

    In [ ]:
    # Create validation dataset object
    
    validation_set = Data(train=False)
    

    Create the Model, Optimizer, and Total Loss Function (Cost)

    Create a custom module with three linear layers: in_size is the number of input features, n_hidden is the number of neurons in each hidden layer, and out_size is the number of outputs. p is the dropout probability; the default is 0, which means no dropout.

    In [ ]:
    # Create the class for model
    
    class Net(nn.Module):
        
        # Constructor
        def __init__(self, in_size, n_hidden, out_size, p=0):
            super(Net, self).__init__()
            self.drop = nn.Dropout(p=p)
            self.linear1 = nn.Linear(in_size, n_hidden)
            self.linear2 = nn.Linear(n_hidden, n_hidden)
            self.linear3 = nn.Linear(n_hidden, out_size)
            
        def forward(self, x):
            x = F.relu(self.drop(self.linear1(x)))
            x = F.relu(self.drop(self.linear2(x)))
            x = self.linear3(x)
            return x
    
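    In training mode, nn.Dropout zeroes each input element with probability p and rescales the surviving elements by 1/(1 - p); in evaluation mode it acts as the identity. The following minimal sketch (an illustration added here, not part of the lab code) shows the difference on a toy tensor:

    # Illustrative sketch (not part of the lab): dropout in train vs eval mode
    drop_demo = nn.Dropout(p=0.5)
    x_demo = torch.ones(1, 10)

    drop_demo.train()
    print(drop_demo(x_demo))   # roughly half the entries are zeroed, the rest are scaled to 2.0

    drop_demo.eval()
    print(drop_demo(x_demo))   # identity: all entries stay 1.0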

    Create two model objects: model has no dropout, and model_drop has a dropout probability of 0.5:

    In [ ]:
    # Create the model objects
    
    model = Net(1, 300, 1)
    model_drop = Net(1, 300, 1, p=0.5)
    

    Train the Model via Batch Gradient Descent

    Set the model using dropout to training mode; this is the default mode, but it's good practice to set it explicitly.

    In [ ]:
    # Set the model to train mode
    
    model_drop.train()
    

    Train the models using the Adam optimizer. See the unit on other optimizers. Use the mean squared error loss as the criterion:

    In [ ]:
    # Set the optimizer and criterion function
    
    optimizer_ofit = torch.optim.Adam(model.parameters(), lr=0.01)
    optimizer_drop = torch.optim.Adam(model_drop.parameters(), lr=0.01)
    criterion = torch.nn.MSELoss()
    
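    As a quick sanity check (a small illustration, not part of the original lab), MSELoss is simply the mean of the squared differences between predictions and targets:

    # Sanity check: MSELoss equals the mean of squared errors
    yhat_demo = torch.tensor([[1.0], [2.0], [3.0]])
    y_demo = torch.tensor([[1.5], [2.0], [2.0]])
    print(criterion(yhat_demo, y_demo))           # tensor(0.4167)
    print(torch.mean((yhat_demo - y_demo) ** 2))  # same value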

    Initialize a dictionary that stores the training and validation loss for each model:

    In [ ]:
    # Initialize the dict to contain the loss results
    
    LOSS = {}
    LOSS['training data no dropout'] = []
    LOSS['validation data no dropout'] = []
    LOSS['training data dropout'] = []
    LOSS['validation data dropout'] = []
    

    Run 500 iterations of batch gradient descent:

    In [ ]:
    # Train the model
    
    epochs = 500
    
    def train_model(epochs):
        for epoch in range(epochs):
            yhat = model(data_set.x)
            yhat_drop = model_drop(data_set.x)
            loss = criterion(yhat, data_set.y)
            loss_drop = criterion(yhat_drop, data_set.y)
    
            # Store the loss on the training and validation data for both models
            LOSS['training data no dropout'].append(loss.item())
            LOSS['validation data no dropout'].append(criterion(model(validation_set.x), validation_set.y).item())
            LOSS['training data dropout'].append(loss_drop.item())
            model_drop.eval()
            LOSS['validation data dropout'].append(criterion(model_drop(validation_set.x), validation_set.y).item())
            model_drop.train()
    
            optimizer_ofit.zero_grad()
            optimizer_drop.zero_grad()
            loss.backward()
            loss_drop.backward()
            optimizer_ofit.step()
            optimizer_drop.step()
            
    train_model(epochs)
    

    Set the model with dropout to evaluation mode:

    In [ ]:
    # Set the model with dropout to evaluation mode
    
    model_drop.eval()
    
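    Calling eval() sets the training flag to False on the module and all of its submodules, so the dropout layer is disabled and forward passes become deterministic. A quick check (an illustration added here, not part of the lab code):

    # Quick check: eval() propagates to submodules and disables dropout
    print(model_drop.training)       # False
    print(model_drop.drop.training)  # False: dropout now acts as the identity
    print(torch.equal(model_drop(data_set.x), model_drop(data_set.x)))  # True: repeated passes match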

    Make a prediction by using both models:

    In [ ]:
    # Make the prediction
    
    yhat = model(data_set.x)
    yhat_drop = model_drop(data_set.x)
    

    Plot predictions of both models. Compare them to the training points and the true function:

    In [ ]:
    # Plot the predictions for both models
    
    plt.figure(figsize=(6.1, 10))

    plt.scatter(data_set.x.numpy(), data_set.y.numpy(), label="Samples")
    plt.plot(data_set.x.numpy(), data_set.f.numpy(), label="True Function", color='orange')
    plt.plot(data_set.x.numpy(), yhat.detach().numpy(), label='no dropout', c='r')
    plt.plot(data_set.x.numpy(), yhat_drop.detach().numpy(), label="dropout", c='g')

    plt.xlabel("x")
    plt.ylabel("y")
    plt.xlim((-1, 1))
    plt.ylim((-2, 2.5))
    plt.legend(loc="best")
    plt.show()
    

    You can see that the model using dropout does a better job of tracking the function that generated the data.

    Plot the loss for the training and validation data for both models. The log of the loss is used to make the difference more apparent:

    In [ ]:
    # Plot the loss
    
    plt.figure(figsize=(6.1, 10))
    for key, value in LOSS.items():
        plt.plot(np.log(np.array(value)), label=key)
    plt.legend()
    plt.xlabel("iterations")
    plt.ylabel("Log of cost or total loss")
    plt.show()
    

    You can see that the model without dropout performs better on the training data but worse on the validation data, which suggests overfitting. The model with dropout performs worse on the training data but better on the validation data.
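    To put numbers behind this comparison, you can print the last recorded loss for each curve; a minimal sketch using the LOSS dictionary defined above (the exact values depend on the random seed):

    # Print the final training and validation losses for both models
    for key, value in LOSS.items():
        print(f"{key}: {value[-1]:.3f}")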
