Welcome to our new Data Science resource site!

logo
Programming for Data Science
Initialization
Initializing search
    • Home
    • Python
    • Data Collection and Visualization
    • Machine Learning
    • Deep Learning
    • Time Series
    • Maths & Statistics
    • Extras
    • About
    • Home
      • Overview
      • End of Course Exercise
        • Outline
        • Introduction
        • Variables
        • Numbers
        • Strings
        • Operators
        • Containers
        • Flow Control
        • Advanced
        • Modules
        • File Handling
        • End of Course Exercise
        • Filter Function
        • Map Function
        • Reduce Function
        • NumPy Crash Course
        • Pandas Crash Course
        • Matplotlib Crash Course
      • NumPy Crash Course
      • Pandas Crash Course
      • Weather Data
      • Matplotlib Crash Course
      • Data Exploration Exercise
      • Handling Missing Data
      • Overview
      • Training Models
        • Introduction
        • Advanced
        • Feature Selection
        • Why Scaling
        • Feature Scaling (FPL)
        • Normalization and Standardization
      • Handling Missing Data
        • Classification Metrics
        • Regression Metrics
        • Pipelines
        • Hyperparameter Tuning
      • Introduction
        • 1D Tensors
        • 2D Tensors
        • Derivatives & Graphs
        • Simple Datasets
        • Pre-Built Datasets
        • Exercise
        • 1D Regression
        • One Parameter
        • Slope & Bias
        • Exercise
        • SGD
        • Mini-Batch GD
        • PyTorch Way
        • Training & Validation
        • Exercise
        • Multiple LR Prediction
        • Multiple LR Training
        • Multi-Target LR
        • Training Multiple Output
        • Exercise
        • Prediction
        • MSE Issues
        • Cross Entropy
        • Softmax
        • Exercise
        • Custom Datasets
        • DataLoaders
        • Transforms
        • Simple Hidden Layer
        • this is for exercises
        • XOR Problem
        • MNIST
        • Activation Functions
        • MNIST One Layer
        • MNIST Two Layer
        • Multiclass Spiral
        • Dropout Prediction
        • Dropout Regression
        • Initialization
        • Xavier Init
        • He Init
        • Momentum
        • NN with Momentum
        • Batch Normalization
        • Convolution Basics
        • Activation & Pooling
        • Multiple Channels
        • Simple CNN
        • CNN Small Image
        • CNN Batch Processing
      • Introduction
      • Analysis
      • Forecasting
      • Python Example
      • Overview
      • Eigen Values and Vectors
      • Descriptive Statistics
      • Inferential Statistics
      • Statistical Models
      • Hypothesis Testing
      • Customer Analysis
      • How KNN Works
      • Handling Imbalanced Data
      • Classification Metrics
      • License
      • ReadMe

    Initialization with Same Weights

    Objective for this Notebook

    1. Learn hw to Define the Neural Network with Same Weights Initialization define Criterion Function, Optimizer, and Train the Model
    2.Define the Neural Network with defult Weights Initialization define Criterion Function, Optimizer
    3. Train the Model

    Table of Contents

    In this lab, we will see the problem of initializing the weights with the same value. We will see that even for a simple network, our model will not train properly. .

    • Neural Network Module and Training Function
    • Make Some Data
    • Define the Neural Network with Same Weights Initialization define Criterion Function, Optimizer, and Train the Model
    • Define the Neural Network with defult Weights Initialization define Criterion Function, Optimizer, and Train the Model

    Estimated Time Needed: 25 min


    Preparation

    We'll need the following libraries

    In [ ]:
    Copied!
    # Import the libraries we need for this lab
    
    import torch 
    import torch.nn as nn
    from torch import sigmoid
    import matplotlib.pylab as plt
    import numpy as np
    torch.manual_seed(0)
    
    # Import the libraries we need for this lab import torch import torch.nn as nn from torch import sigmoid import matplotlib.pylab as plt import numpy as np torch.manual_seed(0)

    Used for plotting the model

    In [ ]:
    Copied!
    # The function for plotting the model
    
    def PlotStuff(X, Y, model, epoch, leg=True):
        
        plt.plot(X.numpy(), model(X).detach().numpy(), label=('epoch ' + str(epoch)))
        plt.plot(X.numpy(), Y.numpy(), 'r')
        plt.xlabel('x')
        if leg == True:
            plt.legend()
        else:
            pass
    
    # The function for plotting the model def PlotStuff(X, Y, model, epoch, leg=True): plt.plot(X.numpy(), model(X).detach().numpy(), label=('epoch ' + str(epoch))) plt.plot(X.numpy(), Y.numpy(), 'r') plt.xlabel('x') if leg == True: plt.legend() else: pass

    Neural Network Module and Training Function

    Define the activations and the output of the first linear layer as an attribute. Note that this is not good practice.

    In [ ]:
    Copied!
    # Define the class Net
    
    class Net(nn.Module):
        
        # Constructor
        def __init__(self, D_in, H, D_out):
            super(Net, self).__init__()
            # hidden layer 
            self.linear1 = nn.Linear(D_in, H)
            self.linear2 = nn.Linear(H, D_out)
            # Define the first linear layer as an attribute, this is not good practice
            self.a1 = None
            self.l1 = None
            self.l2=None
        
        # Prediction
        def forward(self, x):
            self.l1 = self.linear1(x)
            self.a1 = sigmoid(self.l1)
            self.l2=self.linear2(self.a1)
            yhat = sigmoid(self.linear2(self.a1))
            return yhat
    
    # Define the class Net class Net(nn.Module): # Constructor def __init__(self, D_in, H, D_out): super(Net, self).__init__() # hidden layer self.linear1 = nn.Linear(D_in, H) self.linear2 = nn.Linear(H, D_out) # Define the first linear layer as an attribute, this is not good practice self.a1 = None self.l1 = None self.l2=None # Prediction def forward(self, x): self.l1 = self.linear1(x) self.a1 = sigmoid(self.l1) self.l2=self.linear2(self.a1) yhat = sigmoid(self.linear2(self.a1)) return yhat

    Define the training function:

    In [ ]:
    Copied!
    # Define the training function
    
    def train(Y, X, model, optimizer, criterion, epochs=1000):
        cost = []
        total=0
        for epoch in range(epochs):
            total=0
            for y, x in zip(Y, X):
                yhat = model(x)
                loss = criterion(yhat, y)
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()
                #cumulative loss 
                total+=loss.item() 
            cost.append(total)
            if epoch % 300 == 0:    
                PlotStuff(X, Y, model, epoch, leg=True)
                plt.show()
                model(X)
                plt.scatter(model.a1.detach().numpy()[:, 0], model.a1.detach().numpy()[:, 1], c=Y.numpy().reshape(-1))
                plt.title('activations')
                plt.show()
        return cost
    
    # Define the training function def train(Y, X, model, optimizer, criterion, epochs=1000): cost = [] total=0 for epoch in range(epochs): total=0 for y, x in zip(Y, X): yhat = model(x) loss = criterion(yhat, y) loss.backward() optimizer.step() optimizer.zero_grad() #cumulative loss total+=loss.item() cost.append(total) if epoch % 300 == 0: PlotStuff(X, Y, model, epoch, leg=True) plt.show() model(X) plt.scatter(model.a1.detach().numpy()[:, 0], model.a1.detach().numpy()[:, 1], c=Y.numpy().reshape(-1)) plt.title('activations') plt.show() return cost

    Make Some Data

    In [ ]:
    Copied!
    # Make some data
    
    X = torch.arange(-20, 20, 1).view(-1, 1).type(torch.FloatTensor)
    Y = torch.zeros(X.shape[0])
    Y[(X[:, 0] > -4) & (X[:, 0] < 4)] = 1.0
    
    # Make some data X = torch.arange(-20, 20, 1).view(-1, 1).type(torch.FloatTensor) Y = torch.zeros(X.shape[0]) Y[(X[:, 0] > -4) & (X[:, 0] < 4)] = 1.0

    Define the Neural Network with Same Weights Initialization define, Criterion Function, Optimizer and Train the Model

    Create the Cross-Entropy loss function:

    In [ ]:
    Copied!
    # The loss function
    
    def criterion_cross(outputs, labels):
        out = -1 * torch.mean(labels * torch.log(outputs) + (1 - labels) * torch.log(1 - outputs))
        return out
    
    # The loss function def criterion_cross(outputs, labels): out = -1 * torch.mean(labels * torch.log(outputs) + (1 - labels) * torch.log(1 - outputs)) return out

    Define the Neural Network

    In [ ]:
    Copied!
    # Train the model
    # size of input 
    D_in = 1
    # size of hidden layer 
    H = 2
    # number of outputs 
    D_out = 1
    # learning rate 
    learning_rate = 0.1
    # create the model 
    model = Net(D_in, H, D_out)
    
    # Train the model # size of input D_in = 1 # size of hidden layer H = 2 # number of outputs D_out = 1 # learning rate learning_rate = 0.1 # create the model model = Net(D_in, H, D_out)

    This is the PyTorch default installation

    In [ ]:
    Copied!
    model.state_dict()
    
    model.state_dict()

    Same Weights Initialization with all ones for weights and zeros for the bias.

    In [ ]:
    Copied!
    model.state_dict()['linear1.weight'][0]=1.0
    model.state_dict()['linear1.weight'][1]=1.0
    model.state_dict()['linear1.bias'][0]=0.0
    model.state_dict()['linear1.bias'][1]=0.0
    model.state_dict()['linear2.weight'][0]=1.0
    model.state_dict()['linear2.bias'][0]=0.0
    model.state_dict()
    
    model.state_dict()['linear1.weight'][0]=1.0 model.state_dict()['linear1.weight'][1]=1.0 model.state_dict()['linear1.bias'][0]=0.0 model.state_dict()['linear1.bias'][1]=0.0 model.state_dict()['linear2.weight'][0]=1.0 model.state_dict()['linear2.bias'][0]=0.0 model.state_dict()

    Optimizer, and Train the Model:

    In [ ]:
    Copied!
    #optimizer 
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    #train the model usein
    cost_cross = train(Y, X, model, optimizer, criterion_cross, epochs=1000)
    #plot the loss
    plt.plot(cost_cross)
    plt.xlabel('epoch')
    plt.title('cross entropy loss')
    
    #optimizer optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate) #train the model usein cost_cross = train(Y, X, model, optimizer, criterion_cross, epochs=1000) #plot the loss plt.plot(cost_cross) plt.xlabel('epoch') plt.title('cross entropy loss')

    By examining the output of the paramters all thought they have changed they are identical.

    In [ ]:
    Copied!
    model.state_dict()
    
    model.state_dict()
    In [ ]:
    Copied!
    yhat=model(torch.tensor([[-2.0],[0.0],[2.0]]))
    yhat
    
    yhat=model(torch.tensor([[-2.0],[0.0],[2.0]])) yhat

    Define the Neural Network, Criterion Function, Optimizer and Train the Model

    In [ ]:
    Copied!
    # Train the model
    # size of input 
    D_in = 1
    # size of hidden layer 
    H = 2
    # number of outputs 
    D_out = 1
    # learning rate 
    learning_rate = 0.1
    # create the model 
    model = Net(D_in, H, D_out)
    
    # Train the model # size of input D_in = 1 # size of hidden layer H = 2 # number of outputs D_out = 1 # learning rate learning_rate = 0.1 # create the model model = Net(D_in, H, D_out)

    Repeat the previous steps above by using the MSE cost or total loss:

    In [ ]:
    Copied!
    #optimizer 
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    #train the model usein
    cost_cross = train(Y, X, model, optimizer, criterion_cross, epochs=1000)
    #plot the loss
    plt.plot(cost_cross)
    plt.xlabel('epoch')
    plt.title('cross entropy loss')
    
    #optimizer optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate) #train the model usein cost_cross = train(Y, X, model, optimizer, criterion_cross, epochs=1000) #plot the loss plt.plot(cost_cross) plt.xlabel('epoch') plt.title('cross entropy loss')

    Double-click here for the solution.

    What's on your mind? Put it in the comments!

    June 3, 2025 June 3, 2025

    © 2025 DATAIDEA. All rights reserved. Built with ❤️ by Juma Shafara.