author: Juma Shafara
date: "2024-08-08"
title: Convolution Neural Networks
keywords: [Training Two Parameter, Mini-Batch Gradient Decent, Training Two Parameter Mini-Batch Gradient Decent]
description: In this lab, you will review how to make a prediction in several different ways by using PyTorch.

Photo by DATAIDEA

Objective for this Notebook

Learn about Convolution.

Learn how to determine the size of the output.

Learn about stride and zero padding.

\[\text{linear equation:} \quad y=wx+b\]

\[\text{linear equation with multiple variables, where } \mathbf{x} \text{ is a vector:} \quad \mathbf{y}=\mathbf{wx}+b\]

\[\text{matrix multiplication, where } \mathbf{X} \text{ is a matrix:} \quad \mathbf{y}=\mathbf{wX}+\mathbf{b}\]

\[\text{convolution, where } \mathbf{X} \text{ and } \mathbf{Y} \text{ are tensors:} \quad \mathbf{Y}=\mathbf{w}*\mathbf{X}+\mathbf{b}\]

In convolution, the parameter w is called a kernel. You can perform convolution on images, where the image plays the role of X and the kernel w is the learnable parameter.

Create a two-dimensional convolution object by using the constructor Conv2d. In this section, in_channels and out_channels are both set to 1, and kernel_size is set to 3.

import torch
import torch.nn as nn
conv = nn.Conv2d(in_channels=1, out_channels=1,kernel_size=3)
conv

Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1))

The parameters of nn.Conv2d are randomly initialized and are normally learned through training. To make the results in this lab predictable, set them to known values:

conv.state_dict()['weight'][0][0]=torch.tensor([[1.0,0,-1.0],[2.0,0,-2.0],[1.0,0.0,-1.0]])
conv.state_dict()['bias'][0]=0.0
conv.state_dict()

OrderedDict([('weight',
              tensor([[[[ 1.,  0., -1.],
                        [ 2.,  0., -2.],
                        [ 1.,  0., -1.]]]])),
             ('bias', tensor([0.]))])
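As a hedged alternative to writing through state_dict(), the same values can be copied into the layer's weight and bias tensors inside a torch.no_grad() block. This is a minimal sketch, assuming a recent PyTorch release; the name conv_alt is hypothetical and is not part of the lab.

```python
import torch
import torch.nn as nn

# Hypothetical second layer, so the lab's `conv` object is left untouched.
conv_alt = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

# The weight shape is (out_channels, in_channels, kernel_rows, kernel_columns).
kernel = torch.tensor([[[[1.0, 0.0, -1.0],
                         [2.0, 0.0, -2.0],
                         [1.0, 0.0, -1.0]]]])

# no_grad() keeps these in-place writes out of the autograd graph.
with torch.no_grad():
    conv_alt.weight.copy_(kernel)
    conv_alt.bias.zero_()

print(conv_alt.weight.shape)
print(conv_alt.bias)
```

Both approaches change the parameters in place; this one makes the four-dimensional shape of the weight tensor explicit.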

Create a dummy tensor to represent an image. The shape of the image is (1,1,5,5) where:

(number of samples in the batch, number of channels, number of rows, number of columns)

Set the third column to 1:

image=torch.zeros(1,1,5,5)
image[0,0,:,2]=1
image

tensor([[[[0., 0., 1., 0., 0.],
          [0., 0., 1., 0., 0.],
          [0., 0., 1., 0., 0.],
          [0., 0., 1., 0., 0.],
          [0., 0., 1., 0., 0.]]]])

Call the object conv on the tensor image as an input to perform the convolution and assign the result to the tensor z.

z=conv(image)
z

tensor([[[[-4.,  0.,  4.],
          [-4.,  0.,  4.],
          [-4.,  0.,  4.]]]], grad_fn=<ConvolutionBackward0>)

The process works as follows: the kernel is overlaid on a region of the image, and each kernel element is multiplied by the image element beneath it. The products are then added together to give one output value. The kernel is then shifted and the process is repeated.

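The multiply, sum, and shift procedure described above can be sketched in plain Python, without PyTorch, to check the values of z by hand. The function name conv2d_valid and the list-of-lists representation are assumptions made for illustration. Like nn.Conv2d, the sketch does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) with stride 1 and no padding."""
    m_rows, m_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for i in range(m_rows - k_rows + 1):          # vertical shifts
        row = []
        for j in range(m_cols - k_cols + 1):      # horizontal shifts
            total = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    # Element-wise product of the kernel and the region under it.
                    total += kernel[a][b] * image[i + a][j + b]
            row.append(total)
        out.append(row)
    return out


# Same data as the lab: a 5x5 image whose third column is 1.
image_rows = [[0.0, 0.0, 1.0, 0.0, 0.0] for _ in range(5)]
kernel_rows = [[1.0, 0.0, -1.0],
               [2.0, 0.0, -2.0],
               [1.0, 0.0, -1.0]]

z_manual = conv2d_valid(image_rows, kernel_rows)
for row in z_manual:
    print(row)
```

Each printed row should match the corresponding row of z, since the bias was set to zero.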

Determining the Size of the Output

The size of the output is an important parameter. In this lab, you will assume square images. For rectangular images, the same formula can be applied to each dimension independently.

Let M be the size of the input and K be the size of the kernel. The size of the output is given by the following formula:

\[M_{new}=M-K+1\]

Create a kernel of size 2:

K=2
conv1 = nn.Conv2d(in_channels=1, out_channels=1,kernel_size=K)
conv1.state_dict()['weight'][0][0]=torch.tensor([[1.0,1.0],[1.0,1.0]])
conv1.state_dict()['bias'][0]=0.0
conv1.state_dict()
conv1

Conv2d(1, 1, kernel_size=(2, 2), stride=(1, 1))

Create an image of size 4:

M=4
image1=torch.ones(1,1,M,M)

The following equation provides the output:

\[M_{new}=M-K+1\]

\[M_{new}=4-2+1\]

\[M_{new}=3\]

The first placement of the kernel over the image produces one output. Because the kernel is of size K, it can shift M-K more times in the horizontal direction, which gives M-K+1 outputs per row. The same logic applies to the vertical direction.

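The counting argument above can be checked by listing every valid position of the kernel's left edge. This is a small sketch in plain Python; the variable name start_columns is an assumption for illustration.

```python
M = 4  # image size used in the lab
K = 2  # kernel size used in the lab

# A placement is valid when the kernel's right edge stays inside the image.
start_columns = [start for start in range(M) if start + K <= M]

print(start_columns)        # left-edge column of each placement
print(len(start_columns))   # number of outputs per row
print(M - K + 1)            # value given by the formula
```

The first placement plus the M-K shifts account for all of the entries in the list.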

Perform the convolution and verify the size is correct:

z1=conv1(image1)
print("z1:",z1)
print("shape:",z1.shape[2:4])

z1: tensor([[[[4., 4., 4.],
          [4., 4., 4.],
          [4., 4., 4.]]]], grad_fn=<ConvolutionBackward0>)
shape: torch.Size([3, 3])

Stride parameter

The parameter stride sets how many elements the kernel shifts at each step. As a result, the output size also changes and is given by the following formula:

\[M_{new}=\dfrac{M-K}{stride}+1\]

Create a convolution object with a stride of 2:

conv3 = nn.Conv2d(in_channels=1, out_channels=1,kernel_size=2,stride=2)

conv3.state_dict()['weight'][0][0]=torch.tensor([[1.0,1.0],[1.0,1.0]])
conv3.state_dict()['bias'][0]=0.0
conv3.state_dict()

OrderedDict([('weight',
              tensor([[[[1., 1.],
                        [1., 1.]]]])),
             ('bias', tensor([0.]))])

For an image with a size of 4, calculate the output size:

\[M_{new}=\dfrac{M-K}{stride}+1\]

\[M_{new}=\dfrac{4-2}{2}+1\]

\[M_{new}=2\]

The first placement of the kernel over the image produces one output. Because the kernel is of size K=2, there are M-K=2 remaining elements to shift across. With a stride of 2, the kernel moves 2 elements at a time, so you divide M-K by the stride to count the additional outputs, and then add 1 for the first placement.

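The effect of the stride can be sketched by stepping the kernel's left edge across the image in increments of the stride. The helper name kernel_starts is hypothetical and is used here only for illustration.

```python
def kernel_starts(m, k, stride):
    """Left-edge positions of a size-k kernel on a size-m image for a given stride."""
    # The last valid start is m - k, so range() stops before m - k + 1.
    return list(range(0, m - k + 1, stride))


starts_stride_1 = kernel_starts(4, 2, 1)
starts_stride_2 = kernel_starts(4, 2, 2)

print(starts_stride_1, len(starts_stride_1))
print(starts_stride_2, len(starts_stride_2))
```

With a stride of 2 the kernel skips every other position, so the number of placements per row drops from 3 to 2.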

Perform the convolution and verify the size is correct:

z3=conv3(image1)

print("z3:",z3)
print("shape:",z3.shape[2:4])

z3: tensor([[[[4., 4.],
          [4., 4.]]]], grad_fn=<ConvolutionBackward0>)
shape: torch.Size([2, 2])

Zero Padding

As you apply successive convolutions, the image will shrink. You can apply zero padding to keep the image at a reasonable size; padding also helps preserve information at the borders.

In addition, the formula might not give an integer value for the size of the output. Consider the following image:

image1

tensor([[[[1., 1., 1., 1.],
          [1., 1., 1., 1.],
          [1., 1., 1., 1.],
          [1., 1., 1., 1.]]]])

Try performing convolutions with kernel_size=2 and stride=3. The formula gives a non-integer value, which PyTorch rounds down, so the output has size 1:

\[M_{new}=\dfrac{M-K}{stride}+1\]

\[M_{new}=\dfrac{4-2}{3}+1\]

\[M_{new}=1.666\ldots\]

conv4 = nn.Conv2d(in_channels=1, out_channels=1,kernel_size=2,stride=3)
conv4.state_dict()['weight'][0][0]=torch.tensor([[1.0,1.0],[1.0,1.0]])
conv4.state_dict()['bias'][0]=0.0
conv4.state_dict()
z4=conv4(image1)
print("z4:",z4)
print("z4:",z4.shape[2:4])

z4: tensor([[[[4.]]]], grad_fn=<ConvolutionBackward0>)
z4: torch.Size([1, 1])
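A short sketch in plain Python shows why the output has a single element here: after the first placement, the next shift of 3 would push the kernel past the edge of the image, so part of the image is never covered. The variable names are assumptions for illustration.

```python
M, K, STRIDE = 4, 2, 3

# Valid left-edge positions of the kernel.
starts = list(range(0, M - K + 1, STRIDE))

# Columns touched by at least one placement, and columns never touched.
covered = sorted({start + offset for start in starts for offset in range(K)})
ignored = [col for col in range(M) if col not in covered]

# Rounding the formula down gives the size that was observed for z4.
floor_size = (M - K) // STRIDE + 1

print("starts:", starts)
print("covered columns:", covered)
print("ignored columns:", ignored)
print("output size:", floor_size)
```

The same holds for the rows, which is one motivation for padding the image.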

You can add rows and columns of zeros around the image. This is called padding. In the constructor Conv2d, you specify the number of rows or columns of zeros that you want to add with the parameter padding.

For a square image with padding=1, one column of zeros is added before the first column and one after the last column; the same is done for the rows. As a result, the width and height become the original size plus 2 × the number of padding elements specified. You can then determine the size of the output after subsequent operations, as shown in the following equations, where you determine the size of an image after padding and then apply a convolution kernel of size K with a stride of 1.

\[M'=M+2 \times padding\]

\[M_{new}=M'-K+1\]
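The size formulas in this lab can be collected into one helper. This is a minimal sketch, assuming a dilation of 1 and rounding down as nn.Conv2d does; the function name conv_output_size is hypothetical.

```python
def conv_output_size(m, k, stride=1, padding=0):
    """Output size along one dimension for input size m and kernel size k."""
    padded = m + 2 * padding            # M' = M + 2 * padding
    return (padded - k) // stride + 1   # floor((M' - K) / stride) + 1


# The configurations used in this lab.
print(conv_output_size(5, 3))                       # conv on the 5x5 image
print(conv_output_size(4, 2))                       # conv1
print(conv_output_size(4, 2, stride=2))             # conv3
print(conv_output_size(4, 2, stride=3))             # conv4
print(conv_output_size(4, 2, stride=3, padding=1))  # conv5
```

For rectangular images the helper would be called once per dimension.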

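Consider the following example, which combines padding=1 with a stride of 3. Here M'=4+2=6, and with the stride the output size is (6-2)/3+1=2.33, which rounds down to 2: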

conv5 = nn.Conv2d(in_channels=1, out_channels=1,kernel_size=2,stride=3,padding=1)

conv5.state_dict()['weight'][0][0]=torch.tensor([[1.0,1.0],[1.0,1.0]])
conv5.state_dict()['bias'][0]=0.0
conv5.state_dict()
z5=conv5(image1)
print("z5:",z5)
print("z5:",z5.shape[2:4])

z5: tensor([[[[1., 2.],
          [2., 4.]]]], grad_fn=<ConvolutionBackward0>)
z5: torch.Size([2, 2])
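The values in z5 can be reproduced by hand: pad the 4x4 image of ones with a border of zeros, then sum each 2x2 window while stepping 3 elements at a time. The following plain-Python sketch assumes an all-ones kernel and zero bias, as in the example; zero_pad and window_sums are hypothetical helper names.

```python
def zero_pad(image, padding):
    """Surround a list-of-lists image with `padding` rows and columns of zeros."""
    width = len(image[0]) + 2 * padding
    top = [[0.0] * width for _ in range(padding)]
    bottom = [[0.0] * width for _ in range(padding)]
    middle = [[0.0] * padding + list(row) + [0.0] * padding for row in image]
    return top + middle + bottom


def window_sums(image, k, stride):
    """Apply an all-ones k x k kernel to a square image with the given stride."""
    size = (len(image) - k) // stride + 1
    return [
        [
            sum(image[i * stride + a][j * stride + b]
                for a in range(k) for b in range(k))
            for j in range(size)
        ]
        for i in range(size)
    ]


ones_image = [[1.0] * 4 for _ in range(4)]
padded_image = zero_pad(ones_image, 1)     # 6x6 after padding
z5_manual = window_sums(padded_image, 2, 3)

for row in padded_image:
    print(row)
print(z5_manual)
```

The corner window overlaps only one image element, the edge windows overlap two, and the interior window overlaps four.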

Practice Question

A kernel of zeros with a kernel size=3 is applied to the following image:

Image=torch.randn((1,1,4,4))
Image

tensor([[[[-0.4460, -0.1425,  1.0888,  0.8292],
          [ 1.0301, -0.4119, -1.0132, -0.4925],
          [-1.1662, -0.5480,  1.7078,  0.0230],
          [-0.1644,  1.8086, -1.1509, -0.2585]]]])

Question: Without running the convolution, determine the value of each element of the output:

Question: Use the following convolution object to perform convolution on the tensor Image:

conv = nn.Conv2d(in_channels=1, out_channels=1,kernel_size=3)
conv.state_dict()['weight'][0][0]=torch.tensor([[0,0,0],[0,0,0],[0,0.0,0]])
conv.state_dict()['bias'][0]=0.0

Question: You have an image of size 4. The parameters are as follows: kernel_size=2, stride=2. What is the size of the output?

