title: Prebuilt Datasets and Transforms
author: Juma Shafara
date: "2023-09"
date-modified: "2024-07-30"
description: In this lab, you will use a prebuilt dataset and then use some prebuilt dataset transforms.
keywords: []

Prebuilt Datasets and Transforms
Objective
- How to use the prebuilt MNIST dataset in PyTorch.
In this lab, you will use a prebuilt dataset and then use some prebuilt dataset transforms.
Estimated Time Needed: 10 min
Preparation
The following are the libraries we are going to use for this lab. torch.manual_seed() seeds PyTorch's random number generator so that random operations produce the same values every time the code is rerun.
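The import cell is not present in this copy of the lab, so the following is a minimal sketch of what it plausibly contains, inferred from the names used later (torch and plt). The variables first_draw and second_draw are illustrative additions that show what seeding does.

```python
# Libraries used throughout the lab
import torch
import matplotlib.pyplot as plt

# Seed PyTorch's random number generator so results are reproducible
torch.manual_seed(0)
first_draw = torch.rand(3)

# Re-seeding with the same value replays the same sequence of random numbers
torch.manual_seed(0)
second_draw = torch.rand(3)

print(torch.equal(first_draw, second_draw))  # True
```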
This is the function for displaying images.
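The function body is missing from this copy, so here is a hedged reconstruction based on how show_data() is called later in the lab. The shape parameter and its default of (28, 28) are assumptions; blank_sample is a synthetic stand-in so the sketch runs without downloading anything.

```python
import torch
import matplotlib.pyplot as plt


def show_data(data_sample, shape=(28, 28)):
    """Display one (image, label) sample as a grayscale picture."""
    image, label = data_sample
    # Drop the channel dimension by reshaping to (height, width) for imshow
    plt.imshow(image.numpy().reshape(shape), cmap="gray")
    # Put the label in the title so image and label can be compared at a glance
    plt.title("y = " + str(label))


# Synthetic sample: an all-black 28 x 28 image labelled 0
blank_sample = (torch.zeros(1, 28, 28), 0)
show_data(blank_sample)
```

In a notebook the figure is displayed automatically; in a plain script, call plt.show() afterwards. For cropped images, pass the new size, for example shape=(20, 20).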
Prebuilt Datasets
You will focus on the following libraries:
We can import a prebuilt dataset; in this case, MNIST. The dsets.MNIST constructor takes parameters such as root, train, download and transform. You'll work with several of these parameters later by passing a transform object as the transform argument.
Each element of the dataset object contains a tuple. Let us see whether the first element in the dataset is a tuple and what is in it.
# Examine whether the elements in dataset MNIST are tuples, and what is in the tuple?
print("Type of the first element: ", type(dataset[0]))
print("The length of the tuple: ", len(dataset[0]))
print("The shape of the first element in the tuple: ", dataset[0][0].shape)
print("The type of the first element in the tuple: ", type(dataset[0][0]))
print("The second element in the tuple: ", dataset[0][1])
print("The type of the second element in the tuple: ", type(dataset[0][1]))
print("As a result, the structure of the first element in the dataset is (image tensor of shape [1, 28, 28], label 7).")
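As shown in the output, the first element in the tuple is a 3-D tensor of shape (1, 28, 28). The leading dimension of size 1 is the channel dimension (MNIST images are grayscale), so the tensor is effectively a 28 x 28 matrix.
The second element in the tuple is the label, which indicates the digit the image shows. Depending on the torchvision version, it is either a plain Python int (recent versions) or a tensor (older versions); the type printed above tells you which. As the label of the first sample is 7, the image should show a handwritten 7.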
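To make the point about the size-1 dimension concrete, here is a small sketch that uses a synthetic tensor in place of dataset[0][0]. The names image and matrix are illustrative.

```python
import torch

# Stand-in for dataset[0][0]: one grayscale image stored as (channels, height, width)
image = torch.zeros(1, 28, 28)

# Dropping the size-1 channel dimension leaves a plain 28 x 28 matrix
matrix = image.squeeze(0)

print(image.shape)   # torch.Size([1, 28, 28])
print(matrix.shape)  # torch.Size([28, 28])
```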
Let us plot the first element in the dataset with show_data(dataset[0]):
As we can see, it is a 7.
Plot the second sample with show_data(dataset[1]):
Torchvision Transforms
We can apply some image transform functions on the MNIST dataset.
As an example, the images in the MNIST dataset can be cropped and converted to a tensor. We can use transforms.Compose, which we learned in the previous lab, to combine the two transform functions.
# Combine two transforms: crop and convert to tensor. Apply the compose to MNIST dataset
croptensor_data_transform = transforms.Compose([transforms.CenterCrop(20), transforms.ToTensor()])
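# train = False selects the test split, whose first image is the 7 discussed in this lab
dataset = dsets.MNIST(root = './data', train = False, download = True, transform = croptensor_data_transform)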
print("The shape of the first element in the first tuple: ", dataset[0][0].shape)
We can see the image is now 20 x 20 instead of 28 x 28.
Let us plot the first image again with show_data(dataset[0], shape = (20, 20)). Notice that the black space around the 7 becomes less apparent.
In the example below, we flip the image horizontally and then convert it to a tensor, using transforms.Compose() to combine the two transform functions. Setting p = 1 makes the flip happen every time. Finally, we plot the flipped image.
# Construct the compose. Apply it on MNIST dataset. Plot the image out.
fliptensor_data_transform = transforms.Compose([transforms.RandomHorizontalFlip(p = 1),transforms.ToTensor()])
# train = False selects the test split, whose second image is a 2
dataset = dsets.MNIST(root = './data', train = False, download = True, transform = fliptensor_data_transform)
show_data(dataset[1])
Practice
Combine RandomVerticalFlip (which flips the image vertically) with the horizontal flip and ToTensor in a single transforms.Compose. Apply the composed transform to the MNIST dataset, then use show_data() to plot the second image (the digit 2).
# Practice: Combine vertical flip, horizontal flip and convert to tensor as a compose. Apply the compose to the MNIST dataset, then plot the second image
# p = 1 makes both flips happen every time instead of with probability 0.5
flip_both_data_transform = transforms.Compose([transforms.RandomVerticalFlip(p = 1), transforms.RandomHorizontalFlip(p = 1), transforms.ToTensor()])
dataset = dsets.MNIST(root = './data', train = False, download = True, transform = flip_both_data_transform)
show_data(dataset[1])
# Type your code here