Classification Metrics
title: Classification Metrics Practice
author: Juma Shafara
date: "2024-02"
date-modified: "2024-07-25"
keywords: [machine learning, machine learning classification, machine learning classification metrics, decision trees, python, precision, recall, f1 score, weighted, accuracy, linear regression]
description: Learn Programming for Data Science. Demonstrate loading, preparing, training, and evaluating a machine learning model using the Iris dataset

In this notebook, we'll walk through the process of building and evaluating a decision tree classifier using Scikit-Learn. We'll use the Iris dataset for demonstration and then provide an exercise to apply the same steps to the Wine dataset.
Importing Necessary Libraries
First, we import the necessary libraries for data manipulation and loading the dataset.
- numpy and pandas are imported for data manipulation.
- load_iris from sklearn.datasets is imported to load the Iris dataset.
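A minimal sketch of these imports. The aliases np and pd are the usual convention and are assumed here, not taken from the original notebook.

```python
# Libraries for data manipulation
import numpy as np
import pandas as pd

# Loader for the Iris dataset that ships with Scikit-Learn
from sklearn.datasets import load_iris
```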
Loading the Iris Dataset
The Iris dataset is loaded and stored in the variable iris.
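Loading might look like the sketch below. The two print calls are an added illustration for inspecting what was loaded.

```python
from sklearn.datasets import load_iris

# load_iris() returns a dictionary-like Bunch object
iris = load_iris()

print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species
```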
Displaying Dataset Description
For a better understanding of the dataset, we can uncomment the following line to print the description of the Iris dataset.
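A sketch of that optional step, assuming the dataset is stored in iris as above. The description is available through the DESCR attribute.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Uncomment the next line to print the full dataset description
# print(iris.DESCR)
```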
Extracting Features and Target Variables
- X contains the feature data (sepal length, sepal width, petal length, petal width).
- y contains the target data (class labels: 0, 1, 2).
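The extraction described above can be sketched as follows. The shape printout is an added check, not part of the original steps.

```python
from sklearn.datasets import load_iris

iris = load_iris()

X = iris.data    # feature matrix, one row per flower
y = iris.target  # integer class labels

print(X.shape, y.shape)
```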
Importing Train-Test Split Function
train_test_split is imported to split the data into training and testing sets.
Splitting the Data
The dataset is split into training (70%) and testing (30%) sets.
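A minimal sketch of the split. test_size=0.3 follows from the 70/30 proportions stated above. random_state=42 is an assumption added so that the split is reproducible, as are the variable names.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Hold out 30% of the rows for testing; random_state is an assumed value
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)
```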
Importing Decision Tree Classifier
Next, we import the Decision Tree classifier from Scikit-Learn.
Initializing the Classifier
We create an instance of the Decision Tree classifier.
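The import and initialization could look like this. The variable name classifier and the random_state value are assumptions. Without a fixed random_state, repeated runs may build slightly different trees.

```python
from sklearn.tree import DecisionTreeClassifier

# random_state=42 is an assumed setting that makes tree construction repeatable
classifier = DecisionTreeClassifier(random_state=42)
```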
Training the Classifier
We train the classifier using the training data.
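A self-contained sketch of the training step. It repeats the loading and splitting so that it runs on its own. The names and random_state values are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

classifier = DecisionTreeClassifier(random_state=42)

# fit() learns the tree's splitting rules from the training features and labels
classifier.fit(X_train, y_train)

print(classifier.get_depth())  # depth of the learned tree
```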
Making Predictions
We then make predictions on the test data by calling the model's predict() method.
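The prediction step might be sketched as below, again with assumed variable names (y_pred in particular) and an assumed random_state.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# predict() returns one class label per row of X_test
y_pred = classifier.predict(X_test)

print(y_pred[:10])  # first ten predicted labels
```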
Importing Metrics for Evaluation
To evaluate our model, we import various metrics from Scikit-Learn.
Calculating Accuracy
Accuracy refers to the proportion of correctly predicted instances out of the total instances.
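A sketch of the accuracy calculation using accuracy_score, with the earlier steps repeated so the block runs on its own. The manual comparison at the end is an added illustration of the definition. Variable names and random_state are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Fraction of test samples whose predicted label matches the true label
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")

# The same value computed directly from the definition
manual_accuracy = (y_pred == y_test).mean()
print(f"Manual accuracy: {manual_accuracy:.3f}")
```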
Calculating Precision
Precision is the ratio of correctly predicted positive observations to the total predicted positives.
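Because Iris has three classes, precision_score needs an averaging strategy. Its default, average='binary', only works for two-class targets. The sketch below assumes average='weighted', which averages the per-class precision weighted by each class's number of true samples. Other options such as 'macro' or 'micro' are equally valid choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Per-class precision, averaged with weights proportional to class support
precision = precision_score(y_test, y_pred, average="weighted")
print(f"Precision: {precision:.3f}")
```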
Calculating Recall
Recall is the ratio of correctly predicted positive observations to all the actual positives.
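A matching sketch for recall, again assuming average='weighted' for the multiclass case. With this averaging, the weighted recall works out to the same value as accuracy, because each class's recall is weighted by its share of the test samples.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Per-class recall, averaged with weights proportional to class support
recall = recall_score(y_test, y_pred, average="weighted")
print(f"Recall: {recall:.3f}")

# Accuracy, printed for comparison with the weighted recall
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
```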
Calculating F1 Score
The F1 score is the harmonic mean of precision and recall.
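For a single class, the harmonic mean is 2 * P * R / (P + R), where P is precision and R is recall. A sketch of the calculation with f1_score, assuming average='weighted' to combine the per-class scores:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# One F1 score per class (average=None returns an array)
f1_per_class = f1_score(y_test, y_pred, average=None)
print(f1_per_class)

# Single summary value, weighting each class by its support
f1 = f1_score(y_test, y_pred, average="weighted")
print(f"F1 score: {f1:.3f}")
```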
Displaying the Classification Report
We can print the classification report, which provides precision, recall, F1-score, and support for each class.
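A sketch of the report step. Passing target_names is an optional addition that labels the rows with species names instead of 0, 1, 2. Variable names and random_state values are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Text table with precision, recall, F1-score, and support per class
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print(report)
```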
The results show how well the model performs in classifying the iris species, with metrics providing insights into different aspects of the model's performance.
Exercise:
Perform the steps above using the Wine dataset from Scikit-Learn, which can be loaded with load_wine from sklearn.datasets.