title: Machine Learning Pipelining Quiz
keywords: [machine learning, machine learning classification, machine learning classification metrics, decision trees, python, precision, recall, f1 score, weighted, accuracy, linear regression]
description: Here are some multiple choice and true/false questions on machine learning pipelining
author: Juma Shafara
date: "2024-06"

Here are some multiple choice and true/false questions on machine learning pipelining:
Multiple Choice Questions
- What is the primary purpose of a machine learning pipeline?
- A. To visualize data
- B. To automate the workflow of data processing and model training
- C. To analyze data manually
- D. To store data securely
Reveal answer
B. To automate the workflow of data processing and model training
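As an illustration of what "automating the workflow" can look like, here is a minimal sketch that chains scaling and model training into one scikit-learn Pipeline object. The dataset (iris), the step names "scale" and "model", and the choice of LogisticRegression are assumptions made for the example, not part of the quiz.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data; any tabular classification dataset would do.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One object that runs preprocessing and training in sequence.
pipe = Pipeline(steps=[
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

pipe.fit(X_train, y_train)              # fits the scaler, then the model
accuracy = pipe.score(X_test, y_test)   # scales X_test, then scores
print(f"Test accuracy: {accuracy:.2f}")
```

A single call to fit or score runs every step in order, which is what removes the manual bookkeeping between preprocessing and training.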
- Which of the following steps is typically the first in a machine learning pipeline?
- A. Model evaluation
- B. Data preprocessing
- C. Model deployment
- D. Hyperparameter tuning
Reveal answer
B. Data preprocessing
- In a scikit-learn pipeline, what does the StandardScaler do?
- A. Select features
- B. Standardize features to zero mean and unit variance
- C. Reduce the dimensionality of data
- D. Train the model
Reveal answer
B. Standardize features to zero mean and unit variance
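To make the answer concrete, the sketch below applies StandardScaler to a small made-up array and checks the result. Each column ends up with mean 0 and standard deviation 1; the shape of the distribution itself is not changed. The array values are invented for the example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (made-up values).
X = np.array([
    [1.0, 100.0],
    [2.0, 200.0],
    [3.0, 300.0],
    [4.0, 400.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Per-column statistics after scaling.
col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)
print(col_means, col_stds)
```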
- Which of the following is an advantage of using pipelines?
- A. They make code less readable
- B. They make workflows easier to reproduce
- C. They slow down model training
- D. They increase the risk of data leakage
Reveal answer
B. They make workflows easier to reproduce
- Which step in a machine learning pipeline improves the model by adjusting settings that are chosen before training rather than learned from the data (for example, tree depth or regularization strength)?
- A. Data preprocessing
- B. Model training
- C. Hyperparameter tuning
- D. Model evaluation
Reveal answer
C. Hyperparameter tuning
True or False Questions
- Pipelines in scikit-learn can only include pre-built transformers and estimators.
Reveal answer
False
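The answer is False because a pipeline accepts any object that follows the fit/transform convention. The sketch below defines a hypothetical custom transformer, Log1pTransformer (a name invented for this example), and uses it as a pipeline step ahead of a LinearRegression. The data is synthetic and constructed so that the target is exactly linear in the transformed feature.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline


class Log1pTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical custom step: applies log(1 + x) to every feature."""

    def fit(self, X, y=None):
        # Nothing to learn; return self so the step can be chained.
        return self

    def transform(self, X):
        return np.log1p(X)


# Synthetic data: y is linear in log1p(x) by construction.
X = np.array([[1.0], [10.0], [100.0], [1000.0]])
y = 2.0 * np.log1p(X).ravel() + 1.0

pipe = Pipeline([
    ("log", Log1pTransformer()),
    ("model", LinearRegression()),
])
pipe.fit(X, y)
predictions = pipe.predict(X)
```

For simple stateless functions, scikit-learn's FunctionTransformer is an alternative to writing a class.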
- Using a pipeline ensures that the same data transformations are applied during both training and testing phases.
Reveal answer
True
- You can use GridSearchCV with a pipeline to perform hyperparameter tuning on multiple steps simultaneously.
Reveal answer
True
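A minimal sketch of tuning several steps at once: parameters are addressed as step name, two underscores, then parameter name. The step names ("scale", "pca", "svc"), the dataset, and the grid values are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("svc", SVC()),
])

# "<step>__<param>" targets a parameter of a specific pipeline step.
param_grid = {
    "pca__n_components": [2, 3],
    "svc__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
print(best_params)
```

Because the whole pipeline is refit inside each cross-validation fold, the scaler and PCA are fitted only on that fold's training portion.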
- The steps in a machine learning pipeline must be specified in a particular order.
Reveal answer
True
- A machine learning pipeline can be saved to disk using joblib or pickle in Python.
Reveal answer
True
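As a sketch of persisting a fitted pipeline, the example below writes it to a temporary directory with joblib and loads it back. The file name pipeline.joblib and the model choice are assumptions for the example. Files like this should only be loaded from trusted sources, since unpickling can execute arbitrary code.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)

# Save and reload inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pipeline.joblib")
    joblib.dump(pipe, path)
    restored = joblib.load(path)

# The restored pipeline carries both the fitted scaler and the fitted model.
same_predictions = np.array_equal(pipe.predict(X), restored.predict(X))
print(same_predictions)
```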
- Transformers in a pipeline are fit using the training data and then applied to the test data.
Reveal answer
True
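The sketch below shows this behaviour on a tiny made-up dataset: the scaler inside the pipeline learns its mean and scale from the training rows only, and the test row is transformed with those stored values. The numbers and step names are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data: the test value lies far outside the training range.
X_train = np.array([[0.0], [2.0], [4.0]])
y_train = np.array([0, 0, 1])
X_test = np.array([[10.0]])

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])
pipe.fit(X_train, y_train)

# Statistics learned during fit come from X_train alone.
learned_mean = pipe.named_steps["scale"].mean_
learned_scale = pipe.named_steps["scale"].scale_

# pipe[:-1] is the pipeline without its final estimator, i.e. the transformers.
test_scaled = pipe[:-1].transform(X_test)
print(learned_mean, learned_scale, test_scaled)
```

The test row is never used to compute the mean or scale, so it cannot influence the preprocessing.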
- Model evaluation is typically done before model training in a pipeline.
Reveal answer
False
- A pipeline helps avoid data leakage by fitting transformations on the training data only and then applying them unchanged to the test data.
Reveal answer
True
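One common place where leakage creeps in is cross-validation: scaling the full dataset before splitting lets each validation fold influence the scaler. A minimal sketch of the leak-free pattern is to pass the whole pipeline to cross_val_score, so the scaler is refitted on each fold's training portion. The dataset and model are assumptions for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# The pipeline is cloned and fitted from scratch inside every fold,
# so validation rows never contribute to the scaler's statistics.
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```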
- Pipelines cannot be used for text data processing.
Reveal answer
False
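Pipelines handle text as well: a vectorizer can serve as the first step and turn raw strings into numeric features. The sketch below uses TfidfVectorizer followed by LogisticRegression on four invented sentences; the sentences, labels, and step names are made up for the example and far too small for a real model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy sentiment data (1 = positive, 0 = negative), invented for illustration.
texts = [
    "loved this movie",
    "loved the acting",
    "hated this movie",
    "hated the acting",
]
labels = [1, 1, 0, 0]

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),     # raw strings -> TF-IDF features
    ("model", LogisticRegression()),
])
pipe.fit(texts, labels)

# Words not seen during fit (here "plot") are ignored by the vectorizer.
predictions = pipe.predict(["loved the plot", "hated the plot"])
print(predictions)
```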
- Feature extraction can be included as a step in a machine learning pipeline.
Reveal answer
True