Missing Data
title: Handling Missing Data Quiz
keywords: [Handling Missing Data Quiz, Handling Missing Data]
author: Juma Shafara
date: "2024-03"

Questions:
1. Which of the following is a common method for handling missing data?
- A) Deleting rows with missing values
- B) Using the mean to fill missing values
- C) Using a machine learning algorithm to predict missing values
- D) All of the above
2. What is the term used for removing rows or columns that contain missing data?
- A) Imputation
- B) Deletion
- C) Interpolation
- D) Normalization
3. Which imputation method replaces missing values with the mean, median, or mode?
- A) Random sampling imputation
- B) Regression imputation
- C) Central tendency imputation
- D) K-nearest neighbors imputation
4. What is a potential drawback of deleting rows with missing data?
- A) It is computationally expensive.
- B) It can lead to biased results.
- C) It always improves model accuracy.
- D) It requires complex algorithms.
5. Which technique fills in each missing value several times with values predicted from the other available data, then combines the results?
- A) Listwise deletion
- B) Pairwise deletion
- C) Multiple imputation
- D) Hot deck imputation
6. Which of the following is NOT a method for handling missing data in time series analysis?
- A) Forward fill
- B) Backward fill
- C) Interpolation
- D) Cross-validation
7. In the context of handling missing data, what does 'MCAR' stand for?
- A) Missing Completely at Random
- B) Missing Conditional on Available Rows
- C) Missing Characteristic Attribute Reduction
- D) Missing Completely Available Records
8. Which method is most suitable for handling missing data when the data is 'MAR' (Missing At Random)?
- A) Listwise deletion
- B) Multiple imputation
- C) Mean imputation
- D) Mode imputation
9. Which Python library is widely used for data manipulation and handling missing data?
- A) NumPy
- B) Pandas
- C) SciPy
- D) Matplotlib
10. Which of the following is a disadvantage of using mean imputation?
- A) It is computationally intensive.
- B) It can distort the variance of the data.
- C) It requires labeled data.
- D) It is only applicable to categorical data.

Answers:
1. D) All of the above
2. B) Deletion
3. C) Central tendency imputation
4. B) It can lead to biased results.
5. C) Multiple imputation
6. D) Cross-validation
7. A) Missing Completely at Random
8. B) Multiple imputation
9. B) Pandas
10. B) It can distort the variance of the data.
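
Worked example:
Several of the answers above (deletion, central tendency imputation, forward and backward fill, interpolation, and the variance distortion caused by mean imputation) can be checked with a few lines of pandas. The snippet below is a minimal sketch, not part of the original quiz: the five-value series `s` is a made-up toy dataset, and the variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy series with two missing readings (illustrative data only)
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

# Deletion: drop the entries that contain missing values
dropped = s.dropna()

# Central tendency imputation: replace missing values with the mean
mean_filled = s.fillna(s.mean())

# Time-series style fills
ffilled = s.ffill()             # forward fill: carry the last observed value forward
bfilled = s.bfill()             # backward fill: pull the next observed value backward
interpolated = s.interpolate()  # linear interpolation between observed neighbours

# Mean imputation adds points exactly at the mean, so the sample variance shrinks
var_observed = dropped.var()
var_mean_filled = mean_filled.var()

print("dropped:      ", dropped.tolist())
print("mean filled:  ", mean_filled.tolist())
print("forward fill: ", ffilled.tolist())
print("backward fill:", bfilled.tolist())
print("interpolated: ", interpolated.tolist())
print("variance, observed vs mean-filled:", var_observed, var_mean_filled)
```

On this toy series the mean-filled variance is lower than the variance of the observed values alone, which is the distortion question 10 refers to. Multiple imputation (questions 5 and 8) is not shown here, since it needs a modelling step beyond these basic pandas calls.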