Handling Missing Data Quiz
Questions:
1. Which of the following is a common method for handling missing data?
- Deleting rows with missing values
- Using the mean to fill missing values
- Using a machine learning algorithm to predict missing values
- All of the above
2. What is the term used for removing rows or columns that contain missing data?
- Imputation
- Deletion
- Interpolation
- Normalization
3. Which imputation method replaces missing values with the mean, median, or mode?
- Random sampling imputation
- Regression imputation
- Central tendency imputation
- K-nearest neighbors imputation
4. What is a potential drawback of deleting rows with missing data?
- It is computationally expensive.
- It can lead to biased results.
- It always improves model accuracy.
- It requires complex algorithms.
5. Which technique predicts missing values from the other available data, generating several plausible completed datasets?
- Listwise deletion
- Pairwise deletion
- Multiple imputation
- Hot deck imputation
6. Which of the following is NOT a method for handling missing data in time series analysis?
- Forward fill
- Backward fill
- Interpolation
- Cross-validation
7. In the context of handling missing data, what does ‘MCAR’ stand for?
- Missing Completely at Random
- Missing Conditional on Available Rows
- Missing Characteristic Attribute Reduction
- Missing Completely Available Records
8. Which method is most suitable for handling missing data when the data is ‘MAR’ (Missing At Random)?
- Listwise deletion
- Multiple imputation
- Mean imputation
- Mode imputation
9. Which Python library is widely used for data manipulation and handling missing data?
- NumPy
- Pandas
- SciPy
- Matplotlib
10. Which of the following is a disadvantage of using mean imputation?
- It is computationally intensive.
- It can distort the variance of the data.
- It requires labeled data.
- It is only applicable to categorical data.
Answers:
1. All of the above
2. Deletion
3. Central tendency imputation
4. It can lead to biased results.
5. Multiple imputation
6. Cross-validation
7. Missing Completely at Random
8. Multiple imputation
9. Pandas
10. It can distort the variance of the data.
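
Several of the answers above can be tried out directly in Pandas. The following is a minimal sketch, not a recommended workflow: the toy DataFrame, its column names (`age`, `income`), and all values are made up for illustration. It shows deletion, central tendency (mean) imputation, the variance shrinkage that mean imputation causes, and the three time series fills named in question 6. Multiple imputation is not shown, since it needs a model-based tool beyond basic Pandas calls.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names and values are invented for this sketch.
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 45.0, np.nan],
    "income": [40.0, 50.0, np.nan, 70.0, 80.0],
})

# Deletion (listwise): drop every row that has at least one missing value.
dropped = df.dropna()

# Central tendency imputation: fill each column with its own mean.
mean_filled = df.fillna(df.mean())

# Mean imputation shrinks the spread of a column.
var_before = df["age"].var()          # uses the observed values only
var_after = mean_filled["age"].var()  # imputed values sit exactly at the mean

# Time series fills: forward fill, backward fill, linear interpolation.
s = pd.Series([1.0, np.nan, np.nan, 4.0])
forward = s.ffill()             # carry the last observed value forward
backward = s.bfill()            # carry the next observed value backward
interpolated = s.interpolate()  # straight line between the neighbors

print(len(df), "rows before deletion,", len(dropped), "rows after")
print("variance of age before:", var_before, "after:", var_after)
```

In this toy example, deletion keeps only 2 of the 5 rows, and the variance of `age` drops from 100.0 to 50.0 after mean imputation, which illustrates the drawbacks named in questions 4 and 10.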