Data Exploration and Cleaning Exercise
- Load the demo.xlsx dataset
- Rename the columns as shown in the table below
Old name | New name |
---|---|
Age | age |
Gender | gender |
Marital Status | marital_status |
Address | address |
Income | income |
Income Category | income_category |
Job Category | job_category |
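
The loading and renaming steps could be approached along the following lines. This is only a sketch, assuming pandas is used: the helper name `rename_demo_columns` and the constant `COLUMN_RENAMES` are made up for illustration, and the small `sample` frame is invented data standing in for demo.xlsx so that the snippet runs without the file.

```python
import pandas as pd

# Mapping copied from the table above: original header -> new name.
COLUMN_RENAMES = {
    "Age": "age",
    "Gender": "gender",
    "Marital Status": "marital_status",
    "Address": "address",
    "Income": "income",
    "Income Category": "income_category",
    "Job Category": "job_category",
}


def rename_demo_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the columns renamed as in the table."""
    return df.rename(columns=COLUMN_RENAMES)


# With the real file, loading would look like this
# (reading .xlsx files requires the openpyxl package):
# df = rename_demo_columns(pd.read_excel("demo.xlsx"))

# Invented two-row frame standing in for demo.xlsx (illustration only).
sample = pd.DataFrame(
    {
        "Age": [30, 45],
        "Gender": ["m", "f"],
        "Marital Status": ["single", "married"],
        "Address": [2, 10],
        "Income": [40.0, 72.0],
        "Income Category": [2, 3],
        "Job Category": [1, 3],
    }
)
renamed = rename_demo_columns(sample)
print(list(renamed.columns))
```

Keeping the mapping in one dictionary makes it easy to check against the table. Headers that do not appear in the dictionary are left unchanged by `rename`.
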
- Display all the columns in the dataset
- Display some basic statistics about the numeric variables in the dataset
- Display some basic statistics about the categorical variables in the dataset
- What are the unique values in the gender column?
- Fix any problems you observe in the gender column, and briefly explain what you changed and why
- How many observations have 'no answer' for marital status?
- Write code that returns only the numeric variables from the dataset
- Are there any missing values in the dataset?
- Are there any outliers in the income variable?
- Investigate the relationship between age and income
- How many people earn more than 300 units?
- What data type is the marital status?
- Create dummy variables for gender
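
The exploration tasks above could be sketched with pandas roughly as follows. Everything here is an assumption made for illustration: the six rows are invented and do not come from demo.xlsx, the inconsistent gender spellings are a guess at the kind of problem the exercise hints at, and the 1.5 × IQR rule is just one common convention for flagging outliers, not the required method.

```python
import pandas as pd

# Invented rows standing in for demo.xlsx (illustration only).
df = pd.DataFrame(
    {
        "age": [25, 40, 31, 58, 46, 37],
        "gender": ["m", "f", " F", "Male", "f", None],
        "marital_status": ["married", "no answer", "single",
                           "married", "no answer", "single"],
        "income": [40.0, 72.0, 55.0, 410.0, 90.0, 61.0],
    }
)

# Column names and basic statistics.
print(df.columns.tolist())
numeric_summary = df.describe()                          # numeric columns
categorical_summary = df.describe(exclude=["number"])    # non-numeric columns

# Unique gender values, then a cleaned copy: trim whitespace, lowercase,
# and collapse the spelled-out labels into single letters.
raw_genders = df["gender"].dropna().unique()
df["gender_clean"] = (
    df["gender"].str.strip().str.lower().replace({"male": "m", "female": "f"})
)

# Number of rows answering 'no answer' for marital status.
no_answer_count = int((df["marital_status"] == "no answer").sum())

# Numeric variables only.
numeric_only = df.select_dtypes(include="number")

# Missing values per column.
missing_counts = df.isna().sum()

# Income outliers: values beyond 1.5 * IQR from the quartiles (assumed rule).
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df.loc[
    (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr), "income"
]

# Relationship between age and income (Pearson correlation).
age_income_corr = df["age"].corr(df["income"])

# People earning more than 300 units.
high_earners = int((df["income"] > 300).sum())

# Data type of marital status.
print(df["marital_status"].dtype)

# Dummy variables for the cleaned gender column.
gender_dummies = pd.get_dummies(df["gender_clean"], prefix="gender")
```

A scatter plot of age against income would complement the correlation coefficient, since a single extreme income can dominate the number. Rows with a missing gender get no indicator set in `gender_dummies`, so whether to drop or impute them is a separate decision.
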