Regression Metrics
title: Regression Metrics
keywords: [Regression Metrics, Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R-Squared (Coefficient of Determination)]
description: Learn several metrics to evaluate the performance of regression models
author: Juma Shafara
date: "2024-03"

In regression tasks, the goal is to predict continuous numerical values. Scikit-learn provides several metrics to evaluate the performance of regression models. In this notebook, we will look at the following:
- Mean Absolute Error
- Mean Squared Error
- Root Mean Squared Error
- R-Squared (Coefficient of Determination)
Mean Absolute Error (MAE):
- MAE measures the average absolute difference between predicted values and actual values.
- Imagine you're trying to hit a target with darts. The MAE is like calculating the average distance between where your darts hit and the bullseye. You sum up how far each dart landed from the center (without caring in which direction it missed) and then divide by the number of darts. The smaller the MAE, the closer your predictions are to the actual values.
- Formula: \(\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{\text{true},i} - y_{\text{pred},i}|\)
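The MAE formula can be sketched in a few lines of plain Python. This is a minimal illustration, not scikit-learn's implementation: the helper name `mae` and the sample values are made up for this example. In practice, scikit-learn offers `mean_absolute_error` in `sklearn.metrics` for the same calculation.

```python
def mae(y_true, y_pred):
    """Mean absolute error of two equal-length sequences (hypothetical helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Average the absolute differences, ignoring the direction of each miss.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example values, not taken from a real dataset.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mae_value = mae(y_true, y_pred)
print(mae_value)  # 0.5
```

The absolute errors here are 0.5, 0.5, 0 and 1, so the predictions are off by 0.5 units on average.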
Mean Squared Error (MSE):
- MSE measures the average of the squares of the errors between predicted values and actual values.
- This is similar to MAE, but instead of just adding up the distances, you square them before averaging. Squaring makes bigger differences more noticeable (by making them even bigger), so MSE penalizes larger errors more than smaller ones.
- Formula: \(\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{\text{true},i} - y_{\text{pred},i})^2\)
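To see how squaring penalizes large errors, the sketch below compares two sets of made-up predictions that have the same MAE but very different MSE. The helper names `mae` and `mse` and all values are assumptions chosen for illustration.

```python
def mae(y_true, y_pred):
    """Mean absolute error (hypothetical helper, for comparison only)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error of two equal-length sequences (hypothetical helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Square each error before averaging, so large misses count for more.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values: both prediction sets have a total absolute error of 4.
y_true = [10.0, 10.0, 10.0, 10.0]
even_pred = [11.0, 11.0, 11.0, 11.0]   # four errors of 1
spiky_pred = [10.0, 10.0, 10.0, 14.0]  # one error of 4

print(mae(y_true, even_pred), mae(y_true, spiky_pred))  # 1.0 1.0
print(mse(y_true, even_pred), mse(y_true, spiky_pred))  # 1.0 4.0
```

MAE cannot tell the two prediction sets apart, while MSE is four times larger for the set with the single big miss.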
Root Mean Squared Error (RMSE):
- RMSE is the square root of the MSE.
- Taking the square root brings the error back to the same units as the target variable, which makes it easier to interpret than MSE. RMSE gives you an idea of the typical size of your errors, while still weighting large errors more heavily than MAE does.
- Formula: \(\text{RMSE} = \sqrt{\text{MSE}}\)
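As a rough sketch, RMSE can be computed by taking the square root of a hand-written MSE. The helper names `mse` and `rmse` and the sample values are made up for this example.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error (hypothetical helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Made-up example values, not taken from a real dataset.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse_value = mse(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
print(mse_value)             # 0.375 (in squared units of the target)
print(round(rmse_value, 4))  # 0.6124 (in the same units as the target)
```

If the target were measured in dollars, the MSE would be in squared dollars, while the RMSE reads directly as a typical error of roughly 0.61 dollars.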
R-squared (Coefficient of Determination):
- R-squared measures the proportion of the variance in the dependent variable that is predictable from the independent variables.
- This tells you how well your model's predictions match the actual data compared to always predicting the mean of the target variable. If R-squared is 1, your model predicts the target variable perfectly. If it's 0, your model is no better than predicting the mean. R-squared can also be negative, which means the model does worse than that baseline. So, the closer R-squared is to 1, the better your model fits the data.
- Formula: \(R^2 = 1 - \frac{\sum_{i=1}^{n} (y_{\text{true},i} - y_{\text{pred},i})^2}{\sum_{i=1}^{n} (y_{\text{true},i} - \bar{y}_{\text{true}})^2}\), where \(\bar{y}_{\text{true}}\) is the mean of the observed data.
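The R-squared formula can be sketched directly from its two sums. The example below is an illustration under stated assumptions: the helper name `r_squared` and the sample values are made up, and raising an error when the target is constant is a choice made for this sketch. It also shows two reference points: predicting the mean gives 0, and a model worse than the mean gives a negative score.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination for two equal-length sequences (hypothetical helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: the model's squared errors.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared errors of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        # Assumption for this sketch: treat a constant target as undefined.
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Made-up example values, not taken from a real dataset.
y_true = [3.0, -0.5, 2.0, 7.0]
good_pred = [2.5, 0.0, 2.0, 8.0]
mean_pred = [sum(y_true) / len(y_true)] * len(y_true)  # always predict the mean
bad_pred = [7.0, 2.0, -0.5, 3.0]                       # the targets in reverse order

r2_good = r_squared(y_true, good_pred)
r2_mean = r_squared(y_true, mean_pred)
r2_bad = r_squared(y_true, bad_pred)
print(round(r2_good, 3))  # 0.949
print(r2_mean)            # 0.0
print(r2_bad < 0)         # True
```

Scikit-learn provides `r2_score` in `sklearn.metrics` for the same metric, so a hand-written version like this is mainly useful for understanding what the score measures.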
Understanding these metrics can help you assess the performance of your regression model and make necessary adjustments to improve its accuracy.