Model Performance on Various Datasets

The results provided show the performance of several models on various datasets, with metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R^2).
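For reference, the five metrics can be computed as in the minimal sketch below. The function name `regression_metrics` and the sample values are illustrative assumptions, not taken from the original analysis, and the sketch assumes no actual value is zero (MAPE is undefined in that case).

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, MAPE (in percent), and R^2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # Assumption: no actual value is zero, otherwise MAPE divides by zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}

# Made-up values, for illustration only.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0])
print(metrics)
```

MAE, MSE, and RMSE are expressed in the units of the target, so they change when the target is rescaled (for example, by a log transformation). MAPE and R^2 are unit-free.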

Here's a breakdown of what these results suggest for each model:

Naive Forecast (Baseline)

The Naive Forecast model, which predicts future values based on the most recent data point, has been evaluated across various datasets.
Here's a breakdown of its performance:

Combined_Raw.csv:
- MAE: 3.09, MSE: 22.14, RMSE: 4.71, MAPE: 6.99%, R^2: 0.975
- This dataset shows a relatively high accuracy with an R^2 of 0.975, indicating the model can explain 97.5% of the variance in the data.
Combined_CPI_Adjusted.csv:
- MAE: 1.46, MSE: 4.42, RMSE: 2.10, MAPE: 6.89%, R^2: 0.963
- The model performs well, with a high R^2 value, suggesting good predictive capability.
Combined_Log_Transformed.csv:
- MAE: 0.068, MSE: 0.009, RMSE: 0.097, MAPE: 2.36%, R^2: 0.956
- Excellent performance with low error metrics and a high R^2, indicating strong predictive accuracy.
Combined_Log_Clean.csv:
- MAE: 0.068, MSE: 0.010, RMSE: 0.101, MAPE: 2.32%, R^2: 0.951
- Similar to the log-transformed dataset, showing high accuracy and low error rates.
Combined_Log_Clean_NoNeg.csv:
- MAE: 0.087, MSE: 0.019, RMSE: 0.137, MAPE: 2.99%, R^2: 0.889
- Slightly lower performance compared to other log-transformed datasets but still shows good predictive capability.
Combined_Log_Transformed_Excl_Roil.csv:
- MAE: 1.46, MSE: 4.42, RMSE: 2.10, MAPE: 6.89%, R^2: 0.963
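- Performance is identical to the CPI-adjusted dataset. Because the Naive Forecast depends only on the target series, this is what one would expect if the target (the real oil price) is the same untransformed, CPI-adjusted series in both files.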
Combined_Log_Excl_Roil_Clean.csv:
- MAE: 1.55, MSE: 5.93, RMSE: 2.44, MAPE: 7.04%, R^2: 0.949
- Shows good predictive accuracy, though slightly lower than the CPI-adjusted and log-transformed datasets.
Combined_Log_Excl_Roil_Clean_NoNeg.csv:
- MAE: 1.79, MSE: 8.64, RMSE: 2.94, MAPE: 8.98%, R^2: 0.883
- This dataset shows the lowest performance among the evaluated datasets, but still maintains a decent level of predictive accuracy.

'Best' Dataset:

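For the Naive Forecast model, Combined_Log_Transformed.csv yielded the 'best' results, with the lowest MAE (0.068, tied with Combined_Log_Clean.csv), MSE (0.009), and RMSE (0.097). Combined_Raw.csv has the highest R^2 (0.975) and Combined_Log_Clean.csv the lowest MAPE (2.32%). MAE, MSE, and RMSE are scale-dependent, so they are not directly comparable between log-transformed and untransformed datasets.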

The Naive Forecast model demonstrates strong performance across various datasets, particularly with log-transformed data, where it achieves high R^2 values and low error metrics.

This suggests that the most recent data point is often a good predictor for the next, especially in datasets where logarithmic transformations have been applied, indicating a level of consistency or trend in the data.
However, it's important to note that while the Naive Forecast can be surprisingly effective, it's a simple approach that doesn't account for complex patterns or potential changes in trends.
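The baseline described above can be sketched in a few lines. This is a minimal illustration, not the code used to produce the reported results; the function name `naive_forecast` and the sample prices are made up.

```python
def naive_forecast(series):
    """Predict each value as the previous observation.

    Returns (actuals, predictions), both aligned to start at the second
    observation, since the first one has no predecessor.
    """
    actuals = list(series[1:])
    predictions = list(series[:-1])
    return actuals, predictions

# Made-up prices, for illustration only.
prices = [50.0, 52.0, 51.0, 55.0]
actuals, predictions = naive_forecast(prices)

# Mean absolute error of the baseline on this toy series.
mae = sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)
print(actuals, predictions, mae)
```

Because the forecast uses nothing but the target's own previous value, its metrics are unaffected by which other columns a dataset contains.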

Random Forest

Datasets with NaNs: We chose not to run Random Forest on Combined_Raw.csv, Combined_CPI_Adjusted.csv, and Combined_Log_Transformed.csv because they contain NaN values.
Many Random Forest implementations do not accept NaNs; the values must be imputed, or the rows/columns containing them dropped, before modeling.
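The two options mentioned above, dropping rows and imputing, might look like the following sketch. It uses plain Python lists rather than any particular data library; the helper names and the sample rows are hypothetical, and column-mean imputation is only one of several possible strategies.

```python
import math

def drop_nan_rows(rows):
    """Keep only the rows that contain no NaN values."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]

def impute_column_means(rows):
    """Replace each NaN with the mean of the observed values in its column.

    Assumption: every column has at least one observed (non-NaN) value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        means.append(sum(observed) / len(observed))
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(row)]
            for row in rows]

# Made-up rows, for illustration only.
nan = float("nan")
rows = [[1.0, 2.0], [nan, 4.0], [3.0, nan]]
dropped = drop_nan_rows(rows)
imputed = impute_column_means(rows)
print(dropped, imputed)
```

For time-series data, imputation statistics are usually computed on the training portion only, so that information from the test period does not leak into the model.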
Combined_Log_Clean.csv:
- MAE: 0.085, MSE: 0.016, RMSE: 0.126, MAPE: 3.01%, R^2: 0.92
- The Random Forest model shows strong performance on the cleaned log-transformed dataset. The high R^2 score indicates that the model explains a significant portion of the variance in the data.
Combined_Log_Clean_NoNeg.csv:
- MAE: 0.043, MSE: 0.003, RMSE: 0.058, MAPE: 1.55%, R^2: 0.98
- This dataset yields the best performance for Random Forest, with very low error metrics and a high R^2 score. Removing negative values seems to have a positive impact on the model's accuracy.
Combined_Log_Excl_Roil_Clean.csv:
- MAE: 1.70, MSE: 5.88, RMSE: 2.43, MAPE: 9.00%, R^2: 0.94
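- Performance remains strong (R^2 of 0.94), but MAPE is much higher than on Combined_Log_Clean.csv (9.00% vs. 3.01%), where the real oil price is log-transformed. MAE, MSE, and RMSE are on a different scale here and are not directly comparable. The MAPE gap suggests that log-transforming the real oil price helps the model's predictions.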
Combined_Log_Excl_Roil_Clean_NoNeg.csv:
- MAE: 0.80, MSE: 0.95, RMSE: 0.98, MAPE: 4.25%, R^2: 0.99
- This dataset shows excellent performance with the highest R^2 score among the datasets where real oil price is excluded from log transformation. The low error metrics indicate a highly accurate model.

'Best' Dataset:

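The Random Forest model has its lowest errors on Combined_Log_Clean_NoNeg.csv (MAPE 1.55%, R^2 0.98), while Combined_Log_Excl_Roil_Clean_NoNeg.csv has the highest R^2 (0.99). More generally, it performs well on datasets that have undergone log transformation and cleaning, especially when negative values are removed.
Judged by MAPE, log-transforming the real oil price appears to help substantially: 1.55% vs. 4.25% on the datasets with negative values removed, and 3.01% vs. 9.00% on the datasets that keep them.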

MAPE ranges from 1.55% to 9.00% across the datasets where the model is applied, indicating that the predictions are, on average, within a few percent of the actual values.
The high R^2 scores, particularly for the Combined_Log_Clean_NoNeg.csv and Combined_Log_Excl_Roil_Clean_NoNeg.csv datasets, demonstrate the model's effectiveness in capturing the variance in the data.

These results underscore the importance of data preprocessing and feature selection in building effective predictive models. The Random Forest algorithm, in particular, benefits from cleaner data (free from NaNs and negative values) and appropriate transformations, leading to more accurate predictions.
It's crucial to ensure that these models are not overfitting the training data, which can be verified through cross-validation or by evaluating the model's performance on a separate test set.

XGBoost

The results below show the performance of the XGBoost regression model across the datasets, evaluated with the same metrics (MAE, MSE, RMSE, MAPE, and R^2). Unlike Random Forest, XGBoost was run on all eight datasets, including those containing NaN values, since it handles missing values natively.
Here's an analysis of the results:

Combined_Raw.csv:
- MAE: 2.72, MSE: 14.92, RMSE: 3.86, MAPE: 6.41%, R^2: 0.98
- This dataset shows excellent performance with a very high R^2 score, indicating that the model explains a significant portion of the variance in the data. The errors are relatively low, suggesting good predictive accuracy.
Combined_CPI_Adjusted.csv:
- MAE: 1.57, MSE: 4.73, RMSE: 2.17, MAPE: 7.58%, R^2: 0.95
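- The model performs well on this dataset too, with a high R^2 score. The absolute errors (MAE, MSE, RMSE) are lower than on the raw dataset, as expected after rescaling, but MAPE is higher (7.58% vs. 6.41%) and R^2 is slightly lower (0.95 vs. 0.98), which might be due to the CPI adjustment affecting the data's characteristics.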
Combined_Log_Transformed.csv:
- MAE: 0.07, MSE: 0.01, RMSE: 0.10, MAPE: 2.53%, R^2: 0.94
- On the log scale the error metrics are very low (MAPE of 2.53%), although MAE, MSE, and RMSE are not directly comparable with those of the untransformed datasets, and R^2 (0.94) is slightly lower than on the raw and CPI-adjusted data. The log transformation may help by compressing large values and stabilizing the variance.
Combined_Log_Clean.csv:
- MAE: 0.08, MSE: 0.01, RMSE: 0.12, MAPE: 2.88%, R^2: 0.93
- Cleaning the data post-log transformation maintains strong model performance, although there's a slight increase in errors and a minor decrease in R^2 compared to the non-cleaned log-transformed data.
Combined_Log_Clean_NoNeg.csv:
- MAE: 0.05, MSE: 0.004, RMSE: 0.07, MAPE: 1.81%, R^2: 0.97
- Removing negative values further enhances model performance, resulting in the lowest errors among all datasets and an R^2 of 0.97, second only to the raw dataset (0.98). This suggests that negative values might have been outliers or noise in the data.
Combined_Log_Transformed_Excl_Roil.csv:
- MAE: 1.48, MSE: 4.12, RMSE: 2.03, MAPE: 7.10%, R^2: 0.96
- With the real oil price excluded from the log transformation, the errors return to the scale of the CPI-adjusted dataset and MAPE rises to 7.10% (from 2.53% on the full log-transformed dataset), although R^2 is slightly higher (0.96 vs. 0.94). By MAPE, log-transforming the real oil price appears to help the model.
Combined_Log_Excl_Roil_Clean.csv:
- MAE: 1.60, MSE: 4.12, RMSE: 2.03, MAPE: 8.01%, R^2: 0.96
- Cleaning the dataset while excluding the real oil price from the log transformation gives similar performance to the non-cleaned version: MSE, RMSE, and R^2 are unchanged, with slight increases in MAE and MAPE.
Combined_Log_Excl_Roil_Clean_NoNeg.csv:
- MAE: 1.14, MSE: 2.04, RMSE: 1.43, MAPE: 6.18%, R^2: 0.97
- This dataset, which excludes the real oil price from the log transformation and removes negative values, performs well. The R^2 score of 0.97 indicates that the model explains most of the variance, and the errors are the lowest among the three datasets that exclude the real oil price from the log transformation. The MAPE of 6.18% means the predictions are, on average, within about 6% of the actual values.

'Best' Dataset:

The Combined_Log_Clean_NoNeg.csv dataset generally shows the best performance across most metrics, highlighting the effectiveness of log transformation and cleaning negative values in the data.

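Including the real oil price in the log transformation matters mainly for percentage error: MAPE is 1.81% to 2.88% on the datasets where it is log-transformed, versus 6.18% to 8.01% where it is excluded from the transformation, while R^2 is similar in both groups (0.93 to 0.97 vs. 0.96 to 0.97).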
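One caveat when comparing log-transformed and untransformed datasets: if the metrics for the log datasets were computed on the log values themselves (the reported MAE of roughly 0.05 to 0.08 suggests so, though this is an assumption), then part of the MAPE difference reflects the scale rather than the model. The sketch below uses made-up prices to show the same predictions scored on both scales; the `mape` helper is hypothetical.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero actuals)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up prices and predictions, for illustration only.
actual_prices = [40.0, 60.0, 80.0]
predicted_prices = [44.0, 57.0, 88.0]

# The same values expressed on the log scale.
log_actual = [math.log(v) for v in actual_prices]
log_pred = [math.log(v) for v in predicted_prices]

# Scored directly on the log scale.
mape_log_scale = mape(log_actual, log_pred)

# Back-transformed with exp() and scored on the original price scale.
mape_price_scale = mape([math.exp(v) for v in log_actual],
                        [math.exp(v) for v in log_pred])

print(round(mape_log_scale, 2), round(mape_price_scale, 2))
```

Here the identical predictions score about 2% on the log scale and about 8% on the price scale, so back-transforming before scoring gives a fairer comparison across datasets.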

Polynomial Regression

Datasets with NaNs: We chose not to run Polynomial Regression on Combined_Raw.csv, Combined_CPI_Adjusted.csv, and Combined_Log_Transformed.csv because they contain NaN values.
Polynomial Regression cannot handle NaNs; they need to be imputed or the rows/columns with NaNs must be dropped before modeling.
Combined_Log_Clean.csv:
- MAE: 0.10, MSE: 0.017, RMSE: 0.129, MAPE: 3.44%, R^2: 0.92
- This dataset, with log transformation and cleaning, shows strong performance in the polynomial regression model. The R^2 score of 0.92 is quite high, indicating that the model explains a significant portion of the variance in the data. The error metrics are low, and a MAPE of 3.44% suggests good accuracy.
Combined_Log_Clean_NoNeg.csv:
- MAE: 0.094, MSE: 0.014, RMSE: 0.119, MAPE: 3.30%, R^2: 0.92
- Similar to the previous dataset, this one also shows strong performance. The removal of negative values seems to have a slightly positive impact on the model's accuracy, as indicated by the marginally lower error metrics and MAPE.
Combined_Log_Excl_Roil_Clean.csv:
- MAE: 2.43, MSE: 11.78, RMSE: 3.43, MAPE: 14.26%, R^2: 0.89
- In this dataset, where the real oil price is included but not log-transformed, the model's performance drops compared to the previous two datasets. The errors are significantly higher, and the R^2 score, while still good, is lower. This suggests that the log transformation of the real oil price is beneficial for the model's performance.
Combined_Log_Excl_Roil_Clean_NoNeg.csv:
- MAE: 1.73, MSE: 5.03, RMSE: 2.24, MAPE: 10.01%, R^2: 0.93
- This dataset, which excludes the real oil price from log transformation and cleans negative values, shows improved performance compared to the previous dataset but is still not as strong as when the real oil price is log-transformed. The R^2 score is high, but the error metrics, especially MAPE, are higher than in the first two datasets.

'Best' Dataset:

The Combined_Log_Clean.csv and Combined_Log_Clean_NoNeg.csv datasets yield the best results with the Polynomial Regression model, indicating that log transformation and cleaning of the data are crucial for model performance.

Excluding the real oil price from log transformation (while still including it in the dataset) leads to a decrease in model performance, suggesting that the way this feature is processed has a significant impact. The relatively high R^2 scores across all datasets indicate that Polynomial Regression is a suitable model for this type of data, but the choice of data preprocessing steps (like log transformation and cleaning of negatives) plays a crucial role in achieving optimal performance.

As with any model, it's important to validate these results on unseen data to ensure that the model generalizes well and is not overfitting.

ARIMA

Combined_Raw.csv:
- MAE: 16.80, MSE: 522.13, RMSE: 22.85, MAPE: 25.48%, R^2: -0.49
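- The ARIMA model performs poorly on the raw dataset, as indicated by high error metrics and a negative R^2 score. A negative R^2 means the forecasts fit worse than simply predicting the mean of the evaluation data. This suggests that the model is not suitable for the raw data without preprocessing.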
Combined_CPI_Adjusted.csv:
- MAE: 4.47, MSE: 36.84, RMSE: 6.07, MAPE: 20.41%, R^2: -0.10
- Although the error metrics are lower compared to the raw dataset, the negative R^2 score still indicates poor model performance. This suggests that CPI adjustment alone is not sufficient for the ARIMA model to perform well.
Combined_Log_Transformed.csv:
- MAE: 0.21, MSE: 0.081, RMSE: 0.285, MAPE: 7.06%, R^2: -0.05
- With the log transformation, MAPE drops to 7.06% (from 20.41% on the CPI-adjusted data); the lower MAE, MSE, and RMSE largely reflect the change of scale. However, the slightly negative R^2 score shows that the model still does not fit the data well.
Combined_Log_Clean.csv:
- MAE: 0.28, MSE: 0.106, RMSE: 0.326, MAPE: 8.76%, R^2: -0.58
- Cleaning the data does not help the ARIMA model; performance worsens, with R^2 falling to -0.58 (from -0.05) and MAPE rising to 8.76% (from 7.06%). The negative R^2 score indicates a poor fit to the data.
Combined_Log_Clean_NoNeg.csv:
- MAE: 0.23, MSE: 0.112, RMSE: 0.334, MAPE: 7.62%, R^2: -0.22
- Removing negative values does not lead to a substantial improvement in model performance. The negative R^2 score persists, indicating a poor fit.
Combined_Log_Transformed_Excl_Roil.csv:
- MAE: 4.47, MSE: 36.84, RMSE: 6.07, MAPE: 20.41%, R^2: -0.10
- The metrics are identical to those for Combined_CPI_Adjusted.csv: performance is poor, with a negative R^2 score, suggesting that excluding the real oil price from the log transformation does not benefit the ARIMA model.
Combined_Log_Excl_Roil_Clean.csv:
- MAE: 6.27, MSE: 58.57, RMSE: 7.65, MAPE: 24.66%, R^2: -0.68
- This dataset shows one of the worst performances, with high error metrics and a significantly negative R^2 score, indicating a very poor fit.
Combined_Log_Excl_Roil_Clean_NoNeg.csv:
- MAE: 6.19, MSE: 69.51, RMSE: 8.34, MAPE: 29.55%, R^2: -0.17
- Despite cleaning and removing negative values, the performance remains poor, as indicated by high error metrics and a negative R^2 score.

'Best' Dataset:

The ARIMA model performs poorly across all datasets, with a negative R^2 score in every case, indicating that it is not a suitable model for this type of data as configured here.

The log-transformed datasets show relatively better performance compared to others, but the improvement is not enough to make the ARIMA model competitive with other models like XGBoost or Polynomial Regression.

The consistently negative R^2 scores across different datasets suggest that ARIMA struggles to capture the underlying patterns in this data, possibly due to its non-stationary nature or complex relationships that ARIMA cannot model effectively.
These results highlight the importance of choosing the right model for the data at hand and the limitations of ARIMA in handling complex datasets like these.
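To make the meaning of a negative R^2 concrete, the sketch below scores two forecasts against a rising series: one stuck at an earlier level and one equal to the mean of the actual values. The numbers are invented for illustration, and the source does not say how the ARIMA forecasts were generated, so this is not a reconstruction of those results.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up evaluation data: the series keeps rising.
test_actuals = [60.0, 65.0, 70.0, 75.0]

# A forecast that stays flat at an earlier level (hypothetical).
flat_forecast = [50.0] * len(test_actuals)

# A forecast equal to the mean of the actual values.
mean_value = sum(test_actuals) / len(test_actuals)
mean_forecast = [mean_value] * len(test_actuals)

r2_flat = r_squared(test_actuals, flat_forecast)
r2_mean = r_squared(test_actuals, mean_forecast)
print(r2_flat, r2_mean)
```

The mean forecast scores exactly 0, and any forecast whose squared error exceeds that of the mean forecast scores below 0, which is the situation every ARIMA result above is in.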

Why ARIMA May Not Be Suitable for This Data

ARIMA (AutoRegressive Integrated Moving Average) models might not perform well on certain datasets due to a few key reasons:

Non-Stationary Data: The autoregressive and moving-average parts of ARIMA assume a stationary series, one whose statistical properties such as mean and variance are constant over time. The integrated (d) term handles trends by differencing, but if the differencing order is poorly chosen, or the data has seasonality or changing variance that differencing does not remove, the fit can still be poor.
Complex Relationships: ARIMA models may struggle with data that has complex, non-linear relationships. They are fundamentally linear models and might not capture complex patterns present in some datasets, especially those influenced by numerous external factors.
Lack of External Factors: ARIMA models primarily focus on the time series data itself and do not incorporate external variables or factors. If the dataset is influenced by external factors (like economic indicators, global events, etc.), ARIMA might not be able to account for these influences.
Overfitting on Noisy Data: ARIMA can overfit on noisy data, leading to a model that performs well on training data but poorly on unseen data. This is particularly problematic if the noise in the data is random and does not contain information about future values.
Parameter Selection: Choosing the right parameters (p, d, q) for ARIMA models can be challenging. Incorrect parameter choices can lead to poor model performance.

In summary, ARIMA's limitations in handling non-stationary data, complex relationships, external factors, and its sensitivity to parameter selection and noisy data can contribute to its suboptimal performance on certain datasets.
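One common way to move a trending series toward stationarity is first-order differencing, which is what the d parameter in ARIMA controls. The sketch below shows the transformation and its inverse on a made-up series; the function names are illustrative.

```python
def difference(series):
    """First-order differencing: the change from each observation to the next."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference(first_value, diffs):
    """Invert first-order differencing, given the original first observation."""
    values = [first_value]
    for d in diffs:
        values.append(values[-1] + d)
    return values

# Made-up series with a steady upward trend.
trending = [10.0, 12.0, 14.0, 16.0, 18.0]
diffs = difference(trending)                 # constant, so the trend is gone
restored = undifference(trending[0], diffs)  # recovers the original series
print(diffs, restored)
```

Forecasts made on the differenced series have to be accumulated back in the same way before they are compared with the original values.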

Prophet

Combined_Raw.csv: This dataset contains the raw data without any preprocessing. The model's performance is moderate with an R^2 of 0.72. However, the error metrics like MAE and RMSE are relatively high, suggesting that the raw data might have outliers or noise affecting the model's predictions.
Combined_CPI_Adjusted.csv: Adjusting for CPI helps in normalizing the data related to inflation or other economic changes over time. The model shows a decent fit with an R^2 of 0.66, but the error metrics are still on the higher side.
Combined_Log_Transformed.csv: Log transformation helps in stabilizing the variance and making the data more 'model-friendly'. This dataset shows improvement, most visibly in MAPE, which falls to 6.19%; R^2 rises only slightly, to 0.73.
Combined_Log_Clean.csv: Further cleaning the log-transformed data results in slightly better performance, indicating that removing anomalies or irrelevant data points can enhance model accuracy.
Combined_Log_Clean_NoNeg.csv: This dataset yields the best performance across all metrics, with the lowest errors and the highest R^2 of 0.77. Removing negative values seems to have a substantial positive impact, suggesting that such values might have been outliers or noise.
Combined_Log_Transformed_Excl_Roil.csv: Excluding the real oil price from the log transformation leads to lower model performance than when it is log-transformed, indicating that how the real oil price is processed matters for the model.
Combined_Log_Excl_Roil_Clean.csv and Combined_Log_Excl_Roil_Clean_NoNeg.csv: These datasets, which exclude the real oil price from the log transformation and involve cleaning, show lower performance than the datasets where the real oil price is log-transformed. This further emphasizes the importance of log-transforming the real oil price for this model.

'Best' Dataset:

The Combined_Log_Clean_NoNeg.csv dataset stands out as the 'best' performer for the Prophet model. The combination of log transformation, data cleaning, and removal of negative values noticeably improves the model's predictive accuracy. The MAPE of 5.71% and R^2 of 0.77 are Prophet's best results, although that R^2 is still below the 0.889 reported above for the Naive Forecast baseline on the same dataset.

While the Combined_Log_Clean_NoNeg.csv dataset shows the best performance, it's essential to ensure the model's ability to generalize. This can be done by evaluating the model on a separate test set or using time-series cross-validation techniques.
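A time-series cross-validation scheme of the kind mentioned above can be sketched as an expanding window: each fold trains on everything before a cutoff and tests on the block that follows. This is a minimal illustration with hypothetical names and parameters; scikit-learn's `TimeSeriesSplit` provides a similar scheme.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each training set contains every observation before its test block,
    so the model is never evaluated on data older than what it was fit on.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx

# Ten observations, three folds, two test points per fold (illustrative sizes).
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Averaging a metric over the folds gives a more stable estimate of out-of-sample accuracy than a single train/test split.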